# Synchronization of Clocked Field-Coupled Circuits

F. Sill Torres<sup>1,2</sup>, M. Walter<sup>1</sup>, R. Wille<sup>3</sup>, D. Große<sup>1,2</sup> and R. Drechsler<sup>1,2</sup>

<sup>1</sup>Group of Computer Architecture, University of Bremen, Germany, email: frasillt@uni-bremen.de

<sup>2</sup>Cyber-Physical Systems, DFKI GmbH, Bremen, Germany, <sup>3</sup>Inst. of Integrated Circuits, Johannes Kepler University Linz, Austria

*Abstract*—Proper synchronization in clocked *Field-Coupled Nanocomputing* (FCN) circuits is a fundamental problem. In this work, we show for the first time that global synchronicity is not a mandatory requirement in clocked FCN designs and discuss the considerable restrictions that global synchronicity presents for sequential and large-scale designs. Furthermore, we propose a solution that circumvents design restrictions due to synchronization requirements.

# I. INTRODUCTION

*Field-Coupled Nanocomputing* (FCN) offers a promising alternative to conventional circuit technologies. In FCN, computation and data transfer are realized via local fields between nanoscale devices that are arranged in patterned arrays [1]. Theoretical and experimental results indicate that FCN-based approaches have the potential to enable systems with very high processing performance and remarkably low energy dissipation [2]. Consequently, numerous contributions on their physical realization have been made in the past, e.g. molecular quantum cellular automata (mQCA) [3], atomic quantum cellular automata (aQCA) [4], or nanomagnetic logic (NML) [5].

Clocked FCN circuits apply external clocks in order to circumvent the issue of metastability and to control the data flow. In the case of mQCA and aQCA, electric clocks control the tunneling within a cell, while in NML a magnetic clock controls the switching ability of the nanomagnets. Depending on the technology, each device or cell passes through four (mQCA, aQCA) or three (NML) different phases during a complete clock cycle, i.e. a switch, a hold, a reset and a neutral phase (the latter only in the case of mQCA and aQCA). For the sake of simplicity and without loss of generality, we consider a four-phase technology in the following.

In the case of four phases, four different clocks, numbered from 1 to 4, are usually applied, whereby each clock controls a selected set of cells. For fabrication purposes, cells are usually grouped in a grid of square-shaped tiles such that all cells within a tile are controlled by the same external clock [6]. Consecutive clocks have a phase difference of 90 degrees. It is important to note that correct data flow is only possible between cells controlled by consecutively numbered clocks. That is, cells controlled by clock 1 can pass their data solely to cells controlled by clock 2, and so on, until cells controlled by clock 4 pass their data to cells controlled by clock 1. Hence, there is a local synchronization of signals located in neighboring tiles, and the data flow between tiles is conducted in a pipeline-like fashion controlled by the external clocks.
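The local rule described above can be written as a small check. The following is a minimal sketch, assuming that the clock zone of each tile is given as an integer from 1 to 4; the function names `can_pass` and `path_is_locally_synchronized` are hypothetical and chosen only for illustration.

```python
NUM_CLOCKS = 4  # four-phase technology, as considered in the text


def can_pass(src_clock: int, dst_clock: int) -> bool:
    """Return True if a tile on src_clock may pass data to a tile on dst_clock."""
    # Clocks are numbered 1..4; the successor of clock 4 wraps around to clock 1.
    return dst_clock == src_clock % NUM_CLOCKS + 1


def path_is_locally_synchronized(clocks: list[int]) -> bool:
    """Check that every hop along a path of tiles obeys the consecutive-clock rule."""
    return all(can_pass(a, b) for a, b in zip(clocks, clocks[1:]))


print(path_is_locally_synchronized([1, 2, 3, 4, 1, 2]))  # True
print(path_is_locally_synchronized([1, 2, 4]))           # False
```

Note that this check is purely local: it says nothing about whether two different paths into the same gate have equal length.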

The pipeline-like data flow between neighboring tiles leads to the common assumption that clocked FCN circuits must exhibit not only local but also global pipeline-like behavior. That is, it is assumed that all signal paths arriving at the same logic gate must have equal length and that all signals must always arrive at the respective logic gates in a synchronized manner.

For small combinational circuits, this so-called *global synchronicity* (GS) can easily be guaranteed. However, for large-scale as well as sequential designs, GS poses a considerable design restriction (as discussed in Section II). Since scalability and sequential behavior are prerequisites for practically relevant applications of FCN, this poses a serious threat to the further development of this technology which has not been considered yet.

To the best of our knowledge, this work is the first to address this problem. We show that GS is not a mandatory requirement in clocked FCN circuits and, furthermore, propose a simple but effective solution that enables the synchronization of circuits violating the GS constraint. In order to apply this solution in more complex circuits, we introduce a latch-like structure that uses external clocks for signal synchronization (see Section III). Finally, simulation results presented in Section IV indicate the feasibility of the proposed approach.

# II. GLOBAL SYNCHRONICITY OF FCN CIRCUITS

## A. GS in Combinational Circuits

A fundamental characteristic of globally synchronized designs is that new data can be applied to the primary inputs of the circuit in each clock cycle. Once the first input data have passed through the circuit, new results arrive at the circuit's primary outputs in each clock cycle, resulting in a circuit throughput of 1. Furthermore, a globally synchronized circuit does not require synchronization elements like latches as, by definition, all related data are always synchronized.

Fig. 1. FCN circuit failing global synchronicity. The red line indicates the limit up to which *PI1* could be placed such that paths *PI1* → *o3* and *PI2* → *o3* are synchronous.

Fig. 2. Sequential circuit failing GS. The red lines indicate the path that should have a maximum length of 4 tiles.

However, in contrast to many related statements in the literature [7, 8], GS is not a mandatory constraint in clocked FCN circuits. For example, the circuit depicted in Fig. 1 fulfills the local synchronization requirement, i.e. data is only passed between tiles controlled by consecutively numbered clocks. However, the paths from primary inputs *PI1* and *PI2* to operation *o3* differ in length by more than 3 tiles. Thus, data sent at the same time from *PI1* and *PI2* arrive at *o3* in different clock cycles and, consequently, GS is not given. A common solution would be the relocation of *PI1* or *PI2* such that both paths have equal lengths. However, this usually comes at very high costs in terms of area.

Instead, we propose to reduce the frequency with which new input data are applied. For the given example, this means that data applied at *PI1* and *PI2* must be kept stable for two clock cycles, which reduces the throughput to 1/2. On the other hand, this approach allows for the reduction of area costs and design complexity.
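The relation between path-length mismatch and reduced throughput can be illustrated with a simplified model. The sketch below is not the method used in this work; it assumes that each tile delays a signal by one clock phase (so four tiles correspond to one clock cycle) and that inputs must be held for one cycle plus one additional cycle per cycle of arrival skew. The function names and the path lengths are hypothetical.

```python
from fractions import Fraction
from math import ceil

TILES_PER_CYCLE = 4  # assumption: one tile per clock phase, four phases per cycle


def hold_cycles(path_lengths: list[int]) -> int:
    """Clock cycles for which the inputs are kept stable (assumed model)."""
    skew_tiles = max(path_lengths) - min(path_lengths)
    skew_cycles = ceil(skew_tiles / TILES_PER_CYCLE)
    # One cycle for the data itself plus one per cycle of arrival skew.
    return 1 + skew_cycles


def throughput(path_lengths: list[int]) -> Fraction:
    """Results per clock cycle when inputs are held for hold_cycles()."""
    return Fraction(1, hold_cycles(path_lengths))


# Hypothetical lengths: the two paths into a gate differ by 4 tiles.
print(hold_cycles([3, 7]), throughput([3, 7]))  # 2 1/2
# Equal lengths: the paths are globally synchronous.
print(hold_cycles([5, 5]), throughput([5, 5]))  # 1 1
```

Under these assumptions, a mismatch of one clock cycle reproduces the hold time of two cycles and the throughput of 1/2 stated for the example.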

## B. GS in Sequential Circuits

The problem of GS presents itself in a more restrictive manner in sequential circuits, as shown, e.g., in Fig. 2. Here, the output data of flip-flop ffI must arrive at the input of ffI within one clock cycle in order to ensure correct operation. Consequently, the physical path length between the output and the input of ffI must be less than or equal to 4 tiles. However, due to the given configuration of the circuit, this path length is not achievable, which prevents the global synchronicity of this circuit.

In order to circumvent this problem, we again propose to hold the data at *PI1* for the number of clock cycles required to ensure correct operation of the circuit, i.e. 2 clock cycles in the given example (more details will be given in the full version).
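As an illustration, the number of hold cycles for a feedback loop can be estimated from the loop length. This is a minimal sketch under the assumption that four tiles correspond to one clock cycle and that the input must be held until the fed-back value has returned to the flip-flop; the function name and the loop lengths are hypothetical and are not taken from Fig. 2.

```python
from math import ceil

TILES_PER_CYCLE = 4  # assumption: one tile per clock phase, four phases per cycle


def feedback_hold_cycles(loop_length_tiles: int) -> int:
    """Clock cycles the input is held so the feedback signal can return."""
    if loop_length_tiles < 1:
        raise ValueError("loop length must be at least one tile")
    return ceil(loop_length_tiles / TILES_PER_CYCLE)


print(feedback_hold_cycles(4))  # 1
print(feedback_hold_cycles(7))  # 2
```

In this model, a loop of at most 4 tiles needs no additional holding, whereas any loop of 5 to 8 tiles leads to a hold time of 2 clock cycles.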

# III. ARTIFICIAL LATCH

As stated above, circuits that completely fulfill the GS constraint do not require any latches or flip-flops. In contrast, circuits that fail to comply with the GS constraint, and are consequently required to hold data, need latches and/or flip-flops.

Keeping in mind the routing overhead of an additional control signal for latches and/or flip-flops, we propose the use of an additional external memory clock, similar to an idea presented in [9]. This clock, which we call *clock M*, is configured such that it can receive data from cells controlled by *clock 4* and pass data to cells controlled by *clock 2*. Moreover, the clock can be configured such that it holds data over several clock cycles. That is, the clock phase in which data are held can be extended. Consequently, this clock enables the implementation of a wire that has a latch-like behavior.

Fig. 3. Memory clock M reproducing latch-like behavior

| Circuit          | Area Gain | Throughput |
|------------------|-----------|------------|
| 4:1 MUX          | 8 %       | 1/2        |
| Parity Generator | 43 %      | 1/2        |
| ISCAS85 c17      | 30 %      | 1/3        |

Table 1. Comparison of QCA circuits implemented with and without GS

If this clock is applied to all required latches of the design, then its hold time must be equal to the longest time for which data have to be held anywhere in the design in order to guarantee synchronicity. Alternatively, several memory clocks with different hold times can be used.
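The two options can be sketched as follows. This is an illustrative example only, assuming that the required hold time of each latch is already known in clock cycles; the latch names, the values, and the function names are hypothetical.

```python
from collections import defaultdict


def shared_hold_time(required_hold: dict[str, int]) -> int:
    """Hold time (in clock cycles) for one memory clock shared by all latches."""
    return max(required_hold.values())


def group_by_hold_time(required_hold: dict[str, int]) -> dict[int, list[str]]:
    """Alternative: one memory clock per distinct hold time."""
    groups: dict[int, list[str]] = defaultdict(list)
    for latch, cycles in required_hold.items():
        groups[cycles].append(latch)
    return dict(groups)


# Hypothetical per-latch requirements in clock cycles.
required = {"latch_a": 2, "latch_b": 3, "latch_c": 2}
print(shared_hold_time(required))    # 3
print(group_by_hold_time(required))  # {2: ['latch_a', 'latch_c'], 3: ['latch_b']}
```

With a single shared clock, every latch holds its data for the longest required time; with several memory clocks, each latch holds its data only as long as needed, at the cost of distributing additional clocks.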

# IV. TRADEOFF ANALYSIS

In order to analyze the possible tradeoff between area and throughput when ignoring the GS constraint in FCN circuits, we implemented an automatic layout tool for clocked QCA-like circuits. This tool generates an exact solution, i.e. the smallest layout for a given circuit (more details will be given in the full version). We implemented and verified several circuits in two versions: one fulfilling the GS constraint and one violating it. Next, we compared both versions in terms of area and of the throughput reduction caused by the requirement to hold data. Table 1 lists the related results, which indicate that ignoring GS can lead to a considerable reduction of area costs because synchronization wires are no longer needed. On the downside, this comes at the cost of throughput.

In summary, the results show that GS is not a mandatory constraint. Furthermore, one can conclude that, especially for sequential and large-scale circuits, ignoring GS might be essential for making these designs feasible.

# REFERENCES

- [1] N. G. Anderson and S. Bhanja, *Field-Coupled Nanocomputing: Paradigms, Progress, and Perspectives*, 1st ed. New York: Springer, 2014.
- [2] J. Timler and C. S. Lent, "Power gain and dissipation in quantum-dot cellular automata," *J Appl Phys*, vol. 91, pp. 823-831, 2002.
- [3] V. Arima, M. Iurlo, L. Zoli, S. Kumar, M. Piacenza, F. D. Sala, *et al.*, "Toward quantum-dot cellular automata units: Thiolated-carbazole linked bisferrocenes," *Nanoscale*, vol. 4, pp. 813-823, 2012.
- [4] T. R. Huff, H. Labidi, M. Rashidi, M. Koleini, R. Achal, M. H. Salomons, *et al.*, "Atomic White-Out: Enabling Atomic Circuitry through Mechanically Induced Bonding of Single Hydrogen Atoms to a Silicon Surface," *ACS Nano*, vol. 11, pp. 8636-8642, 2017.
- [5] I. Eichwald, A. Bartel, J. Kiermaier, S. Breitkreutz, G. Csaba, D. Schmitt-Landsiedel, *et al.*, "Nanomagnetic Logic: Error-Free, Directed Signal Transmission by an Inverter Chain," *IEEE TMag*, vol. 48, pp. 4332-4335, 2012.
- [6] J. Huang, M. Momenzadeh, L. Schiano, M. Ottavi, and F. Lombardi, "Tile-based QCA Design using Majority-like Logic Primitives," *JETC*, vol. 1, pp. 163-185, 2005.
- [7] J. Huang, M. Momenzadeh, and F. Lombardi, "Design of sequential circuits by quantum-dot cellular automata," *Microelectronics Journal*, vol. 38, pp. 525-537, 2007.
- [8] L. Lee Ai, A. Ghazali, S. C. T. Yan, and F. Chau Chien, "Sequential circuit design using Quantum-dot Cellular Automata (QCA)," in *IEEE ICCAS*, 2012, pp. 162-167.
- [9] M. Ottavi, S. Pontarelli, E. P. DeBenedictis, A. Salsano, S. Frost-Murphy, P. M. Kogge, *et al.*, "Partially Reversible Pipelined QCA Circuits: Combining Low Power With High Throughput," *IEEE TNANO*, vol. 10, pp. 1383-1393, 2011.