# Power-Gating Scheme and Modeling of Near-Threshold Adiabatic Flip-Flops

## Fangfang Zang, Jianping Hu\*, Wei Cheng

Faculty of Information Science and Technology, Ningbo University, 818 Fenghua Road, Ningbo 315211, Zhejiang Province, China \*Corresponding author, e-mail: nbhjp@yahoo.com.cn

#### Abstract

Technology scaling increases the density and performance of nanometer circuits, but it also leads to large dynamic and leakage dissipations. This paper presents a power-gating scheme for adiabatic flip-flops operating in the near-threshold region to reduce both dynamic and leakage dissipations. The power-gated logic blocks are realized with complementary pass-transistor adiabatic logic using the dual-threshold technique to reduce active leakage dissipation. Improved complementary pass-transistor adiabatic logic circuits are used as the two-phase power-gating switches to reduce sleep leakage dissipation. An analytical model for power-gated adiabatic sequential circuits is constructed, and the energy overhead of the proposed power-gating scheme is analyzed in detail. Near-threshold computing is verified on a power-gated adiabatic mode-10 counter. The results show that the proposed power-gating technique is suitable for adiabatic units operating in the near-threshold region.

**Keywords**: near-threshold computing, adiabatic computing, power-gating scheme, energy-efficient designs, nanometer circuits

#### Copyright © 2014 Institute of Advanced Engineering and Science. All rights reserved.

#### 1. Introduction

Before CMOS processes were scaled to the 130 nm node, dynamic power consumption was the main concern in low-power design, since it dominated total power dissipation [1], [2]. Adiabatic logic circuits, such as complementary pass-transistor adiabatic logic (CPAL) [3], clocked adiabatic logic (CAL) [4], and efficient charge recovery logic (ECRL) [5], achieve low dynamic power dissipation.

As the feature size of integrated circuits continues to decrease, the leakage dissipation caused by leakage currents of MOS devices is gradually catching up with the dynamic power dissipation [6]-[8]. Several leakage reduction techniques, such as multi-threshold CMOS (MTCMOS) power gating, dual-threshold CMOS (DTCMOS), and input vector control (IVC), have been proposed in recent years to reduce sub-threshold leakage [8]-[10]. As in conventional CMOS circuits, leakage currents of MOS devices cause leakage dissipation in adiabatic circuits [11]. Power-gating techniques have been introduced to reduce the energy loss of adiabatic circuits during idle periods [12].

Voltage scaling is an effective method to reduce the power dissipation of adiabatic circuits, because the dynamic energy is reduced quadratically and the leakage dissipation decreases linearly as the supply voltage scales down [13]. Several near-threshold adiabatic circuits have been proposed recently, in which the supply voltage is scaled to the medium-voltage region [14]. However, the previously reported near-threshold adiabatic circuits are based on pre-layout simulations and are mostly basic adiabatic circuits without power gating.

In this work, we focus on the layout implementations and near-threshold computing of adiabatic flip-flops with power-gating techniques. All circuits are simulated using the NCSU PDK 45 nm technology, with the supply voltage varied from 0.6 V to 1.1 V in 0.1 V steps. This paper is organized as follows. Section 2 introduces the power-gating scheme suitable for near-threshold operation of CPAL sequential circuits. Section 3 presents the modeling and analysis of the energy overhead of the power-gating scheme. Section 4 presents the layout implementation and post-layout simulation results of the power-gated adiabatic mode-10 counter. Section 5 summarizes the work.

#### 2. Power-Gating Scheme Suitable for Near-Threshold Regions

The basic CPAL buffer is shown in Figure 1 [3]. It is composed of two main parts: the logic function circuit and the load-driving circuit. The logic function circuit consists of four NMOS transistors (N5–N8) in a complementary pass-transistor logic (CPL) configuration. The load-driving circuit consists of a pair of transmission gates (N1, P1 and N2, P2). The clamp transistors N3 and N4 ground the un-driven output node. A single CPAL gate is supplied by a single-phase power-clock, while cascaded CPAL gates are driven by two-phase power-clocks ($pc_1$ and $pc_2$).

The most straightforward application of the DTCMOS technique is to partition a logic cell into critical and non-critical regions, and to use fast low-$V_t$ devices only in the critical paths to meet performance goals. Figure 2 (a) shows a basic CPAL gate, while Figure 2 (b) shows a DTCMOS CPAL gate. The CPL function blocks of the two-input gates used for both basic CPAL and DTCMOS CPAL are shown in Figure 2 (c).



Figure 1. CPAL buffer/inverter, buffer chain, and its simulation waveforms



Figure 2. (a) Basic CPAL gate, (b) DTCMOS CPAL gate, and (c) CPL function blocks for two-input gates

In Figure 2 (a), the two transistors (N3 and N4) enclosed by dotted lines form the non-critical region and are replaced with high-$V_t$ devices, as shown in Figure 2 (b). The leakage currents of the two high-$V_t$ transistors are lower than those of low-$V_t$ ones, so their leakage dissipation can be reduced.

Adiabatic flip-flops with data-retention techniques should be constructed to store their states in sleep mode. As shown in Figure 1, during $t_1$ and $t_2$, when the power-clock $clk_1$ is at low level, the node X stays at a level of $V_{DD} - V_{TN}$, where $V_{DD}$ is the peak voltage of the power-clocks and $V_{TN}$ is the threshold voltage of the NMOS transistors. Therefore, the nodes X and Xb can hold their states. An adiabatic flip-flop with a data-retention function based on this principle has been reported in [12]. As shown in Figure 3, it is realized with two-stage cascaded CPAL buffers.

The active enable terminal (*Act*) and refresh enable terminal (*Ref*) are added for power-gating operation. Because the flip-flops may stay in sleep mode for a long time, *Ref* is added to refresh their stored values so that leakage currents do not corrupt their states. The data-retention adiabatic flip-flop works in three modes. In hold mode, both *Act* and *Ref* are low, so the flip-flop holds its state on the nodes $X_M(X)$ and $Xb_M(Xb)$. In refresh mode, *Act* is low and *Ref* is high: *Q* or *Qb* follows the power-clock under the control of the stored value, and the nodes $X_M(X)$ and $Xb_M(Xb)$ are refreshed. In active mode, *Act* is high and *Ref* is low, so the flip-flop operates normally. At least one of *Act* and *Ref* must be low to prevent logic faults.



Figure 3. Adiabatic *D* flip-flops with data-retention function.

The power-gating scheme for adiabatic sequential circuits and its simulated waveforms are shown in Figure 4. The power-gating switches are inserted between the power-clocks ($clk_1$ and $clk_2$) and the virtual power-clocks ($pc_1$ and $pc_2$). In sleep mode, the power-gating switches disconnect the adiabatic block from the power-clocks, so that the virtual power-clocks are shut down. The power-gated adiabatic blocks consist of the flip-flops with data-retention function and adiabatic combinational logic circuits. The refresh control circuit ensures that refreshing is carried out only during sleep mode, which prevents logic faults in the data-retention flip-flops.

Improved CPAL circuits are used as the power-gating switches, as shown in Figure 5. The bootstrapped NMOS transistors (N1, N2) of the power-gating switches usually use large sizes to attain enough driving ability, so the node *X* would be only weakly bootstrapped, resulting in a short-circuit current from *clk* to GND. Therefore, in the improved CPAL power-gating switches, two transistors (N9 and N10) are added to ground the un-driven node (*X* or *Xb*).

The dynamic energy of conventional CMOS circuits decreases quadratically as the supply voltage scales down [13]. Similar to conventional CMOS circuits, voltage scaling is also an effective method to reduce the power dissipation of adiabatic circuits. The power-gating switches operating in the near-threshold region should be realized using energy-efficient adiabatic drivers. As shown in Figure 5, the internal nodes (*X* and *Xb*) can be bootstrapped to a voltage higher than $V_{DD} - V_{TN}$, which reduces the turn-on resistance even at a low supply voltage. The switch is therefore energy efficient for near-threshold operation.

The operation modes of the power-gated adiabatic sequential circuit are summarized in Table 1. The power-gated sequential circuits work in three modes under the control of *Active* (active control signal) and *Refresh* (refresh control signal).



Figure 4. (a) Power-gating scheme for adiabatic circuits, and (b) its simulated waveforms



Figure 5. Power-gating switch using improved CPAL, and its simulated waveforms

When both *Active* and *Refresh* are 0, the virtual power-clocks are shut down, and thus the power-gated circuits work in sleep (hold) mode. In this operating mode, the nodes $X_M$ and $Xb_M$ of the flip-flops hold their states regardless of the inputs, because both *Act* and *Ref* are 0.

When *Active* is 0 and *Refresh* is 1, the virtual power-clocks follow the power-clocks. Because *Act* is 0 and *Ref* is set to 1, *Q* or *Qb* follows the virtual power-clock under the control of the stored value, and the nodes $X_M$ and $Xb_M$ are then refreshed by the rising *Ref* and *Q* (or *Qb*). Therefore, the power-gated circuits operate in refresh mode.

As long as *Active* is 1, the virtual power-clocks follow the power-clocks. Because *Ref* is set to low level and *Act* is 1, the flip-flops operate normally. Therefore, the power-gated circuits operate in active mode.

Table 1. Operation modes of the power-gated adiabatic sequential circuits

| Active | Refresh | Act | Ref | Operation modes |
|--------|---------|-----|-----|-----------------|
| 0      | 0       | 0   | 0   | Sleep (Hold)    |
| 0      | 1       | 0   | 1   | Refresh         |
| 1      | 0       | 1   | 0   | Active          |
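The mode decoding in Table 1 can be sketched as a small function. This is an illustrative sketch only: the name `operation_mode` is hypothetical, and masking *Refresh* while *Active* is high is an assumption drawn from the statement that refreshing is carried out only during sleep mode. The actual refresh control circuit is not specified at this level of detail.

```python
from typing import Tuple


def operation_mode(active: int, refresh: int) -> Tuple[int, int, str]:
    """Return (Act, Ref, mode) for the control inputs, following Table 1."""
    if active not in (0, 1) or refresh not in (0, 1):
        raise ValueError("active and refresh must be 0 or 1")
    act = active
    # Assumption: Refresh is masked while Active is high, so that
    # Act and Ref are never high at the same time.
    ref = refresh & (1 - active)
    if act:
        mode = "Active"
    elif ref:
        mode = "Refresh"
    else:
        mode = "Sleep (Hold)"
    return act, ref, mode


if __name__ == "__main__":
    # Enumerate all input combinations, including the one not listed in Table 1.
    for active in (0, 1):
        for refresh in (0, 1):
            print(active, refresh, operation_mode(active, refresh))
```

The three rows of Table 1 are reproduced directly. The fourth combination (both inputs high) is resolved to active mode under the masking assumption above.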

#### 3. Modeling and Analysis for Energy Overhead of Power-Gating Circuits

Power gating introduces additional energy overhead, so an analytical model of the power-gating switches should be constructed. The power-gating operation schedule is shown in Figure 6.

The additional energies $E_{ON}$ and $E_{OFF}$ are needed to turn the switches on and off between sleep and active modes. For turning the switches on or off, the node X or Xb is charged from 0 V to $V_{DD} - V_{TN}$. Therefore, the energy losses for turning on and turning off the two switches are, respectively,

$$E_{ON} = 2 \times \frac{1}{2} C_X (V_{DD} - V_{TN})^2 \quad \text{and} \quad E_{OFF} = 2 \times \frac{1}{2} C_{Xb} (V_{DD} - V_{TN})^2$$
(1)

where $C_X$ and $C_{Xb}$ are the capacitances of the nodes X and Xb, respectively. $E_{ON}$ and $E_{OFF}$ are approximately proportional to the channel widths of the transistors N1 and N2.
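As a quick numerical illustration of (1), the sketch below evaluates the switching energy at the two ends of the simulated supply range. The node capacitance (10 fF) and threshold voltage (0.3 V) are hypothetical placeholder values chosen only for illustration, and `switching_energy` is a hypothetical helper name.

```python
def switching_energy(c_node: float, v_dd: float, v_tn: float) -> float:
    """Energy from (1) for the two switches: 2 * (1/2) * C * (V_DD - V_TN)^2."""
    if v_dd <= v_tn:
        raise ValueError("v_dd must exceed v_tn")
    return 2.0 * 0.5 * c_node * (v_dd - v_tn) ** 2


if __name__ == "__main__":
    C_NODE = 10e-15  # hypothetical capacitance of node X, in farads
    V_TN = 0.3       # hypothetical NMOS threshold voltage, in volts
    for v_dd in (0.6, 1.1):
        energy = switching_energy(C_NODE, v_dd, V_TN)
        print(f"V_DD = {v_dd} V -> E_ON = {energy:.3e} J")
```

Because the energy depends on $(V_{DD} - V_{TN})^2$, the switching overhead falls faster than $V_{DD}^2$ as the supply approaches the threshold voltage.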



Figure 6. Operation schedule and energy loss of the power-gating circuits

The equivalent circuit of the power-gating switch in active mode is shown in Figure 7(a). In active mode, the TG (transmission gate) (N1, P1) is on, and $R_1$ is its turn-on resistance. The TG (N2, P2) is off, and thus its energy loss can be ignored. $C_{pc}$ is the capacitance of the node *pc*. Because $C_{pc}$ is much smaller than $C_{AL}$, it can also be ignored. The power-gated logic block is modeled by a capacitor $C_{AL}$ and a resistor $R_{AL}$, which are given by

$$R_{AL} = E_{AL} f / I^2 \quad \text{and} \quad C_{AL} = \sqrt{2}\, I / (\pi V_{DD} f)$$
(2)

where *I* is the rms current through  $R_{AL}$ ,  $E_{AL}$  is the energy loss per cycle of the power-clock *pc*, and *f* and  $V_{DD}$  are frequency and peak-to-peak voltage of *pc*, respectively.
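The extraction in (2) can be sketched as follows. The sketch reads the second expression as $C_{AL} = \sqrt{2}\,I/(\pi V_{DD} f)$, which is what an ideal sinusoidal power-clock with peak-to-peak amplitude $V_{DD}$ driving a pure capacitor would give; the sinusoidal waveform is an assumption, and `load_model` is a hypothetical helper name.

```python
import math
from typing import Tuple


def load_model(i_rms: float, e_al: float, f: float, v_dd: float) -> Tuple[float, float]:
    """Return (R_AL, C_AL) of the power-gated block from simulated quantities.

    i_rms : rms current through the block (A)
    e_al  : energy loss per power-clock cycle (J)
    f     : power-clock frequency (Hz)
    v_dd  : peak-to-peak power-clock voltage (V)
    """
    if min(i_rms, f, v_dd) <= 0:
        raise ValueError("i_rms, f and v_dd must be positive")
    # Average dissipated power E_AL * f equals I^2 * R_AL.
    r_al = e_al * f / i_rms ** 2
    # Assumes a sinusoidal power-clock charging a pure capacitor.
    c_al = math.sqrt(2.0) * i_rms / (math.pi * v_dd * f)
    return r_al, c_al
```

A round-trip check is a reasonable sanity test: start from a known capacitance and resistance, compute the rms current and per-cycle loss they would produce under the sinusoidal assumption, and confirm that `load_model` recovers the original values.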

Since the gate-to-source voltage $V_{GS, N1}$ of the transistor N1 is almost constant ($V_{DD} - V_{TN}$) because of bootstrapping, and P1 uses a small device size that only supplements the bootstrapped N1, the turn-on resistance $R_1$ can be approximated as

$$R_1 = 1/[\mu C_{OX}(W/L)(V_{DD} - V_{TN})]$$
(3)

where  $\mu$  is the carrier mobility, and W and L are the channel width and length of the transistor N1, respectively. The energy loss per cycle of the power-gating adiabatic circuits in active time including the two switches can be written as

$$E_{active} = 2\left(\frac{2(R_1 + R_{AL})C_{AL}}{T/2}\right)C_{AL}V_{DD}^{2}$$
(4)

where *T* is the period of the power-clocks. According to (3) and (4), the switch contribution to $E_{active}$, namely the $R_1$ term, is inversely proportional to the channel width of the transistor N1.
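Equations (3) and (4) can be evaluated together as in the following sketch. The function names and any numerical values passed to them are hypothetical; `mu_cox` stands for the product $\mu C_{OX}$ and `w_over_l` for the ratio $W/L$ of N1.

```python
def turn_on_resistance(mu_cox: float, w_over_l: float, v_dd: float, v_tn: float) -> float:
    """Turn-on resistance R_1 of the bootstrapped switch, following (3)."""
    if v_dd <= v_tn:
        raise ValueError("v_dd must exceed v_tn")
    return 1.0 / (mu_cox * w_over_l * (v_dd - v_tn))


def active_energy(r_1: float, r_al: float, c_al: float, v_dd: float, period: float) -> float:
    """Energy loss per cycle in active time for the two switches, following (4)."""
    ramp_time = period / 2.0  # charging or discharging interval T/2
    return 2.0 * (2.0 * (r_1 + r_al) * c_al / ramp_time) * c_al * v_dd ** 2


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    r_narrow = turn_on_resistance(mu_cox=2e-4, w_over_l=10, v_dd=0.8, v_tn=0.3)
    r_wide = turn_on_resistance(mu_cox=2e-4, w_over_l=20, v_dd=0.8, v_tn=0.3)
    for r_1 in (r_narrow, r_wide):
        print(r_1, active_energy(r_1, r_al=100.0, c_al=1e-12, v_dd=0.8, period=1e-8))
```

Doubling the width halves $R_1$, but $E_{active}$ does not halve, because the $R_{AL}$ term of the logic block is unaffected by the switch size.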



Figure 7. Equivalent circuits of power-gating switches: (a) active mode and (b) sleep mode

The equivalent circuit of the power-gating switch in sleep mode is shown in Figure 7(b). In sleep mode, the TG (N1, P1) is off, and $I_{leakage}$ is its leakage current, which should be taken into account because of the large transistor sizes. The TG (N2, P2) is on, $R_2$ is its turn-on resistance, and $C_{pcb}$ is the capacitance of the node *pcb*. The energy loss per cycle in sleep time introduced by the two switches can be written as

$$E_{sleep} = 2I_{leakage}T + 2(\frac{2R_2C_{pcb}}{T/2})C_{pcb}V_{DD}^{2}$$
(5)

In (5), the first term is the leakage energy loss caused by the TG (N1, P1), which is proportional to its channel width. The second term is the charging and discharging energy loss for the node *pcb*, which is in inverse proportion to the channel width of the TG (N2, P2).

The average energy loss per cycle of the power-gated adiabatic circuits, including the two switches, can be calculated as

$$E_{AV} = \frac{E_{active} T_{active} / T + E_{sleep} T_{sleep} / T + E_{ON} + E_{OFF}}{(T_{active} + T_{sleep}) / T}$$
(6)

where $T_{active}$ is the active time and $T_{sleep}$ is the sleep time. The active ratio $\alpha$ is defined as $T_{active}/(T_{active}+T_{sleep})$. The energy savings depend on the active ratio, the switch sizes, and the sleep time. When $T_{sleep}$ is long enough, the switching energy loss ($E_{ON}$ and $E_{OFF}$) can be ignored, and the average energy dissipation $E_{AV}$ per cycle of the power-gated adiabatic circuits, including the two power-gating switches, can be written as

$$E_{AV} = E_{active}\,\alpha + E_{sleep}(1-\alpha)$$
(7)
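The step from (6) to (7) can be made explicit. As a shorthand that is not part of the original notation, let $N = (T_{active} + T_{sleep})/T$ denote the number of power-clock cycles in one active-plus-sleep schedule. Dividing each term of the numerator of (6) by $N$ and using the definition of the active ratio $\alpha$ gives

$$E_{AV} = \alpha E_{active} + (1 - \alpha) E_{sleep} + \frac{E_{ON} + E_{OFF}}{N}$$

The last term is the switching overhead spread over $N$ cycles. It shrinks as $1/N$, so for a schedule that spans many power-clock cycles it becomes negligible and the expression reduces to (7).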

$E_{active}$ can be reduced by increasing the channel width of the transistor N1 according to (3) and (4), whereas $E_{sleep}$ can be reduced by reducing its channel width according to (5). Therefore, an optimal size of the transistor N1 can be chosen to minimize the energy overhead. For the transmission gate (N2, P2), an optimal size can likewise be chosen to minimize the energy dissipation according to (1) and (5).
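The sizing trade-off for N1 can be sketched with a lumped model. Assume, for illustration only, that $E_{active} = a + b/W$ (the $R_1$ term of (4) scales as $1/W$) and $E_{sleep} = kW + d$ (the leakage term of (5) scales with $W$), where $a$, $b$, $k$ and $d$ are hypothetical lumped constants that do not depend on the width $W$ of N1. Substituting into (7) and setting the derivative with respect to $W$ to zero gives $W^{*} = \sqrt{\alpha b / ((1-\alpha) k)}$.

```python
import math


def average_energy(w: float, alpha: float, a: float, b: float, k: float, d: float) -> float:
    """E_AV from (7) under the lumped model E_active = a + b/w, E_sleep = k*w + d."""
    e_active = a + b / w
    e_sleep = k * w + d
    return alpha * e_active + (1.0 - alpha) * e_sleep


def optimal_width(alpha: float, b: float, k: float) -> float:
    """Width minimizing average_energy: sqrt(alpha * b / ((1 - alpha) * k))."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if b <= 0.0 or k <= 0.0:
        raise ValueError("b and k must be positive")
    return math.sqrt(alpha * b / ((1.0 - alpha) * k))


if __name__ == "__main__":
    # Hypothetical constants in arbitrary units, for illustration only.
    alpha, a, b, k, d = 0.4, 0.0, 6.0, 1.0, 0.0
    w_opt = optimal_width(alpha, b, k)
    print("optimal width:", w_opt)
    print("E_AV at optimum:", average_energy(w_opt, alpha, a, b, k, d))
```

Under this model the optimal width grows with the active ratio: a block that is active most of the time favors a wide, low-resistance switch, while a block that mostly sleeps favors a narrow, low-leakage one.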

#### 4. Results and Discussions

In order to show the energy efficiency of the proposed power-gating scheme for adiabatic circuits in the near-threshold region, a mode-10 counter based on the adiabatic flip-flop with the data-retention function is verified, as shown in Figure 8 (a). Figure 8 (b) shows the layout of the mode-10 counter including the power-gating switches, which is implemented using the NCSU PDK 45 nm technology.



Figure 8. (a) The adiabatic mode-10 counter, and (b) its layout with power-gating switches

The post-layout simulations are carried out by varying the supply voltage from 0.6 V to 1.1 V in 0.1 V steps. For each supply voltage, the maximum operating frequency at which the mode-10 counter still functions correctly is obtained, as shown in Figure 9. The maximum operating frequency decreases as the supply voltage scales down.

In order to investigate the performance of the mode-10 counter with power-gating switches in the near-threshold region, the energy dissipation is also obtained using HSPICE post-layout simulations, as shown in Figure 9.



Figure 9. Max operating frequency and energy dissipation of the mode-10 counter including the power-gating switches at various supply voltages. The active ratio is 0.4

Figure 10 shows the energy dissipation of the mode-10 counter including the power-gating switches at different frequencies. The post-layout simulation results in Figure 10 show that the energy savings of the power-gated mode-10 counter increase as the active ratio decreases. From (7), the energy dissipation of the power-gated mode-10 counter decreases linearly with the active ratio, because $E_{sleep}$ can be ignored.



Figure 10. Energy dissipation of the mode-10 counter including the power-gating switches at different frequencies. $\alpha$ is the active ratio

#### 5. Conclusion

With technology scaling, leakage dissipation has become a potentially dominant component of the total power dissipation in nanometer circuits, so that both dynamic and leakage dissipations are large. Voltage scaling is an effective method to reduce the power dissipation of adiabatic circuits, because the dynamic energy is reduced quadratically and the leakage dissipation decreases linearly as the supply voltage scales down.

In this paper, the layout implementations and near-threshold computing of adiabatic flip-flops with power-gating techniques have been presented to reduce both dynamic and leakage dissipations in nanometer circuits. The improved CPAL circuits are used as the two-phase power-gating switches to reduce the sleep leakage dissipation of the adiabatic circuits. The power-gated logic blocks are realized with CPAL circuits using the dual-threshold CMOS technique to reduce the active leakage dissipation of the adiabatic circuits. A near-threshold counter based on the CPAL circuits in the NCSU PDK 45 nm technology is used to verify the proposed power-gating technique. Both active and sleep leakage dissipations are effectively reduced by using the power-gating scheme and the dual-threshold CMOS technique. The results show that the proposed power-gating technique is suitable for adiabatic units operating in the near-threshold region.

#### Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61271137 and No. 61071049).

#### References

- [1] Kim NS, Austin T, Blaauw D, Mudge T, Flautner K, Hu JS, Irwin MJ, Kandemir M, Narayanan V. Leakage current: Moore's Law meets static power. *Computer*. 2003; 36(12): 68-75.
- [2] Zhang WQ, Su L, Zhang Y, Li LF, Hu JP. Low-leakage flip-flops based on dual-threshold and multiple leakage reduction techniques. *Journal of Circuits, Systems and Computers*. 2011; 20(1): 147-162.
- [3] Hu JP, Xu TF, Li H. A lower-power register file based on complementary pass-transistor adiabatic logic. *IEICE Trans. on Information and Systems*. 2005; E88-D(7): 1479-1485.
- [4] Maksimovic D, Oklobdzija VG, Nikolic B, Current KW. Clocked CMOS adiabatic logic with integrated single-phase power-clock supply. *IEEE Trans. on Very Large Scale Integration (VLSI) Systems*. 2000; 8(4): 460-463.
- [5] Moon Y, Jeong DK. An efficient charge recovery logic circuit. *IEEE J. Solid-State Circuits*. 1996; 31(4): 514-522.
- [6] Hu JP, Yu XY. Low voltage and low power pulse flip-flops in nanometer CMOS processes. *Current Nanoscience*. 2012; 8(1): 102-107.
- [7] Agarwal A, Kim CH, Mukhopadhyay S, Roy K. Leakage in nano-scale technologies: Mechanisms, impact and design considerations. *41st Annual Design Automation Conference*. 2004: 6-11.
- [8] Fallah F, Pedram M. Standby and active leakage current control and minimization in CMOS VLSI circuits. *IEICE Trans. on Electronics*. 2005; E88-C(4): 509-519.
- [9] Kim KK, Kim YB, Choi M, Park N. Leakage minimization technique for nanoscale CMOS VLSI. *IEEE Design and Test of Computers*. 2007; 24(4): 322-330.
- [10] Rana AK, Chand N, Kapoor V. Gate leakage reduction through the use of a gate-to-source/drain nonoverlapped metal–oxide–semiconductor field-effect transistor structure. *Journal of Nanoengineering and Nanosystems*. 2011; 224(4): 173-181.
- [11] Zhang WQ, Yu L, Hu JP. Dual-threshold and gate-length biasing of pass-transistor adiabatic logic with PMOS pull-up configuration. *Advances in Information Sciences and Service Sciences*. 2012; 4(12): 98-106.
- [12] Zhou D, Hu JP, Wang L. Adiabatic flip-flops for power-down applications. *IEEE International Symposium on Integrated Circuits*. 2007: 493-496.
- [13] Markovic D, Wang CC, Alarcon LP, Liu TT, Rabaey JM. Ultralow-power design in near-threshold region. *Proceedings of the IEEE*. 2010; 98(2): 237-252.
- [14] Wu WB, Hu JP. Near-threshold computing of CAL-CPL circuits. *Journal of Low Power Electronics*. 2011; 7(3): 393-402.