## Solutions to exam 2019-04-25 in TSTE85 Low Power Electronics

1.

a) *Transition strategies* aim at solving the problem of when a component should be switched between different modes.

*Load-change strategies* aim at modifying the functionality of a component so it can be in low-power modes more often.

*Adaptation strategies* aim at modifying the software to allow novel, power-saving uses of components.

- b) Use sleep modes, select an architecture with few high-frequency devices, amplify signals early in the signal chain, and check whether special techniques like companding or divide-and-conquer can be applied.
- c) There is no power wasted in driving the extra stray capacitance of the pad, package, and board.
- d) Dynamic power consumption

When a digital circuit is switching, parasitic capacitances are charged or discharged. The charging and discharging consume energy which is dissipated in the NMOS net when a node is pulled down, and in the PMOS net when a node is pulled up. The dynamic power consumption is in proportion to the switching activity, the parasitic capacitance and the square of the power supply voltage.
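The proportionality stated above can be illustrated with a minimal sketch. It assumes the usual expression $P = \alpha C V_{DD}^2 f$; the function name and all numeric values are hypothetical and are not taken from the exam.

```python
def dynamic_power(alpha, c_load, v_dd, f_clk):
    """Average dynamic power in watts: alpha * C * V_DD^2 * f."""
    return alpha * c_load * v_dd ** 2 * f_clk

# Hypothetical example values, not taken from the exam.
p_nominal = dynamic_power(alpha=0.25, c_load=10e-15, v_dd=1.2, f_clk=100e6)
p_scaled = dynamic_power(alpha=0.25, c_load=10e-15, v_dd=0.6, f_clk=100e6)

print(f"nominal supply: {p_nominal * 1e6:.3f} uW")
print(f"half supply:    {p_scaled * 1e6:.3f} uW")
print(f"ratio:          {p_nominal / p_scaled:.1f}")
```

Halving the supply voltage at unchanged activity, capacitance, and frequency reduces the dynamic power by a factor of four, which is the square-law dependence described above.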

Short circuit current

Since the rise and fall times of the digital nodes are finite, the PMOS and NMOS transistors conduct simultaneously while the input voltage is between the threshold voltages of the MOSFETs. Hence, a short circuit current flows. The resulting power consumption depends on the rise and fall times of the input signals, the power supply voltage, and the current-voltage characteristics of the transistors.

Static leakage current

Even in a static state, digital circuits consume power, because the transistors have leakage currents in the off state. When a digital node is high there is a leakage current through the NMOS net, and when a node is low there is a leakage current through the PMOS net.

e) The cost of pipelining is extra hardware in the form of registers. The registers increase the area of the circuit and the length of the critical path. With pipelining, the throughput of the circuit can be increased, or the power supply voltage can be lowered so that the circuit meets a throughput requirement below the increased throughput. Lowering the power supply voltage saves power. The advantages of pipelining are that it can be used to save power at the cost of an increased area, and that it reduces the amount of glitching. The drawback of pipelining is that the overhead may be large in terms of speed, area, and power if many registers are introduced, e.g., when an extensive level of pipelining is used. Beyond that level, further pipelining increases the power consumption.

The cost of interleaving is a larger area, since multiple copies of the original circuit are used, combined with one level of registers and one level of multiplexers. The multiplexers and registers increase the length of the critical path. With interleaving, the throughput of the circuit can be increased, or the power supply voltage can be lowered so that the circuit meets a throughput requirement below the increased throughput. Lowering the power supply voltage saves power. The advantage of interleaving is that the propagation delay increases less than with pipelining. The drawback is the large area increase.

2. The probability that the output of the NAND gate is low is  $P_0 = P_a P_b P_c$ , yielding the probability that the output is high as  $P_1 = 1-P_0$ . The transition activity is  $\alpha_{01} = P_0 P_1 = P_0(1-P_0)$ . Find maximum transition activity:

$$\frac{d\alpha_{01}}{dP_0} = 1 - 2P_0 = 0 \Rightarrow P_0 = \frac{1}{2}, \frac{d^2\alpha_{01}}{dP_0^2} = -2 < 0 \Rightarrow \text{ this is a max value}$$

Hence input probabilities satisfying  $P_a P_b P_c = 1/2$  yield the highest transition activity,  $\alpha_{01} = 1/4$ . With equal input probabilities this gives  $P_a = P_b = P_c = 2^{-1/3} \approx 0.79$ .
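The maximum can also be checked numerically. The sketch below scans $P_0$ on a grid (the grid resolution is an arbitrary choice) and computes the equal-probability input value that gives $P_a P_b P_c = 1/2$.

```python
def transition_activity(p0):
    """0->1 transition activity alpha_01 = P0 * (1 - P0)."""
    return p0 * (1.0 - p0)

# Scan P0 on a fine grid and pick the value with the highest activity.
grid = [i / 1000 for i in range(1001)]
p0_best = max(grid, key=transition_activity)
alpha_max = transition_activity(p0_best)

# Equal input probabilities that give P_a * P_b * P_c = 1/2.
p_input = 0.5 ** (1.0 / 3.0)

print(f"P0 at maximum: {p0_best}")
print(f"maximum alpha_01: {alpha_max}")
print(f"P_a = P_b = P_c = {p_input:.2f}")
```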

3. If the enable signal only changes state when the clock is low, then the gated clock will be correct. If the enable signal changes state while the clock is high, then the gated clock may have a pulse that starts too late or ends too early, as shown in the figure below, where  $\phi$  is the clock, *E* the enable signal, and  $\phi_{gated}$  is the gated clock. These effects degrade or destroy the timing of the registers.
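The effect can be reproduced with a small sample-based simulation. This is a sketch that assumes the clock is gated with a plain AND gate, which the text implies but does not state; the sample counts and edge positions are hypothetical.

```python
def gated_clock(clk, enable):
    """AND-gate clock gating, evaluated sample by sample."""
    return [c & e for c, e in zip(clk, enable)]

def pulse_widths(signal):
    """Lengths (in samples) of each run of consecutive ones."""
    widths, run = [], 0
    for s in signal:
        if s:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths

# Clock: 4 periods, each 4 samples low followed by 4 samples high.
clk = ([0] * 4 + [1] * 4) * 4

# Safe enable: rises at sample 8 and falls at sample 24, both while clk is low.
safe_enable = [1 if 8 <= t < 24 else 0 for t in range(len(clk))]

# Unsafe enable: rises at sample 6 and falls at sample 22, both while clk is high.
unsafe_enable = [1 if 6 <= t < 22 else 0 for t in range(len(clk))]

print("safe:  ", pulse_widths(gated_clock(clk, safe_enable)))
print("unsafe:", pulse_widths(gated_clock(clk, unsafe_enable)))
```

With the safe enable, every gated pulse keeps the full width of 4 samples. With the unsafe enable, the first pulse starts too late and the last pulse ends too early, so both are only 2 samples wide.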



4.

a) 
$$I_D = 0.6 \cdot 10^{-6} e^{-21|V_{GTn}|} = k e^{-|V_{GT}|/(n_s V_{\Theta})} \Longrightarrow n_s V_{\Theta} = 21^{-1} \Longrightarrow S = n_s V_{\Theta} \ln(10) \approx 110 \text{ mV/decade}$$

b) 
$$P_{\text{rel}} = 1 - \frac{I_{D1}V_{DD}}{I_{D0}V_{DD}} = 1 - \frac{ke^{-21|V_{GT1}|}V_{DD}}{ke^{-21|V_{GT0}|}V_{DD}} = 1 - e^{21(|V_{GT0}| - |V_{GT1}|)} = 1 - e^{21(|-0.3| - |-0.4|)} = 1 - e^{-2.1} \approx 88\%$$
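The arithmetic in a) and b) can be checked with a short script. The exponent coefficient 21 V$^{-1}$ and the gate overdrive values $-0.3$ V and $-0.4$ V are taken from the expressions above; the function names are illustrative only.

```python
import math

EXP_COEFF = 21.0  # 1/V, exponent coefficient from the given I_D expression

def subthreshold_slope_mv_per_decade(exp_coeff):
    """S = n_s * V_theta * ln(10), with n_s * V_theta = 1 / exp_coeff."""
    return math.log(10) / exp_coeff * 1e3

def relative_leakage_saving(v_gt0, v_gt1, exp_coeff=EXP_COEFF):
    """1 - I_D1 / I_D0 for I_D = k * exp(-exp_coeff * |V_GT|)."""
    return 1.0 - math.exp(exp_coeff * (abs(v_gt0) - abs(v_gt1)))

slope = subthreshold_slope_mv_per_decade(EXP_COEFF)
saving = relative_leakage_saving(-0.3, -0.4)
print(f"S = {slope:.0f} mV/decade, saving = {saving:.0%}")
```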

c) For example, decreasing the bulk potential of the NMOSFET would increase  $V_{SBn}$ , which increases  $V_{Tn}$ .

5.

- a)  $t_0 = \max(T_1, T_2) = \max(t_{add} + t_{mul}, 2t_{add} + t_{mul}) = 2t_{add} + t_{mul} = 2.2 \text{ ns}$
- b) Retimed algorithm



The new critical path is  $t_1 = t_{add} + t_{mul} = 1.7$  ns
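The two critical paths can be computed as the longest register-to-register path. The sketch below uses $t_{add} = 0.5$ ns and $t_{mul} = 1.2$ ns; the path lists are an assumption that only encodes the operator counts appearing in the expressions for $t_0$ and $t_1$, not the full signal-flow graph.

```python
T_ADD = 0.5  # ns, adder delay at the original supply voltage
T_MUL = 1.2  # ns, multiplier delay at the original supply voltage

def path_delay(n_add, n_mul, t_add=T_ADD, t_mul=T_MUL):
    """Delay of a path with n_add adders and n_mul multipliers in series."""
    return n_add * t_add + n_mul * t_mul

# Original algorithm: two paths, given as (adders, multipliers).
original_paths = [(1, 1), (2, 1)]
# Retimed algorithm: the longest path has one adder and one multiplier (assumed).
retimed_paths = [(1, 1)]

t0 = max(path_delay(a, m) for a, m in original_paths)
t1 = max(path_delay(a, m) for a, m in retimed_paths)
print(f"t0 = {t0:.1f} ns, t1 = {t1:.1f} ns")
```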

- c) According to the  $t_{add}(V_{DD})$  graph, the original voltage is  $V_0 \approx 2.5$  V for a 0.5 ns adder delay. Expressing the delay in terms of the adder delay, we get  $t_1 = t_{add}(V_{DD}) + (1.2/0.5)t_{add}(V_{DD}) = 3.4t_{add}(V_{DD})$ . We can scale  $t_1$  up to 2.2 ns, yielding  $t_{add}(V_{DD}) = t_1/3.4 = 2.2/3.4$  ns  $\approx 0.65$  ns, which is obtained for  $V_1 \approx 1.5$  V. The relative power saving becomes  $1 - fCV_1^2/(fCV_0^2) = 1 - V_1^2/V_0^2 \approx 64\%$ .
- 6. Energy harvesting devices ordered from low to high power generation efficiency:
  - A. RF receiver in proximity to WiFi transmitter ~  $1 \,\mu\text{W/cm}^2$
  - D. Indoor solar cell ~  $10 \,\mu\text{W/cm}^2$
  - B. Thermoelectric generator attached to human ~  $50 \,\mu\text{W/cm}^2$
  - F. Solar cell in office ~  $100 \,\mu\text{W/cm}^2$
  - C. Thermoelectric generator attached to hot machine ~  $10 \text{ mW/cm}^2$
  - E. Outdoor solar cell ~  $10 \text{ mW/cm}^2$
- 7.
- a) 24x48-bit registers (1152 D flip-flops)
- b) Two datapaths where each path has 12x48-bit registers (1152 D flip-flops).



2x12 48-bit registers

- c) Register delay of the FIFOs for the same throughput:  $t_{new} = 2t_{old}$ 

Minimum supply voltage for the interleaved case:

$$\frac{t_{new}}{t_{old}} = \frac{V_{new}}{\left(V_{new} - V_T\right)^{1.5}} \frac{\left(V_{old} - V_T\right)^{1.5}}{V_{old}} = 2 \Longrightarrow V_{new} \approx 0.69 \text{ V}$$

Only half the original clock frequency is needed in the interleaved case, yielding the relative power saving

$$S_{rel} = 1 - \frac{\frac{f}{2}CV_{new}^2}{fCV_{old}^2} \approx 83\%$$
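The delay equation has no convenient closed-form solution, so it is solved numerically. The sketch below uses bisection and assumes  $V_{old} = 1.2$  V and  $V_T = 0.3$  V. These two values are not stated in this section; they are illustrative assumptions that happen to reproduce the quoted results.

```python
V_OLD = 1.2  # V, assumed original supply voltage (not stated in this section)
V_T = 0.3    # V, assumed threshold voltage (not stated in this section)
ALPHA = 1.5  # exponent from the delay model above

def delay_factor(v_dd, v_t=V_T, alpha=ALPHA):
    """Voltage-dependent part of the delay model: V / (V - V_T)^alpha."""
    return v_dd / (v_dd - v_t) ** alpha

def solve_v_new(ratio, v_old=V_OLD, v_t=V_T, tol=1e-6):
    """Bisection for delay_factor(v) = ratio * delay_factor(v_old) on (v_t, v_old)."""
    target = ratio * delay_factor(v_old, v_t)
    lo, hi = v_t + 1e-3, v_old
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if delay_factor(mid, v_t) > target:
            lo = mid  # circuit too slow at mid, so the solution lies higher
        else:
            hi = mid
    return 0.5 * (lo + hi)

v_new = solve_v_new(ratio=2.0)
# Half the clock frequency and the reduced supply voltage.
saving = 1.0 - 0.5 * (v_new / V_OLD) ** 2
print(f"V_new = {v_new:.2f} V, relative saving = {saving:.0%}")
```

Bisection is sufficient here because the delay factor decreases monotonically with the supply voltage for  $V_{DD} > V_T$ .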

8.

a) 
$$f_{\overline{c2}} = a_2\overline{b_2} + 0 \cdot (a_2 + \overline{b_2}) = a_2\overline{b_2}$$

$$f_{c2} = a_2\overline{b_2} + 1 \cdot (a_2 + \overline{b_2}) = a_2 + \overline{b_2}$$

$$ODC_{c2} = f_{\overline{c2}}f_{c2} + \overline{f_{\overline{c2}}}\,\overline{f_{c2}} = a_2\overline{b_2}(a_2 + \overline{b_2}) + \overline{a_2\overline{b_2}}\;\overline{(a_2 + \overline{b_2})} = a_2\overline{b_2} + \overline{a_2\overline{b_2} + (a_2 + \overline{b_2})} = a_2\overline{b_2} + \overline{a_2 + \overline{b_2}} = a_2\overline{b_2} + \overline{a_2}b_2$$
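The observability don't-care condition for  $c_2$  can be checked by enumeration: it holds for exactly those input combinations where the two cofactors agree. The sketch below assumes the comparator stage has the form  $f = a_2\overline{b_2} + c_2(a_2 + \overline{b_2})$ , which is inferred from the cofactors and not stated explicitly here.

```python
from itertools import product

def f(a2, b2, c2):
    """Assumed comparator stage: a2*~b2 + c2*(a2 + ~b2)."""
    nb2 = 1 - b2
    return (a2 & nb2) | (c2 & (a2 | nb2))

def odc_c2(a2, b2):
    """1 when the output does not depend on c2 (both cofactors agree)."""
    return int(f(a2, b2, 0) == f(a2, b2, 1))

table = {(a2, b2): odc_c2(a2, b2) for a2, b2 in product((0, 1), repeat=2)}
for (a2, b2), value in table.items():
    print(f"a2={a2} b2={b2} ODC={value}")
```

The enumeration gives 1 exactly when  $a_2 \neq b_2$ , i.e. the lower-order result  $c_2$  only matters when the two most significant bits are equal.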

b) A realization of the complete-disabling precomputed comparator is shown below

