Run-Time Power Gating in Hybrid ARM-FPGA Devices

Mohammad Hosseinabady and Jose Luis Nunez-Yanez
Department of Electrical and Electronic Engineering University of Bristol, UK.
Email: {m.hosseinabady, j.l.nunez-yanez}@bristol.ac.uk

Abstract—Energy proportional computing (EPC) enables the allocation of energy to tasks depending on computational demands. Computing at full speed and then dynamically turning off modules when they are not required for a period of time can be used to obtain EPC and it is an alternative to voltage scaling techniques in which the computation is slowed down. This paper investigates the viability of physical power gating FPGA devices that incorporate a hardened processor in a different power domain. The run-time power gating approach is applied to Xilinx ZYNQ devices that incorporate a hardened Cortex A9 multi-processor. The paper demonstrates that power down followed by a full reconfiguration can be controlled by the embedded processor autonomously. The results show that the minimum time that the FPGA fabric must remain in power-off state for the technique to be energy efficient is in the order of milliseconds and up to 96% power reduction occurs when the fabric voltage is lowered below critical level. These results take into account the overheads of controlling the programmable voltage regulators interfaced to the FPGA and the overhead of the reconfiguration needed when the device must be returned to the active state.

Keywords—Energy Proportional Computing, Power Gating, ZYNQ, FPGA

I. INTRODUCTION

Energy consumption is one of the main constraints in the new area of embedded and mobile multi/many-core heterogeneous systems [1]. Energy proportional computing (EPC) has emerged as a solution to restrict the energy to the exact amount required by a respective application [2]. The dynamic behaviour of EPC requires that the target system provides fine grained software and hardware reconfiguration features. Modern FPGAs (Field Programmable Gate Arrays) offer partial and dynamic reconfiguration and recent research has shown that they are also suitable for dynamic voltage and frequency scaling [3]. This makes them a suitable candidate to realise the underlying hardware required to implement an energy proportional computing platform.

One of the effective techniques in EPC is the power gating of unused modules [4][5] in which the modules that are not used for a certain amount of time are shut down and then turned on whenever they are required. This technique reduces the energy consumption caused by leakage and clock activity while the module is not required in a system.

This paper investigates a software-controlled power gating technique applied to the Zynq-7000 All Programmable SoC platform [6]. The Zynq-7000 consists of a dual-core ARM Cortex-A9 processor as the processing system (PS) and a Xilinx 7 series FPGA as the programmable logic (PL). The PL and PS are in two different power domains supplied by programmable voltage regulators. This enables the PS to control the PL power rails programmatically. In this investigation, run-time software-controlled PL power gating is used to reduce the energy consumption in an application containing idle modules. In this technique, when the hardware design in the PL is idle or is no longer required, the PL power can be reduced by clearing the PL configuration data and reducing its power supplies. To reuse the PL, its power supplies must be raised to nominal levels and it must be reconfigured. This technique serves two purposes: shutting down the PL to reduce power, and replacing one hardware module with another, which increases hardware utilization and delivers EPC. The experimental results show that, for a large design containing five MicroBlaze soft processor cores [7] in the PL, the module idle time should be longer than 42.58 msec to save energy. In this case, the idle module power is reduced by a factor of 30.43. For the other, smaller designs, the idle module power is reduced by a factor of between 2.56 and 17.05.

The rest of this paper is organized as follows. The next section reviews the previous work on power-gating in FPGAs. The motivations and contributions of this paper are explained in Section 3. Section 4 discusses the details of the proposed technique and its accuracy. Experimental results are presented in Section 5. Finally, Section 6 concludes the paper.

II. PREVIOUS WORK

Power gating, in which some parts of the design are shut down for a period, has been proposed as a low-power technique in the FPGA area, especially for reducing leakage power. Researchers have proposed techniques to implement power gating at different levels of design granularity, including transistor level, gate level, look-up table level, and module level.

The basic idea for power gating is adding a switch to the design in order to disconnect the power supply from the circuit. At the transistor level, a single transistor called sleep transistor can realise this switch as shown in Fig. 1. The SLEEP input in Fig. 1 controls the sleep transistor. When the sleep transistor is off the leakage power is limited to that of the sleep transistors which is negligible. Considering this sleep transistor, researchers have proposed different power gating techniques for FPGA architectures. Note that in the FPGA area, as well as the performance overhead caused by sleep transistors, adding these transistors requires modification to the underlying FPGA fabric. Therefore, researchers usually use simulation techniques along with synthetic applications to evaluate their proposed techniques. In contrast to the logic design techniques reviewed in this section, the technique proposed in this paper can be applied to commercially available FPGAs and controls the FPGA power lines directly by adjusting the voltage levels at the output of the programmable voltage regulators which provide power to the fabric.

Static control of the sleep transistor using FPGA configuration data is presented in [8]. This technique utilises a new design methodology that groups the design into clusters based on temporal locality and then turns clusters off and on through configuration data.

Dynamically controlled power gating is presented in [9], in which the SLEEP signals are connected to the general-purpose routing fabric in the FPGA and are controlled either by separate circuits or by circuits that are part of the FPGA itself. This technique can switch off a specific part of the design or a logic cluster in the FPGA. It is evaluated by HSPICE simulation using an application model. In contrast, the technique proposed in this paper is applied to a commercial FPGA and can be used by all designers.

Ishihara et al. [10] propose a look-up table level power gating technique in which the power of the look-up tables can be controlled with fine granularity by a sleep transistor. This technique relies on a new FPGA architecture and is not applicable to commercially available FPGAs.

FPGA manufacturers have used different static power gating techniques at the module level to shut down unused PLLs, DCMs, and I/Os during FPGA configuration. For example, as block RAMs are the source of 30% of the total leakage power in an FPGA [11], Xilinx provides static power gating for block RAMs in 28nm 7-series devices. Therefore, only the block RAMs used by a design cause leakage.

III. MOTIVATIONS AND CONTRIBUTIONS

Using a simple design, this section explains the motivations and contributions behind this research. Before delving into the details, the next subsection describes the underlying platform considered in this research.

A. Zynq-7000 platform description

The ZC702 [6] evaluation board is used to perform this research. The board utilises a Zynq-7000 SoC consisting of a dual-core ARM Cortex-A9 processor as the processing system (PS) and a Xilinx 7-series FPGA as the programmable logic (PL). The PS and PL power domains are completely independent [6]. Each PS and PL has six different power pins to provide power for their different parts. As this paper only focuses on PL power gating, the power lines for PL are mentioned here. Interested readers can refer to [6] for more details on these power pins. \(V_{\text{CCINT}}\), \(V_{\text{CCAUX}}\), \(V_{\text{CCO}}\), \(V_{\text{CCO\_\#}}\), \(V_{\text{CC\_BATT}}\), \(V_{\text{CCBRAM}}\), and \(V_{\text{CCAUX\_IO\_G\#}}\) are the PL power pins [6]. The two PL power lines that are considered in this paper are \(V_{\text{CCINT}}\) and \(V_{\text{CCAUX}}\). Whereas \(V_{\text{CCINT}}\) provides the power for the internal core logic, the \(V_{\text{CCAUX}}\) provides the power for auxiliary logic such as I/O buffer pre-drivers, along with Mixed-Mode Clock Managers (MMCMs) and PLLs.

B. Motivations

A simple motivation example is considered in this subsection. The example consists of a MicroBlaze soft processor core configured in the PL and the ARM processor available in the PS. The MicroBlaze runs two different programs separately. The first one just prints messages on the console and the second one performs a 32 × 32 floating point matrix multiplication. Table I shows the PL resources used by this design.

Fig. 1: Basic idea of power gating in logic design

TABLE I: PL resources used by the MicroBlaze example

<table>
<thead>
<tr>
<th>Slice LUTs</th>
<th>Slice Register</th>
<th>MMCM</th>
<th>DSP48E</th>
<th>RAM36E</th>
</tr>
</thead>
<tbody>
<tr>
<td>1725</td>
<td>1525</td>
<td>1</td>
<td>3</td>
<td>4</td>
</tr>
</tbody>
</table>

Fig. 2(a) shows the contribution of each part of the PL design to the consumed power, obtained by Xilinx XPower Analyzer [12] without applying any input patterns. As can be seen, the MMCM module is the most power intensive due to clock activity. Leakage is the next largest contributor. Logic and signals make a negligible contribution to the power consumption. Note that leakage power is consumed as long as the design exists in the PL, whether the design is in use or idle. A real measurement for the same design on the ZC702 board, depicted in Fig. 2(b), shows that \(V_{\text{CCINT}}\) and \(V_{\text{CCAUX}}\) are the two power rails that provide most of the power consumed by the PL. Since \(V_{\text{CCINT}}\) provides the voltage for logic circuits in the PL, it mainly represents the leakage power in this measurement, owing to the lack of input activity. In addition, as \(V_{\text{CCAUX}}\) provides the voltage for the MMCM, it represents the power consumption caused by the clock activity as well as the leakage power in the related logic.

Fig. 2: Power consumption

C. Contributions

The main contribution of this paper, compared to the previous work, is to investigate software-controlled run-time power gating on a commercially available Zynq-7000 device, in which the PS powers the PL off and on and reconfigures it.

D. Problem formulation

This subsection formulates the contributions of this research, in which the PL is shut down when its design is in idle mode and is powered off/on and reconfigured by the PS. Fig. 4 shows the PL power gating timeline considered in this research. This timeline shows three periods for turning off the PL and three periods for turning on the PL. These periods are explained below. As the PL loses its configuration after turning off, the state of the design should be saved so that it can be used after resumption. The time and power associated with this period are denoted by $t_{ss}$ and $P_{ss}$. To turn off the PL, the signalling between PS and PL should be terminated and the voltage levels of the corresponding power supplies should be reduced. The time and power associated with this period are denoted by $t_{tfpl}$ and $P_{tfpl}$, respectively. After the PL is turned off, there is no signalling between PS and PL, and the PL power consumption is almost zero. The time and power associated with this period are denoted by $t_{pltf}$ and $P_{pltf}$. To reuse the PL, its power supplies should be increased to the nominal values. The time and power associated with this period are denoted by $t_{tnpl}$ and $P_{tnpl}$, respectively. The next period (after turning on the PL) is the reconfiguration of the PL, and the associated time and power are denoted by $t_{reconf}$ and $P_{reconf}$. Finally, the last period restores the previous state, and the associated time and power are denoted by $t_{rs}$ and $P_{rs}$.

The total power consumption during these periods is

$$P_{powerGating} = P_{ss} + P_{tfpl} + P_{pltf} + P_{tnpl} + P_{reconf} + P_{rs} \quad (1)$$

For simplicity, a constant average power is considered for each period. With this assumption, the total energy during these periods is equal to

$$E_{powerGating} = t_{ss} \cdot P_{ss} + t_{tfpl} \cdot P_{tfpl} + t_{pltf} \cdot P_{pltf} + t_{tnpl} \cdot P_{tnpl} + t_{reconf} \cdot P_{reconf} + t_{rs} \cdot P_{rs} \quad (2)$$

To save energy using the PL turn-off technique, this energy should be less than the energy consumed by the PL in the idle state, i.e.,

$$E_{powerGating} < E_{plidle} \quad (3)$$

in which $E_{plidle}$ denotes the energy consumed by the PL when its design is idle and the power gating technique is not applied. This energy is equal to the product of $t_{plidle}$ and $P_{plidle}$, which denote the duration of the PL idle mode and the corresponding average power consumption, respectively.

By substituting the $E_{powerGating}$ and $E_{plidle}$ in Equ. 3 with the corresponding expressions, Equ. 4 can be obtained.

$$t_{ss} \cdot P_{ss} + t_{tfpl} \cdot P_{tfpl} + t_{pltf} \cdot P_{pltf} + t_{tnpl} \cdot P_{tnpl} + t_{reconf} \cdot P_{reconf} + t_{rs} \cdot P_{rs} < t_{plidle} \cdot P_{plidle} \quad (4)$$

In addition, to use the PL turn-off technique, the PL idle time should be greater than the timing overhead of the technique, therefore:

$$t_{plidle} > t_{ss} + t_{tfpl} + t_{tnpl} + t_{reconf} + t_{rs} \quad (5)$$

If we assume that $P_{pltf}$ is negligibly small, then, from Equ. 4, the PL idle time should satisfy Equ. 6 to save energy.

$$t_{plidle} > (t_{ss} \cdot P_{ss} + t_{tfpl} \cdot P_{tfpl} + t_{tnpl} \cdot P_{tnpl} + t_{reconf} \cdot P_{reconf} + t_{rs} \cdot P_{rs})/P_{plidle} \quad (6)$$

The minimum PL idle time for which PL power-off can save energy is called the turn-off-efficiency time in this paper, and it should satisfy both Equs. 5 and 6.
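The two bounds above can be combined numerically. The following Python fragment is a minimal sketch, assuming that the minimum idle time is simply the larger of the bounds in Equs. 5 and 6; the names `Period` and `min_idle_time` and all numeric values are hypothetical placeholders chosen for illustration and are not measurements from this work.

```python
from dataclasses import dataclass


@dataclass
class Period:
    """One overhead period of the timeline: duration (s) and average power (W)."""
    t: float
    p: float


def overhead_time(periods):
    """Right-hand side of Equ. 5: total duration of the overhead periods."""
    return sum(x.t for x in periods)


def overhead_energy(periods):
    """Numerator of Equ. 6: energy spent in the overhead periods."""
    return sum(x.t * x.p for x in periods)


def min_idle_time(periods, p_plidle):
    """Smallest idle time satisfying both Equ. 5 and Equ. 6 (P_pltf neglected)."""
    return max(overhead_time(periods), overhead_energy(periods) / p_plidle)


# HYPOTHETICAL values: save state, turn off, turn on, reconfigure, restore state.
periods = [
    Period(t=0.0, p=0.0),
    Period(t=5e-3, p=0.05),
    Period(t=5e-3, p=0.05),
    Period(t=30e-3, p=0.05),
    Period(t=0.0, p=0.0),
]

t_time_bound = min_idle_time(periods, p_plidle=0.5)     # Equ. 5 dominates
t_energy_bound = min_idle_time(periods, p_plidle=0.01)  # Equ. 6 dominates
print(f"{t_time_bound * 1e3:.1f} msec, {t_energy_bound * 1e3:.1f} msec")
```

With a high idle power the timing overhead (Equ. 5) sets the bound; only when the idle power is very low does the energy condition (Equ. 6) become the limiting one.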

IV. PL POWER GATING FRAMEWORK

This section explains the PL power gating framework and its accuracy.

A. Framework setup

A framework has been developed to implement the PL power gating technique and power measurement in the ZC702 evaluation board. Fig. 5 shows an overview of this framework. It consists of three components: PS, PL, and UCD9248 which is a digital PWM controller. The ZC702 board utilises three UCD9248 digital PWM controllers to control PL and PS voltages and monitor energy consumption in different parts of the system. This controller supports a wide range of PMBus [13] commands including voltage/current

Fig. 3: MicroBlaze power consumption. (a) MicroBlaze processor in PL printing messages on terminal. (b) MicroBlaze processor in PL running floating point matrix multiplication.

parameters include \(V_{\text{OUT}}\), \(I_{\text{OUT}}\), \(V_{\text{POWER}}\), \(V_{\text{CCINT}}\), \(V_{\text{CCAUX}}\), \(V_{\text{CCBRAM}}\), \(V_{\text{CCV3}}\), \(V_{\text{CCV5}}\), \(V_{\text{CCPINT}}\), \(V_{\text{CCPAUX}}\), \(V_{\text{CCADJ}}\), \(V_{\text{CCCVR}}\), \(V_{\text{CCUV}}\), and \(V_{\text{CCUV2}}\). These are used to adjust the voltage levels at the output of UCD9248. The details of these commands can be found in [14]. The second group of functions adjusts the voltage levels at the output of UCD9248, which are used to turn off and turn on the PL. These functions use the PMBus commands and the UCD9248 driver provided by Xilinx. Algorithm 1 contains the PMBus commands for reducing the \(V_{\text{CCINT}}\) power rail connected to the PL. The algorithm consists of three steps: pre-adjustment, adjustment, and post-adjustment. Lines 3-8 modify the lower voltage level protections, Line 10 sends the PMBus command to change the voltage level at the

Reconfiguration functions in the software part use the xdevcfg driver through the Processor Configuration Access Port (PCAP) interface, which is 32 bits wide, is clocked at 100 MHz, and provides 400 MB/s download via the non-secure PL configuration mechanism [6]. These functions use a Direct Memory Access (DMA) mechanism to transfer the configuration bitstream to the PL part at run-time.

B. Framework Accuracy

PL shut-down is an option provided by the manufacturer to save power [16]. According to the Zynq-7000 technical reference [16], the sequence for turning off the PL consists of three steps: stop using signals between PS and PL, disable the PS-PL level shifters, and shut down the PL. To get the greatest benefit from PL power gating, all PL power supplies (i.e., \(V_{\text{CCINT}}\), \(V_{\text{CCBRAM}}\), \(V_{\text{CCAUX}}\), \(V_{\text{CCAUX\_IO}}\), and \(V_{\text{CCO}}\)) should be turned off in the correct order. However, revision 1.0 of the ZC702 board has a deficiency: it uses \(V_{\text{CCAUX}}\) to supply power to a clock generator for the PS, so turning off \(V_{\text{CCAUX}}\) freezes the PS, which is not desired. Hence, instead of completely shutting down the PL, this research reduces the PL \(V_{\text{CCINT}}\) voltage to a level at which the PL loses its configuration and consumes significantly less energy. When \(V_{\text{CCINT}}\) goes below \(V_{\text{DRINT}}\) [17] (i.e., the data retention voltage), the PL configuration is lost. Fig. 6 shows how the PL power drawn from \(V_{\text{CCINT}}\) and \(V_{\text{CCAUX}}\) changes as the \(V_{\text{CCINT}}\) voltage is reduced for the motivation example. As expected, reducing \(V_{\text{CCINT}}\) reduces the corresponding power consumption. Fig. 6 also shows that at a \(V_{\text{CCINT}}\) voltage of around 0.55 V the PL loses its configuration, so that \(V_{\text{CCAUX}}\) no longer delivers power to the circuits, especially to the MMCMs. This results in a sharp drop in \(V_{\text{CCAUX}}\) power. The voltage level of 0.4 V is used as the critical level to turn off the PL. At this voltage level the ZC702 board shows stable behaviour, such that the PL can be turned on again without instability in other parts of the system.

Fig. 7 shows the time it takes to reduce the $V_{CCINT}$ voltage to 0.4 V. As can be seen, this reduction takes $2.73\text{msec}$. As shown in Algorithm 1, adjusting the $V_{CCINT}$ voltage consists of three steps: pre-adjustment, adjustment, and post-adjustment. The timing delay for changing $V_{CCINT}$ comprises the pre-adjustment and adjustment execution times as well as the time required to change the voltage from 1 V to 0.4 V. This delay can be considered as the timing overhead for turning off the PL.
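To make the adjustment step more concrete, the fragment below sketches how a target rail voltage could be encoded for a PMBus output-voltage write. It is only an illustration under stated assumptions: the command code 0x21 (VOUT_COMMAND) and the LINEAR16 format come from the PMBus specification [13], the exponent of -12 is an assumption about the regulator configuration, and the helper names are hypothetical. It is not the content of Algorithm 1 and does not use the Xilinx driver.

```python
# PMBus command code for setting the output voltage (PMBus specification).
VOUT_COMMAND = 0x21

# ASSUMPTION: the controller uses the LINEAR16 format with exponent -12.
ASSUMED_VOUT_EXPONENT = -12


def encode_linear16(volts, exponent=ASSUMED_VOUT_EXPONENT):
    """Return the unsigned 16-bit mantissa that represents `volts`."""
    mantissa = round(volts / 2.0 ** exponent)
    if not 0 <= mantissa <= 0xFFFF:
        raise ValueError("voltage out of range for LINEAR16")
    return mantissa


def decode_linear16(mantissa, exponent=ASSUMED_VOUT_EXPONENT):
    """Convert a LINEAR16 mantissa back to volts."""
    return mantissa * 2.0 ** exponent


def vout_command_payload(volts):
    """Bytes of a write-word transaction: command code, then low byte first."""
    mantissa = encode_linear16(volts)
    return bytes([VOUT_COMMAND, mantissa & 0xFF, mantissa >> 8])


payload_off = vout_command_payload(0.4)  # target level used to turn off the PL
payload_on = vout_command_payload(1.0)   # nominal V_CCINT level
print(payload_off.hex(), payload_on.hex())
```

Under these assumptions the voltage resolution is 2^-12 V, so the 0.4 V target is represented to within a fraction of a millivolt.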

$$T_{tfpl} = T_{pre-adjustment} + T_{adjustment} + T_{V_{CCINT}(1\to0.4)} \quad (7)$$

These times, shown in Equs. 8-10, are measured for the motivation example. Note that these times do not depend on the design in the PL, because they are determined by the UCD9248.

$$T_{pre-adjustment} = 1.9\text{msec} \quad (8)$$
$$T_{adjustment} = 205\mu\text{sec} \quad (9)$$
$$T_{V_{CCINT}(1\to0.4)} = 2.73\text{msec} \quad (10)$$
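
As a worked check, substituting the measured values of Equs. 8-10 into Equ. 7 gives the turn-off timing overhead, which agrees after rounding with the turn-off entry reported in Table III:

$$T_{tfpl} = 1.9\text{msec} + 0.205\text{msec} + 2.73\text{msec} = 4.835\text{msec} \approx 4.84\text{msec}$$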

An algorithm similar to Algorithm 1 can be used to turn-on the PL. Therefore,

$$T_{tnpl} = T_{pre-adjustment} + T_{adjustment} + T_{V_{CCINT}(0.4\to1)} \quad (11)$$

After turning on the PL, the PS uses PCAP to reconfigure the PL. Before reconfiguration, PS consumes the power corresponding to the running OS and software applications. During reconfiguration, which lasts about $34.33\text{msec}$, the PS power consumption increases by an average of 20mW. The configuration time consists of PL initialization and DMA transfer delay.
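
As a rough worked estimate, under the constant-average-power assumption of Section III-D, the additional PS energy spent during reconfiguration follows from the two figures above. The symbol $\Delta E_{PS}$ is introduced here only for illustration:

$$\Delta E_{PS} \approx 20\text{mW} \times 34.33\text{msec} \approx 0.69\text{mJ}$$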

V. EXPERIMENTAL RESULTS

This section presents experimental results obtained by applying the proposed technique to four different designs in the PL. Table II shows statistics of the FPGA resources used by these designs. The smallest design is an AXI-timer, which is a simple 32-bit counter connected to the PS via an AXI bus interface. The second design is a motion estimation processor [18]. The third design consists of a MicroBlaze in the PL. Finally, the last and largest design contains five MicroBlazes. Note that we have assumed that these four designs do not need to keep their state during PL turn-off, so the state saving/recovery timing overhead is zero.

Table III shows the timing overhead for turning off, turning on, and reconfiguring the PL. As mentioned in Subsection IV-B, the turning off/on timing is determined by the UCD9248 and is independent of the design in the PL. In addition, the PL reconfiguration time depends on the bitstream file size. Since the size of the full bitstream file is constant for a given FPGA, and the PL configuration file used by PCAP for the Zynq 7020 device used in this research is 4,045,564 bytes, the configuration time is about $34.33\text{msec}$. This configuration time, which consists of the PL initialization timing overhead and the bitstream transfer delay, is measured between the PL initialization call and the time when the FPGA DONE signal is asserted [6]. Note that the PCAP theoretical throughput is 400 MB/s, resulting in a delay of about $10\text{msec}$ for transferring 4,045,564 bytes. Because of the configuration software overhead, the achieved throughput is lower than this maximum value. Kohn [19] has reported $32\text{msec}$ and $44\text{msec}$ for the DevC DMA transfer delay alone in standalone and Linux environments, respectively. Vipin and Fahmy [20] have proposed a partial reconfiguration management technique called ZyCAP that reaches 382 MB/s throughput for PL configuration. Another partial reconfiguration management technique, proposed in [21], reaches 385 MB/s throughput for PL partial reconfiguration. However, these works use the PL itself to achieve this throughput, which makes them suitable for partial reconfiguration but not for complete PL configuration.
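The throughput figures quoted above can be reproduced with a few lines of arithmetic. The Python fragment below is a minimal sketch that uses only numbers stated in the text and assumes 1 MB = 10^6 bytes; the variable names are illustrative. Because the measured 34.33 msec also includes PL initialization, the effective throughput it yields is a lower bound on the DMA transfer rate.

```python
BUS_WIDTH_BITS = 32          # PCAP interface width
PCAP_CLOCK_HZ = 100e6        # PCAP clock frequency
BITSTREAM_BYTES = 4_045_564  # full PL configuration file for the Zynq 7020
MEASURED_CONFIG_S = 34.33e-3 # measured configuration time (init + transfer)

# Theoretical PCAP throughput in bytes per second.
theoretical_bps = BUS_WIDTH_BITS / 8 * PCAP_CLOCK_HZ

# Transfer time if the theoretical throughput were sustained.
ideal_transfer_s = BITSTREAM_BYTES / theoretical_bps

# Throughput actually achieved over the measured configuration time.
effective_bps = BITSTREAM_BYTES / MEASURED_CONFIG_S

print(f"theoretical throughput: {theoretical_bps / 1e6:.0f} MB/s")
print(f"ideal transfer time:    {ideal_transfer_s * 1e3:.2f} msec")
print(f"effective throughput:   {effective_bps / 1e6:.1f} MB/s")
```

The gap between the ideal transfer time and the measured configuration time illustrates how much of the reconfiguration overhead is attributable to initialization and software rather than to the raw PCAP bandwidth.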

### TABLE III: Timing overheads in msec

<table>
<thead>
<tr>
<th>Period</th>
<th>Time (msec)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Turn-off PL</td>
<td>4.84</td>
</tr>
<tr>
<td>Turn-on PL</td>
<td>4.84</td>
</tr>
<tr>
<td>Reconfiguration PL</td>
<td>34.33</td>
</tr>
</tbody>
</table>

By reducing $V_{CCINT}$ to 0.4 V, the PL loses its configuration and the PL power consumption reduces to 20.7 mW. Table IV shows the PL idle power consumption, the PL turn-off power consumption, and the power reduction for the different design examples. As expected, the power reduction of the PL shut-down is high. However, to make this technique efficient in an EPC scenario, the PL shut-down period should be longer than a minimum value, which is called the turn-off-efficiency time as mentioned in Subsection III-D. Table V shows the lower-bound times determined by Equs. 5 and 6 for the four designs. As can be seen, the turn-off-efficiency times are determined by the timing overhead of the proposed technique and are $42.58\text{msec}$. Table VI shows the percentage of the energy reduction during the idle time considering the turn-off-efficiency time. Note that the proposed technique achieves a much larger energy reduction if the PL idle time is much longer than the turn-off-efficiency time.

### TABLE IV: Power consumption comparison

<table>
<thead>
<tr>
<th></th>
<th>AXI-Timer</th>
<th>Motion Estimation</th>
<th>One-Microblaze</th>
<th>Five-Microblaze</th>
</tr>
</thead>
<tbody>
<tr>
<td>Power of the PL in idle mode (W)</td>
<td>0.053</td>
<td>0.15</td>
<td>0.353</td>
<td>0.63</td>
</tr>
<tr>
<td>Power of the PL shut down (W)</td>
<td>0.0207</td>
<td>0.0207</td>
<td>0.0207</td>
<td>0.0207</td>
</tr>
<tr>
<td>Power reduction</td>
<td>60.9%</td>
<td>86.2%</td>
<td>94.1%</td>
<td>96.7%</td>
</tr>
</tbody>
</table>
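
The power-reduction row of Table IV follows directly from the two measured rows above it. The Python fragment below is a small sketch of that arithmetic; the names `P_IDLE_W`, `reduction`, and `factor` are illustrative only.

```python
P_SHUTDOWN_W = 0.0207  # PL power with V_CCINT lowered to 0.4 V (Table IV)

P_IDLE_W = {           # PL idle power per design (Table IV)
    "AXI-Timer": 0.053,
    "Motion Estimation": 0.15,
    "One-Microblaze": 0.353,
    "Five-Microblaze": 0.63,
}


def reduction(p_idle, p_off=P_SHUTDOWN_W):
    """Fraction of the idle power removed by shutting the PL down."""
    return 1.0 - p_off / p_idle


def factor(p_idle, p_off=P_SHUTDOWN_W):
    """Ratio of idle power to shut-down power."""
    return p_idle / p_off


for name, p_idle in P_IDLE_W.items():
    print(f"{name}: {100 * reduction(p_idle):.1f}% reduction, "
          f"factor {factor(p_idle):.2f}")
```

The same ratios give the reduction factors quoted in the introduction, for example 30.43 for the five-MicroBlaze design.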

### TABLE V: Turn-off-efficiency time in msec

<table>
<thead>
<tr>
<th></th>
<th>AXI-Timer</th>
<th>Motion Estimation</th>
<th>One-Microblaze</th>
<th>Five-Microblaze</th>
</tr>
</thead>
<tbody>
<tr>
<td>Lower level time by Eqn. 6</td>
<td>18.6</td>
<td>6.5</td>
<td>2.7</td>
<td>1.5</td>
</tr>
<tr>
<td>Turn-off-efficiency time</td>
<td>42.58</td>
<td>42.58</td>
<td>42.58</td>
<td>42.58</td>
</tr>
</tbody>
</table>

### TABLE VI: Energy reduction in %

<table>
<thead>
<tr>
<th></th>
<th>AXI-Timer</th>
<th>Motion Estimation</th>
<th>One-Microblaze</th>
<th>Five-Microblaze</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>56.4</td>
<td>84.92</td>
<td>93.45</td>
<td>96.33</td>
</tr>
</tbody>
</table>

As it can be seen, the FPGA reconfiguration time is the dominant component in the turn-off-efficiency time parameter. Therefore, as a rule of thumb, this reconfiguration time can be considered as the timing overhead of PL turn-off for the ZYNQ platform. In addition, the efficiency of this technique depends on the power consumption of the PL during its idle period. Note that, this power can also be reduced by other techniques such as voltage/frequency scaling or clock gating.

VI. CONCLUSIONS AND FUTURE WORK

This paper has demonstrated a run-time software-controlled power gating technique applied to the Xilinx ZYNQ embedded system platform to reduce FPGA power consumption when idle states can be identified during system operation. The Xilinx ZYNQ platform consists of two parts: a dual-core ARM processor and an FPGA. In the proposed technique, the ARM processor controls the FPGA power supplies via the programmable voltage regulator available on the ZC702 evaluation board. The ARM processor also reconfigures the FPGA after turning on the corresponding power supplies. The experimental results show that the overhead of this technique is in the order of milliseconds, which makes it suitable for use in energy proportional computing techniques. As the PL configuration overhead is the main bottleneck in the proposed technique, one of the future goals of this research is to propose new architectures and strategies to reduce this overhead. In addition, future work will compare the proposed technique with the voltage scaling technique proposed in [3].

### ACKNOWLEDGEMENT

The authors would like to thank the reviewers for their valuable comments and especially our colleague Dr. Arash Furfadhi, whose suggestions helped us to do this research. This research is a part of the ENPOWER project sponsored by EPSRC and done in collaboration with Queen’s University Belfast.

### REFERENCES


