# Embedded System Application 4190.303C 2010 Spring Semester

# **Memory Power Management**

Naehyuck Chang Dept. of EECS/CSE Seoul National University naehyuck@snu.ac.kr



### Outline

- Concept of high-level power management
- SRAM power management
- SDRAM power states
- SDRAM mode control
- High-level memory power management





- Low-level energy optimization
  - Has contributed for decades
  - Enhancement of devices and components
  - General solutions applicable to almost all kinds of use
  - City bus service example
    - Objective: more gas mileage
    - New buses, engine swap, aluminum bodies, new transmissions, etc.
  - In the semiconductor world

    - MTCMOS

Gas-efficient engine

Light-weight bus









#### System-software-level energy optimization

- City bus service example
  - Optimal speed, engine rpm, shift position scheduling w/ original hardware
    - Analysis of a target route
    - Use of component characteristics



System-level approaches give us a bigger chance to minimize energy consumption!





- Level of abstraction: engine idle gas consumption
  - Model 1: linear gas consumption per speed: g = mv
  - Model 2: counting idle gas consumption when v=0:
    - g = mv + I
  - Model 3: counting engine restarting cost

- Applicable gas saving techniques when a vehicle is temporarily parked
  - **Technique 1:** Linear gas consumption model
    - No policy while the vehicle is stopped
  - **Technique 2:** Idle gas consumption
    - Stop the engine whenever the vehicle is stopped
  - **Technique 3:** Restarting cost
    - Stop the engine only when the stop lasts longer than a threshold, for instance 2 minutes

Proper energy characterization is a primary concern of any quality high-level power-saving approach.
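The three models and the restart-cost rule can be written as a minimal Python sketch. The function names, the units, and the numbers (`IDLE_RATE`, `RESTART_COST`) are assumptions made for illustration; they are chosen only so that the threshold lands at the 2 minutes used as an example above.

```python
def gas_model_1(m, v):
    """Model 1: gas use is linear in speed, g = m * v."""
    return m * v


def gas_model_2(m, v, idle_gas):
    """Model 2: adds the idle term I, g = m * v + I."""
    return m * v + idle_gas


def should_stop_engine(stop_s, idle_rate, restart_cost):
    """Model 3 / Technique 3: stop only if idling would burn more than a restart.

    stop_s       stop duration in seconds
    idle_rate    gas burned per second while idling (hypothetical units)
    restart_cost gas burned by one engine restart (same units)
    """
    return stop_s * idle_rate > restart_cost


# Hypothetical numbers: 60 mL per restart / 0.5 mL per second = 120 s threshold.
IDLE_RATE = 0.5       # mL of gas per second at idle
RESTART_COST = 60.0   # mL of gas per restart

print(should_stop_engine(30, IDLE_RATE, RESTART_COST))    # short stop
print(should_stop_engine(300, IDLE_RATE, RESTART_COST))   # long stop
```

With Model 1 there is nothing to decide at v = 0, with Model 2 stopping the engine always looks attractive, and only Model 3 exposes the threshold. This is why the level of characterization determines which policy can be found.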





- Components with multiple internal states
  - Each state has different functionality
  - Each state consumes different amount of power
  - Generally power consumption corresponds to service levels
- Conventionally abstracted as power state machines
  - State diagram with
    - Power and service annotation on states
    - Power and delay annotation on edges
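A power state machine of this kind maps naturally onto a small data structure. The sketch below assumes a hypothetical two-state component; the state names and every number are placeholders, not values from any data sheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    power_w: float   # power drawn while the transition is in progress
    delay_s: float   # time the transition takes


# Hypothetical component: power annotation on states ...
STATE_POWER_W = {"active": 1.0, "sleep": 0.1}
# ... and power plus delay annotation on edges.
TRANSITIONS = {
    ("active", "sleep"): Transition(power_w=0.5, delay_s=0.2),
    ("sleep", "active"): Transition(power_w=2.0, delay_s=0.5),
}


def schedule_energy(schedule):
    """Energy in joules of a schedule given as (state, dwell_time_s) steps."""
    energy = 0.0
    previous = None
    for state, dwell_s in schedule:
        if previous is not None and state != previous:
            edge = TRANSITIONS[(previous, state)]
            energy += edge.power_w * edge.delay_s   # cost of the edge
        energy += STATE_POWER_W[state] * dwell_s    # cost of staying in the state
        previous = state
    return energy


print(schedule_energy([("active", 1.0), ("sleep", 10.0), ("active", 1.0)]))
```

Keeping the edge annotations separate from the state annotations makes the transition overhead visible, which is the quantity that decides whether a low-power state is worth entering.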





- Dynamic power management (DPM)
  - Reduce power according to workloads
  - Shutdown only during long idle time






- Challenge
  - Predicting the future
- Break-even time: $T_{be}$
  - Shortest idle period for which powering down saves energy



An idle period shorter than $T_{be}$ is useless for energy saving.
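The slides do not give a formula for $T_{be}$, so the following sketch assumes a common energy-balance model: staying on for an idle period $T$ costs $P_{on} T$, while powering down costs the transition energy plus the sleep power for the remaining time. All parameter names and numbers are illustrative assumptions.

```python
def break_even_time(p_on, p_sleep, e_tr, t_tr):
    """Shortest idle period for which powering down does not lose energy.

    p_on    power while staying on during the idle period (W)
    p_sleep power in the low-power state (W)
    e_tr    total energy of the sleep and wake-up transitions (J)
    t_tr    total time of both transitions (s)
    """
    # Energy balance: p_on * T = e_tr + p_sleep * (T - t_tr)
    balance = (e_tr - p_sleep * t_tr) / (p_on - p_sleep)
    # The idle period must at least cover the transitions themselves.
    return max(t_tr, balance)


def energy_saved(idle_s, p_on, p_sleep, e_tr, t_tr):
    """Energy saved (J) by powering down for an idle period idle_s >= t_tr."""
    return p_on * idle_s - (e_tr + p_sleep * (idle_s - t_tr))


# Hypothetical component parameters.
PARAMS = dict(p_on=1.0, p_sleep=0.1, e_tr=1.1, t_tr=0.7)

t_be = break_even_time(**PARAMS)
print(f"T_be = {t_be:.3f} s")
print(energy_saved(1.0, **PARAMS) < 0)   # idle period shorter than T_be
print(energy_saved(5.0, **PARAMS) > 0)   # idle period longer than T_be
```

Under this model the saving is exactly zero at $T_{be}$, negative below it, and grows linearly above it.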










- When to use power management?
  - When $T_{be} < T_{idle}^{avg}$
    - Average idle periods are long enough
    - Transition delay is short enough
    - Transition power is low enough
    - Sleep power is low enough
  - When designing system for a known workload
    - Criteria for component specification and design











- Low-power mode
  - Very low data retention current
  - Disable chip select and lower VDD
  - Recovery overhead exists

| Item                       | Symbol | Test Condition                                                                 | Part          | Min | Typ | Max | Unit |
|----------------------------|--------|--------------------------------------------------------------------------------|---------------|-----|-----|-----|------|
| Vcc for data retention     | VDR    | CS1 ≥ Vcc-0.2V                                                                 |               | 2.0 | -   |     | V    |
| Data retention current     | IDR    | Vcc=3.0V, CS1 ≥ Vcc-0.2V, CS2 ≥ Vcc-0.2V or CS2 ≤ 0.2V, other inputs = 0~Vcc | KM681000BL    | -   | 1   | 50  | μA   |
|                            |        |                                                                                | KM681000BL-L  | -   | 0.5 | 10  | μA   |
|                            |        |                                                                                | KM681000BLE   | -   | -   | 50  | μA   |
|                            |        |                                                                                | KM681000BLE-L | -   | -   | 25  | μA   |
|                            |        |                                                                                | KM681000BLI   | -   | -   | 50  | μA   |
|                            |        |                                                                                | KM681000BLI-L | -   | -   | 25  | μA   |
| Data retention set-up time | tSDR   | See data retention waveform                                                    |               | 0   | -   |     | ms   |
| Recovery time              | tRDR   | See data retention waveform                                                    |               | 5   | -   | -   | ms   |





How to put the SRAM into data retention mode







- Battery backup
  - No true non-volatile, high-performance memory device exists
    - Non-volatile
    - Balanced read and write performance
    - Battery-backed SRAM is an alternative solution
- Low-cost battery backup circuit
  - No battery charging sequence
  - Diode loss
  - SRAM operating VDD is lower than other device VDD






- Non-rechargeable battery
  - Lithium battery

    - Very low data retention current
    - Dallas Semiconductor DS1259









- Rechargeable battery
  - Precision battery charge sequence
  - No diode loss

#### **Battery Backup System Manager**







SDRAM power calculation

#### **Power Calculation**

Total Power = Core Power + I/O Power = (IDD4 × VDD) + (C × f/2 × VDDQ<sup>2</sup> × number of I/Os / 2)

Mobile SDRAM: P = (90 mA × 2.5 V) + (10 pF × $\frac{100 \text{ MHz}}{2}$ × (1.8 V)<sup>2</sup> × 16 / 2) = 238 mW

Standard SDRAM: P = (150 mA × 3.3 V) + (10 pF × $\frac{100 \text{ MHz}}{2}$ × (3.3 V)<sup>2</sup> × 16 / 2) = 538 mW
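The two worked figures can be checked with a few lines of Python. This is a sketch of the formula as stated above (core term IDD4 × VDD plus the I/O switching term); the function and parameter names are assumptions made for illustration.

```python
def sdram_power_mw(idd4_ma, vdd_v, c_pf, f_mhz, vddq_v, n_io):
    """Total power in mW: core term plus I/O switching term."""
    core_mw = idd4_ma * vdd_v                      # mA * V = mW
    # C * f/2 * VDDQ^2 * (number of I/Os / 2), computed in watts
    io_w = (c_pf * 1e-12) * (f_mhz * 1e6 / 2) * vddq_v ** 2 * n_io / 2
    return core_mw + io_w * 1e3


mobile = sdram_power_mw(idd4_ma=90, vdd_v=2.5, c_pf=10, f_mhz=100, vddq_v=1.8, n_io=16)
standard = sdram_power_mw(idd4_ma=150, vdd_v=3.3, c_pf=10, f_mhz=100, vddq_v=3.3, n_io=16)
print(f"mobile:   {mobile:.2f} mW")
print(f"standard: {standard:.2f} mW")
```

The results are about 238 mW and 538.6 mW, which agree with the figures above up to rounding. Most of the difference comes from the core term (225 mW versus 495 mW), while the lower VDDQ cuts the I/O term from roughly 44 mW to roughly 13 mW.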





Power supply current (IDD) specification by the power states

| Parameter/Condition | Symbol | SDR<sup>1,3</sup> (max) | 128Mb Mobile SDRAM<sup>2,4</sup> (max) | Units |
|---------------------|--------|-------------------------|----------------------------------------|-------|
| Operating Current: Active Mode; Burst = 2; READ or WRITE; <sup>t</sup>RC = <sup>t</sup>RC …; CAS latency = 3 | IDD1 | 120 | 150 | mA |
| Standby Current: Power-Down Mode; CKE = LOW; All banks idle | IDD2 | 2 | 0.350 | mA |
| Standby Current: Active Mode; CS# …; CKE = HIGH; All banks active after <sup>t</sup>…; No accesses in progress | IDD3 | 50 | 35 | mA |
| Operating Current: Burst Mode; Co…; READ or WRITE; All banks active, C/… | IDD4 | 150 | 90 | mA |
| Auto Refresh Current: <sup>t</sup>RFC = <sup>t</sup>RFC (MIN); CAS latency = 3; CKE, CS# = HIGH | IDD5 | 310 | 210 | mA |
| Self Refresh Current: CKE ≤ 0.2V | IDD7 | 2 | 0.100 to 0.355 | mA |

#### **IDD Specifications and Conditions**

NOTE:

1. Part Number MT48LC8M16A2TG-75 @ 3.3V
2. Part Number MT48V8M16LFFC-8 @ 2.5V
3. VDD, VDDQ = +3.3V for x16 SDRAM
4. VDD = +2.5V, VDDQ = 2.5V/1.8V
5. Using TCSR from 15°C to 70°C



Detailed power states









- Low-level power management
  - The amount of current consumed is directly proportional to the self refresh rate









- Temperature compensated self refresh (TCSR)
  - In self refresh operation, power can be saved if the internal self refresh intervals can be adjusted for the ambient temperature

  - Higher temperatures cause SDRAM cells to lose charge faster, so the refresh interval can be lengthened at low temperature







- Partial array self refresh (PASR)
  - In self refresh operation, the refresh operation can be limited to the portion of the memory's array where data will be stored
- Deep power-down (DPD)
  - In some applications, actual data retention in the DRAM is not required most of the time
    - DRAM can incorporate DPD to turn off most or all of the on-board array voltage generators
- How effective are they?

#### **Typical Use Profile**

| Operation              | Duty Cycle<br>(Percentage of Clock Cycle) |
|------------------------|-------------------------------------------|
| Power Management Modes |                                           |
| Deep Power-Down (DPD)  | 50%                                       |
| Self Refresh (PASR)    | 30%                                       |
| Standard SDRAM Modes   | 20%                                       |





- SDRAM power down
  - In DDR2 SDRAM devices, power-down occurs when CKE is registered LOW with a DESELECT or NOP command





#### Precharge power down

Entered if the command issued at power-down entry is a PRECHARGE (or if the device is already in the idle state)

| Parameter/Condition | Symbol | Configuration |    |    |    | Units |
|---------------------|--------|---------------|----|----|----|-------|
| Precharge power-down current: All banks idle; <sup>t</sup>CK = <sup>t</sup>CK (I<sub>DD</sub>); CKE is LOW; Other control and address bus inputs are stable; Data bus inputs are floating | I<sub>DD2P</sub> | x4, x8, x16 | 5  | 5  | 5  | mA |
| Precharge quiet standby current: All banks idle; <sup>t</sup>CK = <sup>t</sup>CK (I<sub>DD</sub>); CKE is HIGH, CS# is HIGH; Other control and address bus inputs are stable; Data bus inputs are floating | I<sub>DD2Q</sub> | x4, x8 | 40 | 35 | 25 | mA |
|                     |        | x16           | 50 | 35 | 25 | mA |

#### Active power down

Entered if the command issued at power-down entry is an ACTIVE (or if at least one row is already active)

| Parameter/Condition | Symbol | Configuration |    |    |    | Units |
|---------------------|--------|---------------|----|----|----|-------|
| Active power-down current: All banks open; <sup>t</sup>CK = <sup>t</sup>CK (I<sub>DD</sub>); CKE is LOW; Other control and address bus inputs are stable; Data bus inputs are floating | I<sub>DD3Pf</sub> | Fast PDN exit, MR12 = 0 | 30 | 25 | 20 | mA |
|                     | I<sub>DD3Ps</sub> | Slow PDN exit, MR12 = 1 | 6  | 6  | 6  | mA |
| Active standby current: All banks open; <sup>t</sup>CK = <sup>t</sup>CK (I<sub>DD</sub>), <sup>t</sup>RAS = <sup>t</sup>RAS MAX (I<sub>DD</sub>), <sup>t</sup>RP = <sup>t</sup>RP (I<sub>DD</sub>); CKE is HIGH, CS# is HIGH between valid commands; Other control and address bus inputs are switching; Data bus inputs are switching | I<sub>DD3N</sub> | x4, x8 | 50 | 40 | 30 | mA |
|                     |        | x16           | 55 | 40 | 30 | mA |





- General Fast exit
- Slow exit

AC characteristics (all listed values are minimums; no maximum is specified)

| Parameter | Symbol | -187E | -25E | -25 | -3E | -3 | -37E | -5E | Units | Notes |
|-----------|--------|-------|------|-----|-----|----|------|-----|-------|-------|
| Exit active power-down to READ command, MR12 = 0 | <sup>t</sup>XARD | 3 | 2 | 2 | 2 | 2 | 2 | 2 | <sup>t</sup>CK | 18 |
| Exit active power-down to READ command, MR12 = 1 |  | 10 - AL | 8 - AL | 8 - AL | 7 - AL | 7 - AL | 6 - AL | 6 - AL | <sup>t</sup>CK | 18 |
| Exit precharge power-down to nonREAD command | <sup>t</sup>XP | 3 | 2 | 2 | 2 | 2 | 2 | 2 | <sup>t</sup>CK | 18 |
| CKE MIN HIGH/LOW time | <sup>t</sup>CKE | MIN = … |  |  |  |  |  |  | <sup>t</sup>CK | 18, 44 |

| Parameter/Condition | Symbol | Configuration |    |    |    | Units |
|---------------------|--------|---------------|----|----|----|-------|
| Active power-down current: All banks open; <sup>t</sup>CK = <sup>t</sup>CK (I<sub>DD</sub>); CKE is LOW; Other control and address bus inputs are stable; Data bus inputs are floating | I<sub>DD3Pf</sub> | Fast PDN exit, MR12 = 0 | 30 | 25 | 20 | mA |
|                     | I<sub>DD3Ps</sub> | Slow PDN exit, MR12 = 1 | 6  | 6  | 6  | mA |





#### Implementation

#### **Extended Mode Register**

| BA1 | BA0 | A12-A5   | A4-A3            | A2-A0            |
|-----|-----|----------|------------------|------------------|
|     | 0   | Reserved | TCSR<sup>1</sup> | PASR<sup>2</sup> |

| A4 | A3 | Max. Case Temp. |
|----|----|-----------------|
| 0  | 0  | 70°C            |
| 0  | 1  | 45°C            |
| 1  | 0  | 15°C            |
| 1  | 1  | 85°C            |

| A2 | A1 | A0 | Self Refresh Coverage   |
|----|----|----|-------------------------|
| 0  | 0  | 0  | All Banks               |
| 0  | 0  | 1  | Banks 0 and 1 (BA1 = 0) |
| 0  | 1  | 0  | Bank 0 (BA1 = BA0 = 0)  |
| 0  | 1  | 1  | Reserved                |
| 1  | 0  | 0  | Reserved                |
| 1  | 0  | 1  | Lower Ha…               |
| 1  | 1  | 0  | Lower Ha…               |

- NOTE: 1. Temperature Compensated Self Refresh
  - 2. Partial Array Self Refresh
  - 3. Row Address 2 MSB = 0
  - 4. Row Address MSB = 0
  - 5. Available on future devices. Contact factory.
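As a sketch of how a memory controller driver might compose the low address bits of the extended mode register, the snippet below packs the TCSR field into A4:A3 and the PASR field into A2:A0. The function and dictionary names are hypothetical, only the clearly legible table entries are included, and the PASR codes assume that the table rows count up in binary from 000.

```python
# Max. case temperature (deg C) -> TCSR code on A4:A3
TCSR_BITS = {70: 0b00, 45: 0b01, 15: 0b10, 85: 0b11}

# Self refresh coverage -> PASR code on A2:A0 (assumed binary row order)
PASR_BITS = {"all_banks": 0b000, "banks_0_1": 0b001, "bank_0": 0b010}


def emr_low_address_bits(max_case_temp_c, coverage):
    """Return the A4..A0 value for an extended mode register write."""
    return (TCSR_BITS[max_case_temp_c] << 3) | PASR_BITS[coverage]


# Cool environment, refresh only bank 0.
print(format(emr_low_address_bits(45, "bank_0"), "05b"))
```

A4:A0 is only part of the register write; the bank-address bits that select the extended mode register and the reserved bits A12-A5 must be driven as the device data sheet specifies.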







### **High-Level Memory Power Management**

- SDRAM power-down is an effective power management technique
  - As long as the power-down decisions are appropriate
  - Break-even time should be carefully considered
  - Misprediction results in negative energy gain
- Other approaches for high-level memory system power management
  - Bus encoding
  - SDRAM mode control







#### **Commercial Data Sheets**

Power values are described for a particular operating condition

| Parameter/Condition | Symbol | Configuration | -3 | -37E | -5E | Units |
|---------------------|--------|---------------|----|------|-----|-------|
| Operating one bank active-precharge current: <sup>t</sup>CK = <sup>t</sup>CK (I<sub>DD</sub>), <sup>t</sup>RC = <sup>t</sup>RC (I<sub>DD</sub>), <sup>t</sup>RAS = <sup>t</sup>RAS MIN (I<sub>DD</sub>); CKE is HIGH, CS# is HIGH between valid commands; Address bus inputs are switching; Data bus inputs are switching | I<sub>DD0</sub> | x4, x8 | 90 | 80 | 75 | mA |
|                     |        | x16           | 90 | 80 | 75 | mA |
| Operating one bank active-read-precharge current: I<sub>OUT</sub> = 0mA; BL = 4, CL = CL (I<sub>DD</sub>), AL = 0; <sup>t</sup>CK = <sup>t</sup>CK (I<sub>DD</sub>), <sup>t</sup>RC = <sup>t</sup>RC (I<sub>DD</sub>), <sup>t</sup>RAS = <sup>t</sup>RAS MIN (I<sub>DD</sub>), <sup>t</sup>RCD = <sup>t</sup>RCD (I<sub>DD</sub>); CKE is HIGH, CS# is HIGH between valid commands; Address bus inputs are switching; Data pattern is same as I<sub>DD4W</sub> | I<sub>DD1</sub> | x4, x8 | 100 | 90 | 85 | mA |
|                     |        | x16           | 100 | 90 | 85 | mA |
| Precharge power-down current: All banks idle; … | I<sub>DD2P</sub> | x4, x8, x16 | 5 | 5 | 5 | mA |

| I<sub>DD</sub> Parameters                        | -3  | -37E | -5E | Units          |
|--------------------------------------------------|-----|------|-----|----------------|
| CL (I<sub>DD</sub>)                              | 5   | 4    | 3   | <sup>t</sup>CK |
| <sup>t</sup>RCD (I<sub>DD</sub>)                 | 15  | 15   | 15  | ns             |
| <sup>t</sup>RC (I<sub>DD</sub>)                  | 60  | 60   | 55  | ns             |
| <sup>t</sup>RRD (I<sub>DD</sub>) - x4/x8 (1KB)   | 7.5 | 7.5  | 7.5 | ns             |
| <sup>t</sup>RRD (I<sub>DD</sub>) - x16 (2KB)     | 10  | 10   | 10  | ns             |
| <sup>t</sup>CK (I<sub>DD</sub>)                  | 3   | 3.75 | 5   | ns             |





#### **Micron Power Calculator**

- Three steps
  - Power subcomponents are calculated based on data sheet specifications
  - Power is derated based on the command scheduling in the system
  - Power is derated to the system's actual operating VDD and clock frequency
- Power derating
  - The scheduled power (Psch) assumes a system operating at worst-case VDD and at the clock frequency defined in the data sheet
  - However, most systems operate at different clock frequencies or operating voltages than the ones defined in the data sheet
  - The system power (Psys) is Psch derated to the actual operating conditions
- VDD derating

$$\text{Psys}(XXX) = \text{Psch}(XXX) \times \left(\frac{\text{use VDD}}{\text{max spec VDD}}\right)^{2}$$

- Clock frequency derating

$$\text{Psys}(XXX) = \text{Psch}(XXX) \times \frac{\text{use freq}}{\text{spec freq}}$$





#### **Micron Power Calculator**

Some parameters are frequency dependent while others are not

$$\text{Psys}(\text{PRE\_PDN}) = \text{Psch}(\text{PRE\_PDN}) \times \left[\frac{\text{use VDD}}{\text{max spec VDD}}\right]^{2}$$

$$\text{Psys}(\text{ACT\_PDN}) = \text{Psch}(\text{ACT\_PDN}) \times \left[\frac{\text{use VDD}}{\text{max spec VDD}}\right]^{2}$$

$$\text{Psys}(\text{PRE\_STBY}) = \text{Psch}(\text{PRE\_STBY}) \times \left[\frac{\text{use freq}}{\text{spec freq}}\right] \times \left[\frac{\text{use VDD}}{\text{max spec VDD}}\right]^{2}$$

$$\text{Psys}(\text{ACT\_STBY}) = \text{Psch}(\text{ACT\_STBY}) \times \left[\frac{\text{use freq}}{\text{spec freq}}\right] \times \left[\frac{\text{use VDD}}{\text{max spec VDD}}\right]^{2}$$

$$\text{Psys}(\text{ACT}) = \text{Psch}(\text{ACT}) \times \left[\frac{\text{use VDD}}{\text{max spec VDD}}\right]^{2}$$

$$\text{Psys}(\text{WR}) = \text{Psch}(\text{WR}) \times \left[\frac{\text{use freq}}{\text{spec freq}}\right] \times \left[\frac{\text{use VDD}}{\text{max spec VDD}}\right]^{2}$$

$$\text{Psys}(\text{RD}) = \text{Psch}(\text{RD}) \times \left[\frac{\text{use freq}}{\text{spec freq}}\right] \times \left[\frac{\text{use VDD}}{\text{max spec VDD}}\right]^{2}$$

$$\text{Psys}(\text{REF}) = \text{Psch}(\text{REF}) \times \left[\frac{\text{use VDD}}{\text{max spec VDD}}\right]^{2}$$
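The eight derating equations differ only in whether the frequency ratio is applied, so they reduce to one function plus a lookup table. The sketch below follows the equations above; the function name, argument names, and the example numbers are assumptions for illustration.

```python
# True where the equation includes the clock frequency ratio.
FREQ_DEPENDENT = {
    "PRE_PDN": False,
    "ACT_PDN": False,
    "PRE_STBY": True,
    "ACT_STBY": True,
    "ACT": False,
    "WR": True,
    "RD": True,
    "REF": False,
}


def derate(psch, component, use_vdd, max_spec_vdd, use_freq, spec_freq):
    """Derate a scheduled power value Psch to the system power Psys."""
    scale = (use_vdd / max_spec_vdd) ** 2
    if FREQ_DEPENDENT[component]:
        scale *= use_freq / spec_freq
    return psch * scale


# Hypothetical system: nominal VDD equal to the spec maximum, half the spec clock.
print(derate(100.0, "RD", use_vdd=1.9, max_spec_vdd=1.9, use_freq=100, spec_freq=200))
print(derate(100.0, "REF", use_vdd=1.9, max_spec_vdd=1.9, use_freq=100, spec_freq=200))
```

With these inputs the read power halves while the refresh power is unchanged, which is the distinction between frequency-dependent and frequency-independent components.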





- Energy state machine
  - Energy state machine is a finite state machine
  - States denote static power consumption
  - Transitions denote dynamic power consumption



(a) asynchronous energy state machine



(b) synchronous energy state machine





Energy state machine (Memory device)

- SRAM
- DRAM
- SDRAM
- NAND Flash
- etc.







Annotation of energy state machine (Event-accurate energy measurement)







- Annotation of energy state machine (Leakage energy consumption triggered by an asynchronous strobe signal)
  - Leakage energy consumption is denoted by







### **Memory Bus Encoding**

Bus encoding concept



#### Bus power cost

- Power cost
  - HDD: Hamming distance dependent power
  - WDS: Weight dependent static power
- HDD
  - Due to the switching capacitance of the load and PCB track
- WDS
  - Due to the static power of the bus driver and termination resistors
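Both cost terms are easy to compute for a sequence of bus words. The sketch below assumes that the weight of a word is its number of 1 bits (which logic level actually draws static current depends on the termination scheme) and uses two hypothetical coefficients, `k_hdd` and `k_wds`, to combine the terms.

```python
def hamming_distance(a, b):
    """Number of bus lines that toggle between two consecutive words."""
    return bin(a ^ b).count("1")


def weight(word):
    """Number of 1 bits in a word (assumed to be the static-power level)."""
    return bin(word).count("1")


def bus_cost(words, k_hdd, k_wds, initial=0):
    """Total bus power cost of a word sequence under the HDD + WDS model."""
    cost = 0.0
    previous = initial
    for word in words:
        cost += k_hdd * hamming_distance(previous, word)   # switching part
        cost += k_wds * weight(word)                       # static part
        previous = word
    return cost


sequence = [0b00000000, 0b11111111, 0b11110000]
print(bus_cost(sequence, k_hdd=1.0, k_wds=0.0))   # Hamming-distance cost only
print(bus_cost(sequence, k_hdd=0.0, k_wds=1.0))   # weight cost only
```

Separating the two terms makes it possible to see which one dominates for a given driver and termination, and therefore which one an encoding scheme should target.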







- The clock frequencies of the processor and memory system do not affect the number of transactions
- Execution time decreases as clock frequency increases

| Address bus HDD | Address bus WDS | Data bus HDD | Data bus WDS |
|-----------------|-----------------|--------------|--------------|
| …7/4.46         | 11.79/41.36     | 4.13/14.48   | 1.05/3.67    |
| …7/5.71         | 9.20/41.32      | 4.13/18.54   | 1.05/4.70    |
| …7/7.97         | 6.58/41.24      | 4.13/25.86   | 1.05/6.55    |
| …7/9.90         | 5.29/41.17      | 4.13/32.13   | 1.05/8.14    |
| …/13.06         | 4.00/41.04      | 4.13/42.37   | 1.05/10.74   |
| …/19.03         | 2.73/40.78      | 4.13/61.73   | 1.05/15.64   |
| … KB/4word/2-way-set-associative cache | | | |
| …7/7.75         | 6.72/40.94      | 4.13/25.16   | 2.09/12.75   |
| …7/9.90         | 5.29/41.17      | 4.13/32.13   | 1.05/8.14    |
| …/10.50         | 5.00/41.23      | 4.13/34.05   | 0.84/6.90    |
| …/10.68         | 4.92/41.31      | 4.13/34.63   | 0.70/5.85    |
| …/11.22         | 4.69/41.35      | 4.13/36.41   | 0.52/4.61    |





Static power consumption







Power consumption coefficients






- Bus invert coding
  - Original bus signal + INV signal
  - If sending the next word as-is would incur a high bus power cost, invert the bus data and assert the INV signal



- Example
  - 00000000, 11111111 → 00000000 (0), 00000000 (1)
- Inversion decision is the key
  - HDD based, WDS based, or both
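A minimal sketch of bus-invert coding with the two decision rules follows. It assumes an 8-bit bus whose lines start at 0, ignores the switching of the INV line itself, and assumes logic 1 is the costly static level for the weight-based rule. The function names and the `scheme` parameter are hypothetical.

```python
def popcount(x):
    return bin(x).count("1")


def bus_invert_encode(words, width=8, scheme="H"):
    """Encode words as (driven_value, inv_bit) pairs.

    scheme "H": invert when more than half of the lines would toggle.
    scheme "W": invert when more than half of the lines would be 1.
    """
    mask = (1 << width) - 1
    prev = 0  # value currently on the bus lines (assumed initial state)
    out = []
    for w in words:
        if scheme == "H":
            invert = popcount(prev ^ w) > width // 2
        elif scheme == "W":
            invert = popcount(w) > width // 2
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        driven = (~w & mask) if invert else w
        out.append((driven, int(invert)))
        prev = driven
    return out


def bus_invert_decode(encoded, width=8):
    """Recover the original words from (driven_value, inv_bit) pairs."""
    mask = (1 << width) - 1
    return [(d ^ mask) if inv else d for d, inv in encoded]


# The slide example: 00000000, 11111111.
encoded = bus_invert_encode([0b00000000, 0b11111111], scheme="H")
print([(format(d, "08b"), inv) for d, inv in encoded])
```

The receiver only needs the INV bit to undo the inversion, so the decision rule can be changed on the transmitter side without touching the decoder.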







HDD-based and WDS-based inversion decisions contradict each other
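A single transfer is enough to show the two rules disagreeing. The example below assumes an 8-bit bus and that logic 1 is the costly static level; the values are chosen for illustration only.

```python
def popcount(x):
    return bin(x).count("1")


WIDTH = 8
MASK = (1 << WIDTH) - 1
prev, nxt = 0b11111111, 0b11111110

# Hamming-distance rule: only 1 line toggles, so sending as-is is cheaper.
h_invert = popcount(prev ^ nxt) > WIDTH // 2
# Weight rule: 7 lines would sit at 1, so inverting is cheaper.
w_invert = popcount(nxt) > WIDTH // 2

# Following the weight rule here makes 7 lines toggle instead of 1.
toggles_if_inverted = popcount(prev ^ (~nxt & MASK))

print("H-based inverts:", h_invert)
print("W-based inverts:", w_invert)
print("toggles if inverted:", toggles_if_inverted)
```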





- In terms of data-bus energy consumption
  - The Hamming-distance-based (H-based) bus-invert coding scheme is superior for the JPEG compressor, JPEG decompressor, and MPEG4 decoder
  - For MP3, the weight-based (W-based) scheme gives the larger reduction

| Application | Decision scheme | HDD  | WDS  | Total | Reduction ratio (%) |
|-------------|-----------------|------|------|-------|---------------------|
| MP3         | no encoding     | 0.18 | 0.40 | 0.58  | 0.0                 |
|             | H-based         | 0.13 | 0.20 | 0.33  | 43.9                |
|             | W-based         | 0.16 | 0.02 | 0.18  | 69.9                |
| CJPEG       | no encoding     | 4.13 | 1.05 | 5.18  | 0.0                 |
|             | H-based         | 3.37 | 1.01 | 4.38  | 15.4                |
|             | W-based         | 4.28 | 0.57 | 4.85  | 6.4                 |
| DJPEG       | no encoding     | 4.15 | 1.17 | 5.32  | 0.0                 |
|             | H-based         | 3.37 | 1.02 | 4.39  | 17.5                |
|             | W-based         | 4.25 | 0.53 | 4.78  | 10.0                |
| MPEG4       | no encoding     | 2.50 | 0.85 | 3.35  | 0.0                 |
|             | H-based         | 1.95 | 0.80 | 2.75  | 17.8                |
|             | W-based         | 2.58 | 0.44 | 3.02  | 9.9                 |





# **Energy Reduction Practices**

Larger cache block size increases the HDD energy of the data bus, but decreases the HDD energy of the address bus

| Parameter       | Value | Address bus HDD | Address bus WDS | Data bus HDD | Data bus WDS |
|-----------------|-------|-----------------|-----------------|--------------|--------------|
| Cache size (KB) | 1     | 5.78/22.44      | 10.09/39.18     | 16.44/63.83  | 4.48/17.40   |
|                 | 2     | 4.63/20.55      | 8.82/39.20      | 13.70/60.84  | 3.41/15.13   |
|                 | 4     | 3.91/19.01      | 8.05/39.15      | 12.14/59.05  | 2.73/13.29   |
|                 | 8     | 1.27/9.90       | 5.29/41.17      | 4.13/32.13   | 1.05/8.14    |
|                 | 16    | 0.62/5.59       | 4.65/42.03      | 2.17/19.61   | 0.53/4.79    |
| $f_M$ @66MHz, $f_P$ @266MHz, 4word/2-way-set-associative cache | | | | | |
| Cache set       | 1     | 1.30/10.04      | 5.31/41.15      | 4.11/31.86   | 1.14/8.86    |
|                 | 2     | 1.27/9.90       | 5.29/41.17      | 4.13/32.13   | 1.05/8.14    |
|                 | 4     | 1.36/10.36      | 5.40/41.13      | 4.45/33.90   | 1.09/8.27    |
|                 | 8     | 1.47/10.87      | 5.60/41.46      | 4.81/35.63   | 1.15/8.52    |
| $f_M$ @66MHz, $f_P$ @266MHz, 8KB/4word cache | | | | | |
| Block           | 4     | 1.27/9.90       | 5.29/41.17      | 4.13/32.13   | 1.05/8.14    |
|                 |       |                 |                 | 5.15/42.79   | 1.48/12.28   |





Auto-precharge



After a burst-mode access, the controller closes the row, and the SDRAM returns to the idle state







Active page



- The SDRAM may also remain in the row-active state after a burst-mode access
- If the row address of the next access is equal to the current one (row hit), the controller does not need to re-issue the row address, and the data can be accessed directly from the sense amplifiers
- If the next access refers to a different row address (row miss) while the SDRAM remains in the row-active state in expectation of a hit on the same row, the controller needs to close the current row and open the new one
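The trade-off between the two policies can be illustrated with a toy latency model. This is a sketch under simplifying assumptions: a single bank, the precharge after an auto-precharge access is hidden behind idle time, and the cycle counts `t_rcd`, `t_cas` and `t_rp` are hypothetical rather than taken from any SDRAM datasheet.

```python
def access_latency(rows, policy, t_rcd=2, t_cas=2, t_rp=2):
    """Total access latency in cycles for a sequence of row addresses."""
    open_row = None   # row currently held open (active-page policy only)
    total = 0
    for row in rows:
        if policy == "auto_precharge":
            # Row is always closed: every access activates, then reads.
            total += t_rcd + t_cas
        elif policy == "active_page":
            if open_row is None:
                total += t_rcd + t_cas           # first access, bank idle
            elif open_row == row:
                total += t_cas                   # row hit
            else:
                total += t_rp + t_rcd + t_cas    # row miss: close, then reopen
            open_row = row
        else:
            raise ValueError(f"unknown policy: {policy}")
    return total


local = [5, 5, 5, 9]      # mostly row hits
scattered = [1, 2, 3, 4]  # every access misses
for name, trace in (("local", local), ("scattered", scattered)):
    print(name,
          access_latency(trace, "auto_precharge"),
          access_latency(trace, "active_page"))
```

With good row locality the active-page policy wins, while a scattered access pattern pays the extra precharge on every miss. An energy comparison would also need the power of the idle and row-active states, which this latency-only sketch leaves out.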







- Auto-precharge policy for a mid-performance system
  - CL: common-mode leakage power
  - CD: common-mode dynamic power

| Parameter | CL        | CD         | WDD      | Total      |
|-----------|-----------|------------|----------|------------|
| 100       | 19.0/66.6 | 65.8/230.8 | 2.3/8.1  | 87.1/305.5 |
| 133       | 15.3/68.9 | 63.5/285.2 | 2.3/10.4 | 81.2/364.4 |
| 200       | 11.7/73.0 | 61.2/383.1 | 2.3/14.5 | 75.1/470.6 |
| 266       | 9.8/76.5  | 60.0/467.1 | 2.3/18.0 | 72.2/561.6 |
| 400       | 8.0/82.3  | 58.9/604.3 | 2.3/23.7 | 69.2/710.3 |
| 800       | 6.2/93.2  | 57.8/863.6 | 2.3/34.5 | 66.3/991.3 |
| … @66MHz, 8KB/4word/2-way-set-associative cache | | | | |
| 33        | 14.2/86.8 | 61.1/372.3 | 2.3/14.1 | 77.7/473.2 |
| 66        | 9.8/76.5  | 60.0/467.1 | 2.3/18.0 | 72.2/561.6 |
| 83        | 8.9/73.8  | 59.8/493.3 | 2.3/19.0 | 71.1/586.1 |
| 100       | 8.8/73.7  | 59.7/501.1 | 2.3/19.4 | 70.8/594.2 |
| 133       | 8.0/70.5  | 59.6/525.4 | 2.3/20.4 | 69.9/616.2 |





- Auto-precharge policy for a mid-performance system
  - CL: common-mode leakage power
  - CD: common-mode dynamic power

| Parameter       | Value | CL        | CD           | WDD       | Total        |
|-----------------|-------|-----------|--------------|-----------|--------------|
| Cache size (KB) | 1     | 24.7/96.1 | 275.8/1070.9 | 10.8/41.8 | 311.3/1208.7 |
|                 | 2     | 20.9/93.0 | 222.9/990.0  | 9.0/40.1  | 252.9/1123.1 |
|                 | 4     | 18.6/90.6 | 191.4/930.7  | 8.0/39.0  | 218.1/1060.4 |
|                 | 8     | 9.8/76.5  | 60.0/467.1   | 2.3/18.0  | 72.2/561.6   |
|                 | 16    | 7.6/69.1  | 30.0/270.9   | 1.1/9.8   | 38.7/349.8   |
| $f_M$ @66MHz, $f_P$ @266MHz, 4word/2-way-set-associative cache | | | | | |
| Cache set       | 1     | 9.9/77.0  | 60.6/469.7   | 2.2/17.2  | 72.8/563.9   |
|                 | 2     | 9.8/76.5  | 60.0/467.1   | 2.3/18.0  | 72.2/561.6   |
|                 | 4     | 10.1/77.2 | 64.9/494.3   | 2.5/19.3  | 77.6/590.8   |
|                 | 8     | 10.5/78.1 | 70.8/524.5   | 2.8/20.8  | 84.2/623.4   |





- Auto-precharge policy for a high-performance system
  - CL: common-mode leakage power
  - CD: common-mode dynamic power

| Parameter                    | Value | CL         | CD         | WDD      | Total      |
|------------------------------|-------|------------|------------|----------|------------|
| Processor clock ($f_P$), MHz | 133   | 27.0/133.6 | 25.7/126.9 | 1.1/5.2  | 55.1/272.5 |
|                              | 200   | 19.6/140.6 | 26.4/182.8 | 1.0/7.5  | 47.1/337.6 |
|                              | 266   | 15.7/145.5 | 26.1/236.2 | 1.0/9.7  | 42.8/398.1 |
|                              | 400   | 11.7/153.6 | 25.8/331.7 | 1.0/13.7 | 38.6/505.6 |
|                              | 800   | 7.3/163.3  | 25.5/560.9 | 1.0/23.2 | 33.9/753.7 |
|                              | 1000  | 6.4/165.8  | 25.4/651.0 | 1.0/26.9 | 32.9/850.0 |
| $f_M$ @100MHz, 16KB/4word/2-way-set-associative cache | | | | | |
| Memory clock ($f_M$), MHz    | 33    | 15.3/155.6 | 26.0/258.4 | 1.0/10.6 | 42.3/430.9 |
|                              | 66    | 12.5/155.1 | 25.8/315.1 | 1.0/13.0 | 39.3/489.6 |
|                              | 83    | 11.8/154.2 | 25.8/330.2 | 1.0/13.6 | 38.7/504.6 |
|                              | 100   | 11.7/153.6 | 25.8/331.7 | 1.0/13.7 | 38.6/505.6 |
|                              |       |            |            | 1.0/14.3 | 38.0/521.0 |





- Auto-precharge policy for a high-performance system
  - CL: common-mode leakage power
  - CD: common-mode dynamic power

| Parameter      | Value | CL         | CD           | WDD      | Total        |
|----------------|-------|------------|--------------|----------|--------------|
| Cache size, KB | 1     | 31.0/172.2 | 240.9/1332.2 | 9.7/54.0 | 281.6/1564.9 |
|                | 2     | 26.9/172.1 | 194.0/1233.3 | 8.1/52.0 | 229.1/1464.0 |
|                | 4     | 24.4/170.3 | 166.9/1158.3 | 7.3/50.7 | 198.6/1385.8 |
|                | 8     | 14.9/164.9 | 53.5/585.9   | 2.2/24.2 | 70.6/781.5   |
|                | 16    | 11.7/153.6 | 25.8/331.7   | 1.0/13.7 | 38.6/505.6   |
| $f_M$@100MHz, $f_P$@400MHz, 4word/2-way-set-associative cache | | | | | |
| Cache set      | 1     | 12.7/163.9 | 28.3/358.6   | 1.1/14.5 | 42.1/543.5   |
|                | 2     | 11.7/153.6 | 25.8/331.7   | 1.0/13.7 | 38.6/505.6   |
|                | 4     | 11.3/149.5 | 25.2/325.0   | 1.0/13.4 | 37.5/494.5   |
|                | 8     | 11.1/147.2 | 24.0/311.8   | 1.0/13.0 | 36.0/478.6   |





Conventional CRB (Column, Row, Bank) address alignment







CBR (Column, Bank, Row) address alignment for higher row-hit ratio
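The difference between the two alignments can be sketched in a few lines of Python. The slides give only the field order, so two things here are assumptions: the fields are read from least-significant to most-significant bits, and the widths (8 column bits, 2 bank bits, 12 row bits) are hypothetical. The trace is deliberately chosen to favor CBR: two streams that sit one page apart and are accessed alternately.

```python
from dataclasses import dataclass

# Hypothetical field widths; the slides do not give the SDRAM geometry.
COL_BITS, BANK_BITS, ROW_BITS = 8, 2, 12


@dataclass(frozen=True)
class SdramAddress:
    row: int
    bank: int
    column: int


def _take(addr, bits):
    """Split off the lowest `bits` bits; return (field, remaining address)."""
    return addr & ((1 << bits) - 1), addr >> bits


def decode_crb(addr):
    """CRB: column in the low bits, then row, then bank (assumed LSB to MSB)."""
    column, addr = _take(addr, COL_BITS)
    row, addr = _take(addr, ROW_BITS)
    bank, _ = _take(addr, BANK_BITS)
    return SdramAddress(row, bank, column)


def decode_cbr(addr):
    """CBR: column in the low bits, then bank, then row (assumed LSB to MSB)."""
    column, addr = _take(addr, COL_BITS)
    bank, addr = _take(addr, BANK_BITS)
    row, _ = _take(addr, ROW_BITS)
    return SdramAddress(row, bank, column)


def row_hit_ratio(addresses, decode):
    """Fraction of accesses whose row is already open in the target bank.

    Models an active-page policy with one open row per bank.
    """
    open_rows = {}  # bank -> currently open row
    hits = 0
    for addr in addresses:
        d = decode(addr)
        if open_rows.get(d.bank) == d.row:
            hits += 1
        open_rows[d.bank] = d.row
    return hits / len(addresses)


# Two streams one page (256 columns) apart, accessed alternately.
trace = [base + i for i in range(8) for base in (0, 256)]
print(row_hit_ratio(trace, decode_crb))  # 0.0
print(row_hit_ratio(trace, decode_cbr))  # 0.875
```

Under the assumed CRB layout both streams map to different rows of the same bank and evict each other on every access. Under CBR they land in different banks, so each bank keeps its row open.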





#### High performance configuration **row-hit ratio** versus **execution time**







#### High performance configuration **row-hit ratio** versus **energy consumption**







### SDRAM mode control

- Even with an elaborate bus encoding scheme, we reduce only around 1% of the total energy of SDRAM devices
  - HDD energy is not observed in SDRAM devices
  - Actual portion of WDD energy is very small
- SDRAM mode control schemes are introduced
  - Forcing SDRAM devices to active (high energy state) or idle mode (low energy state)
  - Shutting down SDRAM devices
- The first mode control scheme requires correct estimation of row hit behavior
- The second mode control scheme requires proper **break-even** time for shutting down the devices







- High-performance configuration: idle clocks between successive memory operations
  - Cache hit ratio determines the vertical scale of the spectrum
  - The processor clock frequency determines the horizontal scale
  - As the cache-hit ratio increases, new spectra appear on the right side of the graph



(Figure: idle-clock spectrum; axis label: frequency (thousands))




High-performance configuration: idle clocks between successive memory operations or successive row hits with CRB and CBR





- Commercial SDRAM controllers commonly set the time-out value for delayed precharge in 256-clock steps, or cannot adjust the value at all
  - However, the dominant row-hit spectrum is located at 3 clocks, which is therefore a desirable time-out value
  - When we adopt CBR alignment, the magnitude of the dominant row-hit spectrum at 3 clocks becomes larger than that of CRB alignment, and a new row-hit spectrum appears at 48 clocks
  - However, setting the time-out value to 48 clocks is not a good idea, because the large CL energy in the active mode increases the overall energy consumption
  - The optimal time-out values
    - MP3 decoder: 3 clocks
    - JPEG compressor and JPEG decompressor: 4 clocks each
    - MPEG4 decoder: 10 clocks
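The trade-off behind these time-out values can be illustrated with a toy energy model. Everything numeric below is hypothetical: the per-clock active and idle energies, the activation cost, and the gap trace. The trace only mimics the shape described above, with a dominant row-hit peak at 3 clocks, a smaller one at 48 clocks, and long gaps that end in a row miss. This is a sketch of the reasoning, not the model used to obtain the values listed above.

```python
def delayed_precharge_energy(gaps, timeout,
                             e_active=1.0, e_idle=0.2, e_activate=10.0):
    """Energy of a gap trace under a delayed-precharge time-out (arbitrary units).

    gaps: iterable of (idle_clocks, same_row) pairs, one per pair of
          successive memory operations.
    The row stays open for `timeout` idle clocks, then is precharged and the
    device drops to the cheaper idle mode. All energy values are hypothetical.
    """
    total = 0.0
    for idle_clocks, same_row in gaps:
        if idle_clocks <= timeout:
            # Row still open when the next access arrives.
            total += idle_clocks * e_active
            if not same_row:
                total += e_activate  # row miss: precharge and re-activate
        else:
            # Time-out expired: precharge, wait in idle mode, re-activate.
            total += timeout * e_active
            total += (idle_clocks - timeout) * e_idle
            total += e_activate
    return total


def best_timeout(gaps, candidates):
    """Return the candidate time-out with the lowest modeled energy."""
    return min(candidates, key=lambda t: delayed_precharge_energy(gaps, t))


# Hypothetical trace: many row hits after 3 idle clocks, a few after 48,
# and some long gaps that end in a row miss.
gaps = [(3, True)] * 100 + [(48, True)] * 10 + [(100, False)] * 20

print(best_timeout(gaps, range(0, 257)))  # 3
```

With these assumed numbers, a 48-clock time-out captures the second peak but pays active-mode energy during every long gap, and a 256-clock time-out never precharges early at all, so both cost more than the 3-clock setting.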







- The decision to shut down the SDRAM is driven by the idle-time distribution
  - It is important to account for both dynamic cost and static cost when calculating the mode control energy overhead
  - For I → IPD → I (when using idle-mode power down), the energy cost is $(0.0038 + 1.6 \cdot 10^{-4}n)\tau$, where $n$ is the dwell time in state IPD in clock cycles
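A break-even dwell time follows from comparing this cost with the cost of simply staying in idle mode for the same $n$ clocks. The idle-mode coefficient is not given on this slide, so `idle_coeff` below is a hypothetical parameter expressed in the same units as the two constants in the formula (energy per $\tau$). The sketch also ignores any wake-up latency.

```python
# Constants from the I -> IPD -> I cost, (0.0038 + 1.6e-4 * n) * tau.
TRANSITION_COST = 0.0038      # fixed cost of entering and leaving IPD
IPD_COEFF_PER_CLOCK = 1.6e-4  # cost per clock spent in IPD


def break_even_dwell(idle_coeff):
    """Dwell time (clocks) beyond which power down beats staying idle.

    idle_coeff is a hypothetical per-clock cost of remaining in idle mode,
    in the same units as the constants above. Staying idle costs
    idle_coeff * n, so the break-even point solves
        idle_coeff * n = TRANSITION_COST + IPD_COEFF_PER_CLOCK * n.
    """
    saving_per_clock = idle_coeff - IPD_COEFF_PER_CLOCK
    if saving_per_clock <= 0:
        raise ValueError("power down never pays off for this idle cost")
    return TRANSITION_COST / saving_per_clock


def should_power_down(expected_idle_clocks, idle_coeff):
    """True if the expected idle period exceeds the break-even dwell time."""
    return expected_idle_clocks > break_even_dwell(idle_coeff)


# With an assumed idle cost of 5.4e-4 per clock, break-even is about 10 clocks.
print(should_power_down(20, 5.4e-4))  # True
print(should_power_down(5, 5.4e-4))   # False
```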







### Energy reduction of SDRAM devices

- Mid-performance configuration with auto-precharge policy
  - In most cases, the auto-precharge policy achieves slightly better performance and lower energy consumption than the active-page policy with conventional CRB address alignment
  - However, the active-page policy with CBR address alignment achieves a 2.9% performance enhancement on average
  - When we use the given delayed precharge policy with CBR alignment and idle-mode power down, we achieve a 1.3% performance enhancement and a 10.9% energy reduction on average

| App.  | Reduction technique                 | CL   | CD   | WDD | Total | Reduction ratio (%) | Execution time (ms) |
|-------|-------------------------------------|------|------|-----|-------|---------------------|---------------------|
| CJPEG | active page (CRB)                   | 21.0 | 56.5 | 2.2 | 79.7  | -11.8               | 130.5               |
|       | active page (CBR)                   | 20.3 | 50.0 | 1.9 | 72.3  | -1.4                | 123.5               |
|       | auto precharge                      | 9.8  | 59.2 | 2.3 | 71.3  | 0.0                 | 128.5               |
|       | auto precharge, power down          | 4.9  | 59.2 | 2.3 | 66.4  | 6.9                 | 131.0               |
|       | delayed precharge (CBR)             | 12.6 | 54.6 | 2.0 | 69.2  | 3.0                 | 125.0               |
|       | delayed precharge (CBR), power down | 8.4  | 54.6 | 2.0 | 65.0  | 8.9                 | 126.2               |

Energy consumption (mJ), 24M instructions





- High-performance configuration with active-page policy
  - The active-page policy with the given CBR alignment both consumes less energy and performs better than the active-page policy with conventional CRB alignment
  - When we use the given delayed precharge policy with CBR alignment and idle-mode power down, we achieve a 2.4% performance enhancement and a 25.1% energy reduction on average
  - In addition, the active-page policy with conventional CRB alignment does not improve performance noticeably compared with the other reduction techniques

| App.  | Reduction technique                 | CL   | CD   | WDD | Total | Reduction ratio (%) | Execution time (ms) |
|-------|-------------------------------------|------|------|-----|-------|---------------------|---------------------|
| CJPEG | active page (CRB)                   | 11.7 | 25.8 | 1.0 | 38.6  | 0.0                 | 76.3                |
|       | active page (CBR)                   | 11.5 | 23.4 | 0.9 | 35.8  | 7.3                 | 73.6                |
|       | auto precharge                      | 5.3  | 26.7 | 1.1 | 33.1  | 14.2                | 75.1                |
|       | auto precharge, power down          | 2.1  | 26.7 | 1.1 | 29.8  | 22.7                | 75.9                |
|       | delayed precharge (CBR)             | 6.0  | 26.0 | 1.0 | 33.1  | 14.3                | 74.1                |
|       | delayed precharge (CBR), power down | 3.0  | 26.0 | 1.0 | 30.0  | 22.2                | 74.7                |

Energy consumption (mJ), 24M instructions
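The reduction ratios in the table above can be recomputed from the Total column, taking the active-page (CRB) row as the baseline. The short check below uses only the rounded CJPEG totals from the table. Because those totals are already rounded to one decimal place, the recomputed ratios can differ from the tabulated ones by about 0.1 percentage point.

```python
# CJPEG totals (mJ) and tabulated reduction ratios (%) from the table above.
rows = [
    ("active page (CRB)", 38.6, 0.0),
    ("active page (CBR)", 35.8, 7.3),
    ("auto precharge", 33.1, 14.2),
    ("auto precharge, power down", 29.8, 22.7),
    ("delayed precharge (CBR)", 33.1, 14.3),
    ("delayed precharge (CBR), power down", 30.0, 22.2),
]


def reduction_ratio(baseline_mj, total_mj):
    """Energy reduction relative to the baseline, in percent."""
    return 100.0 * (baseline_mj - total_mj) / baseline_mj


baseline = rows[0][1]  # active page (CRB) is the 0.0% row
for name, total, tabulated in rows:
    computed = reduction_ratio(baseline, total)
    print(f"{name:38s} computed {computed:5.1f}%  tabulated {tabulated:5.1f}%")
```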





- Low-power SDRAM main memory system
  - The energy reduction schemes for memory buses and devices are orthogonal (independent)
  - Mid-performance configuration

| Application | Original | Final | Reduction (%) |
|-------------|----------|-------|---------------|
| MP3         | 26.8     | 16.0  | 40.2          |
| CJPEG       | 83.0     | 71.0  | 14.5          |
| DJPEG       | 80.1     | 68.8  | 14.1          |
| MPEG4       | 62.7     | 53.3  | 15.0          |

Energy consumption (mJ), 24M instructions

High-performance configuration

| Application | Original | Final | Reduction (%) |
|-------------|----------|-------|---------------|
| MP3         | 21.3     | 11.6  | 45.5          |
| CJPEG       | 44.9     | 32.8  | 26.8          |
| DJPEG       | 47.2     | 34.6  | 26.6          |
| MPEG4       | 50.4     | 39.0  | 22.7          |

Energy consumption (mJ), 24M instructions



