power_tool
Toolkit for fine-grained energy consumption measurements.
GENERAL
The power_tool toolkit uses Intel RAPL and CPU performance counters
to measure the average energy consumption of:
- single CPU operations
- single RAM accesses
Additionally, the average energy consumption of the following components can be quantified with an external power measurement device:
- single network frames (1500 bytes)
- reads and writes to the HDD
The CPU and RAM tests require a CPU with Intel RAPL support and the RAPL sysfs
driver to be installed. You may need to adjust the kernel.perf_event_paranoid
sysctl setting.
The network test uses RAPL for more precise results. For the HDD and network
tests, data is obtained from an external energy measurement device.
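The RAPL sysfs driver exposes cumulative energy counters in microjoules through the Linux powercap tree. The following is a minimal sketch of how such a counter could be sampled and differenced. It is not taken from power_tool.c, and the helper names read_sysfs_u64 and rapl_delta_uj are hypothetical.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical helper: read one unsigned counter from a sysfs file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
int read_sysfs_u64(const char *path, uint64_t *value)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    unsigned long long tmp = 0;
    int ok = (fscanf(f, "%llu", &tmp) == 1);
    fclose(f);
    if (!ok)
        return -1;

    *value = (uint64_t)tmp;
    return 0;
}

/* Hypothetical helper: energy in microjoules consumed between two samples.
 * The counter wraps around at max_range (the value of max_energy_range_uj),
 * so a smaller second sample is treated as exactly one wraparound. */
uint64_t rapl_delta_uj(uint64_t before, uint64_t after, uint64_t max_range)
{
    if (after >= before)
        return after - before;
    return (max_range - before) + after;
}
```

On many systems the package counter is found at /sys/class/powercap/intel-rapl:0/energy_uj, with the wrap limit in max_energy_range_uj in the same directory. The zone numbering can differ between machines, and recent kernels may restrict read access to root.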
START-UP
Before startup the user has to ensure that:
- other background processes are closed as far as possible
- the CPU clock rate is fixed by selecting a suitable governor (e.g. with cpufreq)
The tool checks for the following prerequisites on the system before the respective test can be performed:
- Perf: syscall to read CPU performance counters (needed for CPU, RAM, network tests)
- Intel RAPL (needed for CPU, RAM, network tests)
  - core: power consumption of all cores on the CPU die (needed for CPU tests)
  - uncore: power consumption of shared components on the CPU die (needed for CPU tests)
  - dram: power consumption of the DRAM (needed for RAM test)
- Filesystem: Ramdisks are not supported (for HDD test)
- free space: check for enough free space on HDD (for HDD test)
Furthermore, the tool checks the CPU load before performing tests. If the standard deviation of the power consumption exceeds 0.50 W over 10 measurements, the idle check is repeated until the system load is stable.
// power_tool.c:
#define SHORT_IDLE_FUNC_CYCLES 10 // total 10 sec
#define SHORT_IDLE_FUNC_MAX_SD 0.50 // standard deviation
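The idle check described above can be sketched as follows. This is an illustration under assumptions, not the code from power_tool.c: the function names are hypothetical, the population form of the standard deviation (dividing by n) is assumed, and the variance is compared against the squared threshold so that no square root is needed.

```c
#include <stddef.h>

#define IDLE_SAMPLES 10    /* mirrors SHORT_IDLE_FUNC_CYCLES */
#define IDLE_MAX_SD  0.50  /* mirrors SHORT_IDLE_FUNC_MAX_SD, in watts */

/* Population variance of n power samples, in watts squared. */
double power_variance(const double *samples, size_t n)
{
    if (n == 0)
        return 0.0;

    double mean = 0.0;
    for (size_t i = 0; i < n; i++)
        mean += samples[i];
    mean /= (double)n;

    double var = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = samples[i] - mean;
        var += d * d;
    }
    return var / (double)n;
}

/* Returns 1 if the standard deviation of the samples is at most
 * IDLE_MAX_SD, 0 otherwise. Comparing sd <= limit is equivalent to
 * comparing variance <= limit squared. */
int idle_is_stable(const double *samples, size_t n)
{
    return power_variance(samples, n) <= IDLE_MAX_SD * IDLE_MAX_SD;
}
```

A caller would collect IDLE_SAMPLES power readings, one per second, and repeat the collection for as long as idle_is_stable() returns 0.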
CPU-TEST
NOTE: The CPU test requires Intel RAPL and the perf_event_open() syscall.
This test runs several test cases to measure the energy consumption of some of the most common CPU instructions.
Operation | Description |
---|---|
add | add two values |
sub | subtract two values |
comp | compare two values |
div | divide two values |
mul | multiply two values |
fpadd | add two floating-point values |
fpmul | multiply two floating-point values |
mov | move values from cache |
mov (with cache misses) | move values from RAM |
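Counting the instructions executed by one of the test loops above relies on perf_event_open(). The following sketch shows one way this could look on Linux. It is an assumption about the general approach, not the code used in power_tool.c, and the function names are hypothetical. glibc provides no wrapper for this syscall, so it is invoked through syscall().

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Configure a counter for user-space instructions, initially disabled. */
void init_instruction_counter(struct perf_event_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_HARDWARE;
    attr->size = sizeof(*attr);
    attr->config = PERF_COUNT_HW_INSTRUCTIONS;
    attr->disabled = 1;
    attr->exclude_kernel = 1;
    attr->exclude_hv = 1;
}

/* Open the counter for the calling process (pid 0) on any CPU (cpu -1). */
int open_counter(struct perf_event_attr *attr)
{
    return (int)syscall(SYS_perf_event_open, attr, 0, -1, -1, 0);
}

/* Run work() and return the number of instructions counted while it ran,
 * or -1 if the counter could not be opened or read. */
long long count_instructions(void (*work)(void))
{
    struct perf_event_attr attr;
    init_instruction_counter(&attr);

    int fd = open_counter(&attr);
    if (fd < 0)
        return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    work();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    uint64_t count = 0;
    long long result = -1;
    if (read(fd, &count, sizeof(count)) == (ssize_t)sizeof(count))
        result = (long long)count;
    close(fd);
    return result;
}
```

If count_instructions() returns -1, the kernel.perf_event_paranoid setting may be too restrictive for the current user. A cache-miss counter could be set up the same way with PERF_COUNT_HW_CACHE_MISSES as the config value.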
Output:
As a result, the CPU idle power draw, the average energy consumption for each instruction, the elapsed time, the total number of instructions and the number of cache misses are displayed.
NOTE: The results of different operation benchmarks may not be directly comparable.
TODO: Adjust the tests so that similar behavior can be expected. This involves ensuring the same program flow as well as accounting for operator-specific behavior. For example: integer multiplication may overflow but continue as before, whereas floating-point multiplication can result in NaN or Inf, after which the CPU work in the benchmark loop might change.
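The difference between the two multiplication cases can be illustrated with a small sketch. The functions below are hypothetical and are not the benchmark loops from power_tool.c. They only show that an unsigned integer accumulator silently wraps around, while a floating-point accumulator eventually becomes Inf and stays there.

```c
#include <math.h>
#include <stdint.h>

/* Multiply an unsigned accumulator n times. Unsigned overflow is well
 * defined in C: the value wraps modulo 2^32 and the loop continues. */
uint32_t int_mul_loop(uint32_t start, uint32_t factor, int n)
{
    uint32_t acc = start;
    for (int i = 0; i < n; i++)
        acc *= factor;
    return acc;
}

/* Multiply a double accumulator up to n times. Returns the zero-based
 * iteration at which the value first became Inf or NaN, or -1 if it
 * stayed finite for all n iterations. */
int fp_mul_first_nonfinite(double start, double factor, int n)
{
    double acc = start;
    for (int i = 0; i < n; i++) {
        acc *= factor;
        if (isinf(acc) || isnan(acc))
            return i;
    }
    return -1;
}
```

Once the accumulator is Inf, every further multiplication operates on a special value, which is the situation the TODO above is concerned with.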
TODO: Run each test loop multiple times to calculate the standard deviation of the sampled average. A histogram of the distribution could also be interesting.
NETWORK-TEST
NOTE: This test requires an external power measurement device.
The network test calculates the average energy needed to send one Ethernet frame. Its execution is tuned to last around 60 seconds. Before the test starts, the user has to provide the idle power consumption. During the test, the average consumption under load has to be read from an external measurement device.
This test uses the full output capacity of the network interface to send random UDP packets to a private and (hopefully) unused IP address for about 60 seconds.
// power_tool.c:
#define SOME_IP "192.168.23.123" /* private address space RFC1918 */
Output:
As a result, the total number of sent frames, the average power consumption, the average power input with RAPL package raise, the average energy consumption per frame with and without RAPL package raise and the elapsed time are displayed.
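The per-frame figure follows from the power drawn above idle, the test duration and the number of frames. The sketch below shows this arithmetic under assumptions: the function name and parameters are hypothetical, and treating the "RAPL package raise" as the increase in CPU package power that can be subtracted from the externally measured increase is an interpretation, not something confirmed by power_tool.c.

```c
/* Average energy per sent frame in joules.
 *
 * load_power_w:  average power under load from the external device (W)
 * idle_power_w:  idle power provided by the user (W)
 * rapl_raise_w:  assumed increase of the RAPL package power during the
 *                test (W); pass 0.0 to keep the CPU share in the result
 * elapsed_s:     test duration in seconds
 * frames:        number of frames sent
 *
 * Returns 0.0 if no frames were sent. */
double energy_per_frame_j(double load_power_w, double idle_power_w,
                          double rapl_raise_w, double elapsed_s,
                          unsigned long long frames)
{
    if (frames == 0)
        return 0.0;

    double extra_power_w = load_power_w - idle_power_w - rapl_raise_w;
    return extra_power_w * elapsed_s / (double)frames;
}
```

Calling the function once with rapl_raise_w set to 0.0 and once with the measured package increase would give the two per-frame values mentioned in the output description.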
RAM-TEST
NOTE: This test needs Intel RAPL, the perf_event_open() system call and 128MB of free RAM.
The RAM test calculates the average energy consumption for one RAM access. A 128MB array is created and accessed with read and write operations in such a way that the number of CPU cache misses is maximized.
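One simple way to provoke cache misses is to touch only one byte per cache line of a buffer that is much larger than the CPU caches. The sketch below illustrates this idea. It is not the access pattern from power_tool.c, the function name is hypothetical, and the cache line size of 64 bytes is an assumption that holds for most current x86 CPUs.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64  /* assumed cache line size */

/* Read and write one byte in every cache line of buf, repeated for the
 * given number of passes. The volatile qualifier keeps the compiler from
 * optimizing the accesses away. Returns the number of lines touched. */
size_t touch_every_cache_line(volatile uint8_t *buf, size_t size,
                              unsigned passes)
{
    size_t touched = 0;
    for (unsigned p = 0; p < passes; p++) {
        for (size_t off = 0; off < size; off += CACHE_LINE_BYTES) {
            buf[off] = (uint8_t)(buf[off] + 1);  /* one read, one write */
            touched++;
        }
    }
    return touched;
}
```

With a 128MB buffer each pass touches about two million cache lines. A strictly sequential stride can be recognized by the hardware prefetcher, so a real benchmark might prefer a randomized or larger stride.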
Output:
As a result, the RAM idle power usage before the test, the energy consumption for an average RAM access with and without the previously measured idle consumption, the elapsed time and the total number of cache misses are displayed.
HDD-TEST
NOTE: This test requires at least 8GiB of free space on the HDD.
The HDD test creates a 4GiB file in /var/tmp for sequential read and write
operations. It is filled with previously generated random data using O_SYNC
and O_DIRECT to bypass the OS cache. In between the read and write, another
4GiB file is created and filled in the hope of flooding the HDD cache, as this
cache is not affected by O_DIRECT. This mechanism is used to mimic typical
HDD use cases.
// power_tool.c:
#define MIB 1024
#define CYCLES 4
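Writing with O_DIRECT requires buffers and transfer sizes that are aligned to the block size of the device. The following sketch shows how such a write could be set up on Linux. It is not the code from power_tool.c: the function names are hypothetical, and the alignment of 4096 bytes is an assumption that is sufficient for common disks and file systems.

```c
#define _GNU_SOURCE          /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define DIRECT_ALIGN 4096    /* assumed alignment for O_DIRECT transfers */

/* Allocate a buffer aligned for O_DIRECT. Returns NULL on failure.
 * The buffer is released with free(). */
void *alloc_direct_buffer(size_t size)
{
    void *buf = NULL;
    if (posix_memalign(&buf, DIRECT_ALIGN, size) != 0)
        return NULL;
    return buf;
}

/* Write size bytes from buf to path, bypassing the OS page cache and
 * waiting for the data to reach the device. size should be a multiple
 * of DIRECT_ALIGN. Returns the number of bytes written or -1 on error. */
ssize_t write_direct(const char *path, const void *buf, size_t size)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC | O_DIRECT,
                  0600);
    if (fd < 0)
        return -1;

    size_t total = 0;
    while (total < size) {
        ssize_t n = write(fd, (const char *)buf + total, size - total);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        total += (size_t)n;
    }
    close(fd);
    return (ssize_t)total;
}
```

Opening a file with O_DIRECT fails on file systems that do not support it, such as tmpfs, which matches the prerequisite that RAM disks are not supported for the HDD test.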
Output:
As a result, the total time, the written/read data, the average write/read rate and the average energy consumption are displayed.
USAGE
./power_tool [all | net | hdd | cpu | ram]
all: This option runs all available tests that are possible on the system.
net | hdd | cpu | ram:
Only the chosen test is performed if possible on the system.
The HDD and network tests are interactive and require user input. The test report is displayed and written to a text file in the directory.
ADDITIONAL NOTES
All tests performed by power_tool are extremely sensitive to power variations of the system.
Some mechanisms are in place to detect and subtract strong fluctuations before and during
the tests.
Nevertheless it's extremely important to stop all processes that might interfere with
the measurements of this program! Otherwise the results might be useless!
Take all results with a grain of salt.