Student: Thomas E. Hansen (teh6, 150015673)
Supervisor: Dr. John Thomson
I personally found it helps to have a Python 2 virtual environment running when setting up gem5. The requirements for it can be installed from the venv-reqs-gem5.txt file.
Clone and build the gem5 Simulator. Then, copy the files from the gem5-custom-config-scripts directory to the gem5/configs/example/arm/ directory.
$ cd gem5
$ export M5_PATH=path/to/linux/files
Both of the commands below can be further customised with the following flags:
--big-cpus N
--little-cpus N
--cpu-type=<cpu-type>
Full system simulation without power:
$ ./build/ARM/gem5.opt configs/example/arm/fs_bL_extended.py \
--caches \
--kernel=$M5_PATH/binaries/<kernel-name> \
--disk=$M5_PATH/disks/<disk-image-name>.img \
--bootloader=$M5_PATH/binaries/<bootloader> \
--bootscript=path/to/bootscript.rcS
Full system simulation with power:
$ ./build/ARM/gem5.opt configs/example/arm/fs_bL_extended.py \
--caches \
--kernel=$M5_PATH/binaries/<kernel-name> \
--disk=$M5_PATH/disks/<disk-image-name>.img \
--bootloader=$M5_PATH/binaries/<bootloader> \
--bootscript=path/to/bootscript.rcS \
--example-power
Since the complete data for this project totalled 120GB in size, it is not included here. However, in the extracted-data directory, there are two files: roi-out.csv and roi-out_cfg-totpow.csv. These files contain the data matching several PMU events and were constructed using the data-aggregate.py script. Both files should theoretically work as inputs to the scripts, but the roi-out_cfg-totpow.csv file (which contains configs and the total power, in addition to the stats found in roi-out.csv) is probably safer to use with most of the scripts.
Optionally, create a Python 3 virtualenv and activate it. Install the requirements found in venv-reqs-dataproc.txt.
Each of the scripts uses argparse and so should provide a usage message. Please refer to this for detailed usage instructions.
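For a quick first look at what the aggregated CSVs contain before running the full scripts, something along these lines works. This is a minimal sketch of mine, assuming pandas is available (e.g. in the optional Python 3 virtualenv above) and that the file has a header row; the actual column names depend on the PMU events and configs that were aggregated.
# Minimal sketch: inspect the aggregated data. Assumes pandas is installed and
# the CSV has a header row; column names depend on the aggregated PMU events.
import pandas as pd

df = pd.read_csv("extracted-data/roi-out_cfg-totpow.csv")
print(df.shape)              # number of rows and columns
print(df.columns.tolist())   # which config / stat / power columns are present
print(df.head())             # first few rows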
Create a Python 2 virtualenv and install the requirements found in venv-reqs-gemstone-applypower.txt.
cd into gemstone-applypower and activate the venv.
For simulating Cortex A15
$ ./gemstone_create_equation.py -p models/gs-A15.params -m maps/gem5-A15.map -o gem5-A15
For simulating Cortex A7
$ ./gemstone_create_equation.py -p models/gs-A7.params -m maps/gem5-A7.map -o gem5-A7
- power: Fix regStats for PowerModel and PowerModelState
- sim-power: Fix power model to work with stat groups
Applied using git cherry-pick.
A working index can be found on the old m5sim page. These files should then be retrieved from dist.gem5.org/dist/current/arm/
- Create a new file of zeros (as written below, 1024 B * 1024 = 1 MiB; increase bs and/or count for a larger image, e.g. 1 GiB) using
$ dd if=/dev/zero of=path/to/file.img bs=1024 count=1024
(you may need to be root or use sudo for the next couple of steps)
- Find the next available loopback device
$ losetup -f
- Set up the device returned (e.g. /dev/loop0) with the image file at offset 32256 (63 * 512 bytes, the historical first-partition/track-alignment offset)
$ losetup -o 32256 /dev/loop0 path/to/file.img
- Format the device
$ mke2fs /dev/loop0
- Detach the loopback device
$ losetup -d /dev/loop0
Done. The image can now be mounted and manipulated using
$ mount -o loop,offset=32256 path/to/file.img path/to/mountpoint
IMPORTANT: remember to copy the GNU/NIX binaries necessary for the system you'll be emulating to their appropriate locations on the new disk.
Some details about what to do next can be found here:
The gem5 developers/website openly admit that the DVFS documentation is outdated, leaving the user to read through the source code and example config scripts to try to figure out how to construct the relevant components. This is my attempt at documenting and understanding how it works.
Voltage Domains dictate the voltage values the system can use. It seems gem5 always simulates voltage in FS mode, but simply sets it to 1.0V if the user does not care about voltage simulation (see src/sim/VoltageDomain.py).
To create a voltage domain, either a single voltage value or a list of voltage values must be given. But not simply as a positional argument to the VoltageDomain constructor, no, that would be too simple; it must instead be passed as the keyword argument (kwarg) voltage. To my knowledge, this is not documented anywhere, nor is it easily discoverable from the src/sim/{VoltageDomain.py, voltage_domain.hh, voltage_domain.cc} files.
The example voltage domains I've used are (note that the values have to be specified in descending order):
For the big cluster:
odroid_n2_voltages = ['0.981000V', '0.891000V', '0.861000V', '0.821000V',
                      '0.791000V', '0.771000V', '0.771000V', '0.751000V']
odroid_n2_voltage_domain = VoltageDomain(voltage=odroid_n2_voltages)
For the LITTLE cluster:
odroid_n2_voltages = ['0.981000V', '0.861000V', '0.831000V', '0.791000V',
                      '0.761000V', '0.731000V', '0.731000V', '0.731000V']
odroid_n2_voltage_domain = VoltageDomain(voltage=odroid_n2_voltages)
These numbers were obtained by examining the changes in the sysfs files /sys/class/regulator/regulator.{1,2}/microvolts when using the userspace frequency governor and varying the frequency of the big and LITTLE clusters (respectively) with the cpupower command-line tool.
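For reference, a small helper along the following lines (my own, hypothetical, run on the board itself) converts those microvolt readings into the voltage strings used above; it assumes regulator.1 corresponds to the big cluster and regulator.2 to the LITTLE cluster, as described.
# Hypothetical helper (not part of the project scripts): read a regulator's
# microvolt value from sysfs on the board and format it the way the voltage
# lists above expect, e.g. 981000 -> '0.981000V'.
def read_voltage(regulator_id):
    path = "/sys/class/regulator/regulator.%d/microvolts" % regulator_id
    with open(path) as f:
        microvolts = int(f.read().strip())
    return "%.6fV" % (microvolts / 1e6)

print(read_voltage(1))  # big cluster (regulator.1)
print(read_voltage(2))  # LITTLE cluster (regulator.2)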
NOTE: In gem5 (and, as far as I know, on real hardware) voltage domains apply to CPU sockets. So make sure that the big and LITTLE clusters in the simulator are on different sockets if they need to have different voltage domains (you can inspect the socket through the socket_id value associated with the clusters).
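A quick sanity check along these lines can be added to the config script; it is only a sketch, and the names it uses (system.bigCluster, system.littleCluster, their .cpus lists and the socket_id parameter) are assumptions based on fs_bigLITTLE.py and devices.py, so adjust them to your own configuration.
# Hedged sketch: check that the two clusters sit on different sockets, since
# voltage domains apply per socket. The attribute names are assumptions based
# on fs_bigLITTLE.py / devices.py.
assert system.bigCluster.cpus[0].socket_id != \
       system.littleCluster.cpus[0].socket_id, \
    "big and LITTLE clusters share a socket, so they cannot have separate voltage domains"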
Clock domains dictate what frequencies the CPU(s) can be clocked at (i.e. what steps are available to the DVFS handler) and are associated with a Voltage Domain. I am uncertain what exactly the requirements are for the relationship between the two, especially as the constructor does not seem to complain if the number of available clock values differs from the number of voltage values.
I obtained the following clock values from the Odroid N2 board using the cpupower command-line tool:
For the big cluster:
odroid_n2_clocks = ['1800MHz', '1700MHz', '1610MHz', '1510MHz',
                    '1400MHz', '1200MHz', '1000MHz', '667MHz']
odroid_n2_clk_domain = SrcClockDomain(clock=odroid_n2_clocks,
                                      voltage_domain=odroid_n2_voltage_domain)
For the LITTLE cluster:
odroid_n2_clocks = ['1900MHz', '1700MHz', '1610MHz', '1510MHz',
                    '1400MHz', '1200MHz', '1000MHz', '667MHz']
odroid_n2_clk_domain = SrcClockDomain(clock=odroid_n2_clocks,
                                      voltage_domain=odroid_n2_voltage_domain)
The statements below, whilst possibly correct, seem to go against the way things are done in the example scripts. As such, here is a "better" way of doing things: it turns out that the --big-cpu-clock value(s), when passed on to a CpuCluster sub-class, are used to create a new SrcClockDomain accordingly. Therefore, there are two solutions (of which I have only tested the first):
- Create sub-classes of CpuCluster. Similar to the existing BigCluster and LittleCluster sub-classes, these will extend CpuCluster. However, in addition to the config that these classes specify in their body, also define the two lists of values for the voltage and clock domains respectively. Then, simply pass these lists as the appropriate arguments to the super call at the end of the sub-class's __init__ declaration (3rd and 4th argument at the time of writing, but double-check with your <gem5-root>/configs/example/arm/devices.py file). If you want to add DVFS to the AtomicCluster as well, simply extend this class in a similar manner. FINALLY, make sure to add an entry to the cpu_types dictionary near the end of the file. The entry should have a name for the --cpu-type flag to refer to your classes by, and a 2-tuple (a pair) of clusters for it to instantiate (i.e. put your new DVFS-capable classes here). Your specified DVFS values will now be used when running with those clusters. (A sketch of this approach is given after this list.)
- As mentioned previously, the value(s) passed to the --big-cpu-clock flag are used to create a new SrcClockDomain internally. Hence, another (possibly more flexible) solution is to add a --big-cpu-voltage flag, wire up its values in the configuration script (e.g. <gem5-root>/configs/example/arm/fs_bigLITTLE.py), and pass a list of values for each of the four flags (both voltage and clock for both big and LITTLE cpus).
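To make the first solution concrete, here is a minimal, untested sketch meant to live in the configuration script alongside the existing BigCluster and cpu_types definitions. Rather than re-specifying the CPU and cache config, it extends the existing BigCluster and overrides only the clock and voltage values it passes up (the 3rd and 4th super arguments mentioned above); the class name OdroidN2BigCluster and the reuse of the Odroid N2 values are my own additions, and the argument order should be double-checked against your devices.py.
# Hedged sketch of solution 1 (assumes the BigCluster class and cpu_types dict
# from fs_bigLITTLE.py, and that clock/voltage are the 3rd and 4th arguments
# forwarded to devices.CpuCluster -- verify against your gem5 checkout).
odroid_n2_big_clocks = ['1800MHz', '1700MHz', '1610MHz', '1510MHz',
                        '1400MHz', '1200MHz', '1000MHz', '667MHz']
odroid_n2_big_voltages = ['0.981000V', '0.891000V', '0.861000V', '0.821000V',
                          '0.791000V', '0.771000V', '0.771000V', '0.751000V']

class OdroidN2BigCluster(BigCluster):
    def __init__(self, system, num_cpus, cpu_clock=None, cpu_voltage=None):
        # Ignore any single clock/voltage coming from the command line and
        # pass the full lists of DVFS operating points up instead.
        super(OdroidN2BigCluster, self).__init__(
            system, num_cpus, odroid_n2_big_clocks, odroid_n2_big_voltages)

# Define a LITTLE counterpart the same way, then register both under a new
# --cpu-type name in the cpu_types dictionary, e.g.:
#   cpu_types["odroid-n2"] = (OdroidN2BigCluster, OdroidN2LittleCluster)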