KernMLOps

This repository serves as the mono-repo for the KernMLOps research project.

Currently, it only contains scripts for data collection of kernel performance.

Quick Setup:

make dependencies
make hooks

make docker-image

pip install -r requirements.txt
# Installs benchmarks without root privileges
make provision-benchmarks
# If the above fails dependencies may need to be installed with
make provision-benchmarks-admin

# Ensure you have installed your kernel's development headers
# On ubuntu: apt install linux-headers-$(uname -r)
# On redhat: dnf install kernel-devel kernel-headers
# Or
# For first time installation of tooling like kernel-headers, bcc-tools, osquery
# This is optional if docker is present, requires admin permissions, do
# make provision-development

make collect

Tools

[Python-3.12]

This is here to make the minimum python version blatant.

asdf

Provides a declarative set of tools pinned to specific versions for environmental consistency.

These tools are defined in .tool-versions. Run make dependencies to initialize a new environment.

pre-commit

A left shifting tool to consistently run a set of checks on the code repo. Our checks enforce syntax validations and formatting. We encourage contributors to use pre-commit hooks.

# install all pre-commit hooks
make hooks

# run pre-commit on repo once
make pre-commit

bcc

A toolset for building eBPF programs including python bindings.

It is highly recommended to build this from scratch so that its python package can be pip installed to your local environment. By default system packages will only install it for the system python, which may complicate development for user level package management.

Building instructions are provided here for most major distributions.

After installing, the python package can be installed to your local environment via:

pip install -e BCC_BUILD_DIR/src/python/bcc-python3

Verify proper installation with:

import bcc

ansible

Ansible is an automation tool for configuring VMs and baremetal machines.

vagrant

Vagrant is a tool for managing VMs simply across different backend like VirtualBox and libvirt, we recommend using libvirt.

Dependencies

Python

Python is required, at least version 3.12 is required for its generic typing support. This is the default version on Ubuntu 24.04.

Python package dependencies are listed in requirements.txt and can be installed via pip.

They can also be install via conda. If conda is used then it is recommended to then install mamba and use that as a drop in replacement. This can be done with conda install -conda-forge mamba.

Or by poetry.

Then the python packages can be installed via:

conda install -c conda-forge --file requirements.txt
mamba install -c conda-forge --file requirements.txt

Creating VMs

For ubuntu the requirements can be installed with:

sudo apt install -y bc flex bison gcc make libelf-dev libssl-dev \
    squashfs-tools busybox-static tree cpio curl

Contributing

Developers should verify their code passes basic standards by running:

make lint

Developers can automatically fix many common styling issues with:

make format

Usage

Users can run data collection with:

make collect-data

Troubleshooting: Or How I Learned to Shoot My Foot

eBPF Programs

eBPF Programs are statically verified when the python scripts attempt to load them to into the kernel and that is where errors will manifest. When a program fails to compile the error will be the usual sort of C-compiler error. i.e.

/virtual/main.c:53:3: error: call to undeclared function '__bpf_builtin_memset';
    ISO C99 and later do not support implicit function declarations
    [-Wimplicit-function-declaration]
   53 |   __bpf_builtin_memset(&data, 0, sizeof(data));
      |   ^
1 error generated.

For verification errors the entire compiled bytecode will be printed, look for something along the lines of:

invalid indirect read from stack R4 off -32+20 size 24
processed 59 insns (limit 1000000) max_states_per_insn 0
    total_states 3 peak_states 3 mark_read 3

eBPF Padding

The error:

invalid indirect read from stack R4 off -32+20 size 24
processed 59 insns (limit 1000000) max_states_per_insn 0
    total_states 3 peak_states 3 mark_read 3

Indicates that a read in the program is reading uninitialized memory.

That error came from:

struct quanta_runtime_perf_event data;
data.pid = pid;
data.tgid = tgid;
data.quanta_end_uptime_us = ts / 1000;
data.quanta_run_length_us = delta / 1000;
quanta_runtimes.perf_submit(ctx, &data, sizeof(data));

The invalid read was perf_submit since there was extra padding in the data struct that was not formally initialized. To be as robust as possible this should be handled with an explicit __builtin_memset as in:

struct quanta_runtime_perf_event data;
__builtin_memset(&data, 0, sizeof(data));
data.pid = pid;
data.tgid = tgid;
data.quanta_end_uptime_us = ts / 1000;
data.quanta_run_length_us = delta / 1000;
quanta_runtimes.perf_submit(ctx, &data, sizeof(data));

This gives the most robust handling for multiple systems, see here.

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
benchmark/provisioning		benchmark/provisioning
python/kernmlops		python/kernmlops
.clang-format		.clang-format
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.tool-versions		.tool-versions
.yamllint.yaml		.yamllint.yaml
Dockerfile.dev		Dockerfile.dev
Makefile		Makefile
README.md		README.md
Vagrantfile		Vagrantfile
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

KernMLOps

Tools

[Python-3.12]

asdf

pre-commit

bcc

ansible

vagrant

Dependencies

Python

Creating VMs

Contributing

Usage

Troubleshooting: Or How I Learned to Shoot My Foot

eBPF Programs

eBPF Padding

About

Releases

Packages

Languages

utcs-scea/KernMLOps

Folders and files

Latest commit

History

Repository files navigation

KernMLOps

Tools

[Python-3.12]

asdf

pre-commit

bcc

ansible

vagrant

Dependencies

Python

Creating VMs

Contributing

Usage

Troubleshooting: Or How I Learned to Shoot My Foot

eBPF Programs

eBPF Padding

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages