This repo contains Docker images for Vespa development on AlmaLinux 8 (Vespa 8). vespa-build-almalinux-8 is used only for building Vespa, while vespa-dev-almalinux-8 is used for active development of Vespa, including building, unit testing and running system tests. vespa-dev-almalinux-8 depends on vespa-build-almalinux-8. To pull the images:
docker pull docker.io/vespaengine/vespa-build-almalinux-8:latest
docker pull docker.io/vespaengine/vespa-dev-almalinux-8:latest
Commits to master will automatically trigger new builds and deployment on Docker Hub.
Read more at the Vespa project homepage.
The project is covered by the Apache License, Version 2.0.
This guide describes how to build, unit test and system test Vespa on AlmaLinux 8 using Docker or Podman.
If using Podman, replace docker with podman in the commands below.
When doing Vespa development it is important that the turnaround time between code changes and running unit tests and system tests is short. vespa-dev-almalinux-8 provides a complete environment for this. The code is compiled using mvn, cmake and make and then installed into your personal install directory. Vespa can be executed directly from this directory when for instance running system tests.
Make sure Docker has sufficient resources:
Open Docker - Preferences - Resources and set:
- CPUs: Minimum 2. Use 8 or more for faster build times.
- Memory: Minimum 8 GB. 16 GB or more is preferred.
- Disk size: 128 GB.
Make sure Docker can be executed without sudo for the scripts in this guide to work:
sudo groupadd docker
sudo usermod -aG docker $(id -un)
sudo systemctl restart docker
Log out and log in again, or run the sudo su - $USER command, to continue.
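To confirm that the group change has taken effect in the current shell, a quick check along these lines may help. This is a minimal sketch: in_group is a hypothetical helper name, and it only inspects the groups reported by id for the current session.

```shell
# Succeeds if the current session's groups include the named group.
# in_group is a hypothetical helper, not part of this repo.
in_group() {
  id -nG | tr ' ' '\n' | grep -qxF "$1"
}

if in_group docker; then
  echo "docker group is active; sudo should not be needed"
else
  echo "docker group is not active yet; log out and log in again"
fi
```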
Install Podman Desktop:
brew install podman-desktop
Create a new Podman Machine with sufficient resources (Preferences - Resources - Create new ...)
- CPUs: Minimum 2. Use 8 or more for faster build times.
- Memory: Minimum 8 GB. 16 GB or more is preferred.
- Disk size: 128 GB.
- Machine with root privileges: Enabled
The Podman Machine can also be created using podman machine init:
podman machine init --cpus=8 --memory=16384 --disk-size=128 --rootful
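The numbers in that command map onto the resource list above, assuming (as the Podman documentation describes) that --memory is given in MiB and --disk-size in GiB. The sketch below only prints the command; the variable names are illustrative.

```shell
# Resource settings taken from the list above; variable names are illustrative.
cpus=8
memory_gib=16
disk_gib=128

# Assumption: --memory expects MiB, so convert from GiB.
memory_mib=$((memory_gib * 1024))

# Print the resulting command instead of running it.
echo "podman machine init --cpus=${cpus} --memory=${memory_mib} --disk-size=${disk_gib} --rootful"
```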
docker pull docker.io/vespaengine/vespa-dev-almalinux-8:latest
If you want to attach a remote debugger (e.g. IntelliJ) to a process inside the container, you need to add port forwarding at this stage; it cannot be added after the container has been created. To allow debugging on port 5005, insert the following line among the option lines of the docker create command in the appropriate section below:
-p 127.0.0.1:5005:5005 \
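As an illustration of where the extra line goes, the sketch below prints the volume-based docker create command from this guide with the debug port included. It does not run Docker; print_create_cmd is a hypothetical helper name used only for this example.

```shell
# Hypothetical helper: prints (does not run) the docker create command.
# Pass a port number as the first argument to include debugger port forwarding.
print_create_cmd() {
  debug_port="${1:-}"
  printf 'docker create \\\n'
  printf '  -p 127.0.0.1:3334:22 \\\n'
  if [ -n "$debug_port" ]; then
    # The extra forwarding line sits among the other options.
    printf '  -p 127.0.0.1:%s:%s \\\n' "$debug_port" "$debug_port"
  fi
  printf '  -v volume-vespa-dev-almalinux-8:/home/%s \\\n' "$(id -un)"
  printf '  --privileged \\\n'
  printf '  --pids-limit -1 \\\n'
  printf '  --name vespa-dev-almalinux-8 \\\n'
  printf '  docker.io/vespaengine/vespa-dev-almalinux-8:latest\n'
}

print_create_cmd 5005
```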
First, create a long-lived Docker volume. This lets us persist data generated and used by the Docker container. Skip this step if the volume already exists.
docker volume create volume-vespa-dev-almalinux-8
Second, create the container by mounting the volume as the home directory inside the container:
docker create \
-p 127.0.0.1:3334:22 \
-v volume-vespa-dev-almalinux-8:/home/$(id -un) \
--privileged \
--pids-limit -1 \
--name vespa-dev-almalinux-8 \
docker.io/vespaengine/vespa-dev-almalinux-8:latest
A directory on the host machine can be mounted into the container using the -v option. This lets us persist data generated and used by the Docker container. When running Docker on a Linux host there is essentially no overhead in doing so. First, create a volume directory on the host:
mkdir -p $HOME/volumes/vespa-dev-almalinux-8
Second, run docker create with the -v option to mount the volume directory as the home directory in the container:
docker create \
-p 127.0.0.1:3334:22 \
-v $HOME/volumes/vespa-dev-almalinux-8:/home/$(id -un) \
--privileged \
--pids-limit -1 \
--name vespa-dev-almalinux-8 \
docker.io/vespaengine/vespa-dev-almalinux-8:latest
docker start vespa-dev-almalinux-8
Ensure you have an SSH key before running the configure-container.sh script. If not, use the following guide to generate a new SSH key.
mkdir -p $HOME/git
cd $HOME/git
git clone git@github.com:vespa-engine/docker-image-dev.git
cd $HOME/git/docker-image-dev/dev/almalinux-8
If using Docker:
./configure-container.sh docker vespa-dev-almalinux-8
Or, if using Podman:
./configure-container.sh podman vespa-dev-almalinux-8
This adds you as a user in the container, copies your authorized keys so that ssh can be used, and sets the environment variables needed for building Vespa.
cd $HOME/git/docker-image-dev/dev/almalinux-8
docker build -t vespaengine/vespa-dev-almalinux-8:latest .
Use this to test changes to the Docker image.
ssh -A 127.0.0.1 -p 3334
If the ssh command fails, see SSH troubleshooting.
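To avoid retyping the address and port, the same connection settings could be kept in an SSH client configuration entry. The sketch below only prints such an entry. The host alias vespa-dev and the helper name ssh_config_entry are made up for this example; ForwardAgent yes corresponds to the -A flag.

```shell
# Hypothetical helper: prints an SSH client config entry for the container.
# The alias "vespa-dev" is an assumption; pick any name you like.
ssh_config_entry() {
  cat <<'EOF'
Host vespa-dev
    HostName 127.0.0.1
    Port 3334
    ForwardAgent yes
EOF
}

ssh_config_entry
```

Appending the output to $HOME/.ssh/config would then allow connecting with ssh vespa-dev.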
mkdir -p $HOME/git
cd $HOME/git
git clone git@github.com:vespa-engine/vespa.git
cd $HOME/git/vespa
If you are persisting data from a previous container, clean out old state to ensure that the latest version of build tools will be used:
git clean -fdx
ccache --clear
./bootstrap.sh java
mvn clean install --threads 1C -Dmaven.javadoc.skip=true -Dmaven.source.skip=true -DskipTests
cd $HOME/git/vespa
cmake3 .
make -j 9
Set the number of compilation threads (-j argument) to the number of CPU cores + 1.
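Rather than hard-coding the value, the thread count can be derived from the machine. This is a small sketch that prints the resulting make command; it assumes nproc is available and falls back to getconf otherwise.

```shell
# Number of CPU cores visible to this shell.
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# Rule of thumb from this guide: cores + 1 compilation threads.
jobs=$((cores + 1))

# Print the command instead of starting a build.
echo "make -j ${jobs}"
```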
You can use the compiler flags -march= and -mtune= to specify the CPU generation to build for. For details and options, consult the GCC manual.
The command below sets up a build that uses the instruction set available on the Intel Haswell CPU generation and tunes code generation for the newer Intel Skylake CPU generation, while still using only instructions available on Haswell.
cmake3 -DVESPA_CPU_ARCH_FLAGS="-march=haswell -mtune=skylake" .
make install/fast
Default install directory is $HOME/vespa ($VESPA_HOME).
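When a script needs the install location, it can fall back to the default if VESPA_HOME is not set. A minimal sketch, where install_dir is just an illustrative variable name:

```shell
# Use VESPA_HOME when set, otherwise the default install directory.
install_dir="${VESPA_HOME:-$HOME/vespa}"
echo "Vespa install directory: ${install_dir}"
```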
mvn test --threads 1C
mvn test --threads 1C -pl container-search
ctest -j 9
ctest -j 9 -R "^searchlib_"
cd $HOME/git
git clone git@github.com:vespa-engine/system-test.git
Note that the system test scripts are already in your PATH inside the Docker container.
Some system tests depend on feature flag overrides.
cp $HOME/git/system-test/docker/include/feature-flags.json $HOME/vespa/var/vespa/flag.db
nodeserver.sh
cd $HOME/git/system-test/tests/search/basicsearch
runtest.sh basic_search.rb
Vespa natively supports building and running C++ code instrumented using sanitizers.
Pass the VESPA_USE_SANITIZER=sanitizer variable to CMake, where sanitizer must be one of the following:
- address: instrument using AddressSanitizer
- thread: instrument using ThreadSanitizer
- undefined: instrument using UndefinedBehaviorSanitizer
- address,undefined: instrument using both AddressSanitizer and UndefinedBehaviorSanitizer. This is the only supported option for using multiple sanitizers at the same time.
Example for generating build-files that instrument Vespa using ThreadSanitizer:
cmake3 -DVESPA_USE_SANITIZER=thread .
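A wrapper script could guard against typos in the sanitizer value before invoking CMake. The sketch below checks a value against the supported set listed above; sanitizer_arg is a hypothetical helper, not something provided by Vespa.

```shell
# Hypothetical helper: prints the CMake argument for a supported sanitizer
# value and returns non-zero for anything else.
sanitizer_arg() {
  case "$1" in
    address|thread|undefined|address,undefined)
      echo "-DVESPA_USE_SANITIZER=$1"
      ;;
    *)
      echo "unsupported sanitizer: $1" >&2
      return 1
      ;;
  esac
}

sanitizer_arg thread
```

The printed argument could then be passed on, for example as cmake3 "$(sanitizer_arg thread)" .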
Note that vespamalloc is not built when sanitizers are configured, as both vespamalloc and sanitizers will attempt to intercept/override default libc malloc API calls.
Unit tests can be run as usual, both directly from the terminal and from within CLion.
If a test is flaky (especially if it involves a rare race condition), it's often useful to be able to run one particular test in a loop until it fails. Both GTest and the sanitizers can be easily configured using environment variables.
Example environment variables for running a single test case 100 times, immediately aborting if either the test fails or ThreadSanitizer detects a problem (here presented in CLion run configuration format):
GTEST_FILTER=MyFlakyTestSuite.my_flaky_test_case;GTEST_REPEAT=100;TSAN_OPTIONS=halt_on_error=1;GTEST_FAIL_FAST=1
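Outside CLion, the same settings can be expressed as exported shell variables. This is a sketch; the test binary path in the final comment is a placeholder, not a real Vespa target.

```shell
# Same settings as the CLion example, in shell syntax.
export GTEST_FILTER='MyFlakyTestSuite.my_flaky_test_case'
export GTEST_REPEAT=100
export GTEST_FAIL_FAST=1
export TSAN_OPTIONS='halt_on_error=1'

# Then run the test binary from the same shell (placeholder path):
# ./path/to/my_test_app
```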
When setting your own TSAN_OPTIONS environment variable, you may have to manually add the suppressions option and point it to the tsan-suppressions.txt file found in the Vespa source code root directory to avoid getting reports for already known false positives. This option is automatically set when running unit tests via CTest.
Note that you cannot run an instrumented unit test under Valgrind.
As with unit tests, system tests can be run as usual with no extra setup needed. However, since system tests run with many instrumented processes simultaneously, it's useful to configure sanitizers to emit per-process error logs and to suppress known, benign warnings.
Processes are launched in the context of the system test node server, so export any environment variables prior to launching it.
Example (substitute paths with your own):
export TSAN_OPTIONS="suppressions=/home/myuser/git/vespa/tsan-suppressions.txt log_path=/home/myuser/tsan_logs/log history_size=7 detect_deadlocks=1 second_deadlock_stack=1"
nodeserver.sh
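The same options can be assembled from variables so that paths are not hard-coded. The sketch below assumes the source checkout and log locations used elsewhere in this guide; it also creates the log directory up front, which is harmless if the sanitizer runtime would have created it anyway.

```shell
# Paths are assumptions; adjust to your own checkout and log location.
vespa_src="${HOME}/git/vespa"
tsan_log_dir="${HOME}/tsan_logs"

# Make sure the per-process logs have a directory to land in.
mkdir -p "${tsan_log_dir}"

# Build the option string piece by piece for readability.
TSAN_OPTIONS="suppressions=${vespa_src}/tsan-suppressions.txt"
TSAN_OPTIONS="${TSAN_OPTIONS} log_path=${tsan_log_dir}/log"
TSAN_OPTIONS="${TSAN_OPTIONS} history_size=7 detect_deadlocks=1 second_deadlock_stack=1"
export TSAN_OPTIONS

echo "${TSAN_OPTIONS}"
```

Run this in the same shell before starting nodeserver.sh so the launched processes inherit the variable.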
If processes emit fatal sanitizer warnings on startup, e.g.:
==51385==FATAL: ThreadSanitizer: failed to intercept munmap
then this is usually a sign that there are traces of a previous (non-instrumented) vespamalloc
build in your Vespa install tree. Vespa startup scripts will implicitly pick up and load
vespamalloc if it's present, regardless of instrumentation status. The easiest way to get
around this is to wipe the install tree and re-run make install.
XQuartz is a version of the X.Org X Window System for macOS. Download here.
Set AddressFamily inet inside /etc/ssh/sshd_config and restart sshd:
sudo kill -HUP <sshd-pid>
Open an XQuartz terminal and run:
ssh -Y -A 127.0.0.1 -p 3334
Then start CLion or IntelliJ from this terminal.
If the ssh command fails, e.g. with the following message:
ssh kex_exchange_identification: Connection closed by remote host
then, execute an interactive shell on the container:
docker exec -it vespa-dev-almalinux-8 /bin/bash
Inside the shell, check if there are any host keys:
ls -l /etc/ssh
If the folder does not contain any ssh_host_* files, use this command to generate host keys:
sudo ssh-keygen -A
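The check and the fix can be combined into one step. The sketch below only reports what it finds and leaves running ssh-keygen to you; has_host_keys is a hypothetical helper that takes an optional directory argument, defaulting to /etc/ssh.

```shell
# Hypothetical helper: succeeds if the given directory (default /etc/ssh)
# contains at least one ssh_host_* file.
has_host_keys() {
  dir="${1:-/etc/ssh}"
  for f in "$dir"/ssh_host_*; do
    # An unmatched glob stays literal, so test that the file really exists.
    [ -e "$f" ] && return 0
  done
  return 1
}

if has_host_keys; then
  echo "host keys present"
else
  echo "no host keys found; run: sudo ssh-keygen -A"
fi
```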
Then, start the ssh daemon:
$(which sshd)
If you need to debug further, add the flags -Ddp to the above command. In another terminal, try to ssh into the container again with the appropriate level of verbosity, e.g.:
ssh -vvv -A 127.0.0.1 -p 3334