ZWAL is a new WAL design for ZNS SSDs that uses zone appends instead of writes. It achieves significantly higher write throughput and concurrency than write-based WALs, without resorting to excessive I/O buffering on the host. The implementation is built on top of ZenFS.
This code is part of the paper "ZWAL: Rethinking Write-ahead Logs for ZNS SSDs", published at the 4th Workshop on Challenges and Opportunities of Efficient and Performant Storage Systems.
ZWALs have the same requirements as ZenFS, but additionally require the SimpleZNSDevice (SZD) library and a Linux kernel with support for io_uring with NVMe passthrough (> 6.0).
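As a quick sanity check before building, the running kernel can be compared against the 6.0 threshold above. This is a minimal sketch: it only compares the major version and does not probe for the NVMe passthrough feature itself.

```shell
#!/bin/sh
# Sketch: check that the running kernel is at least major version 6,
# approximating the "> 6.0" requirement stated above.
kernel_ok() {
    # $1 is a kernel release string such as "6.1.0-13-amd64"
    major=${1%%.*}
    [ "$major" -ge 6 ]
}

if kernel_ok "$(uname -r)"; then
    echo "kernel version OK for io_uring NVMe passthrough"
else
    echo "kernel $(uname -r) is too old (need > 6.0)" >&2
fi
```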
First clone the source base with all submodules included:
git clone https://github.com/stonet-research/zwal.git --recursive
Install libzbd (tested on 0ab157e):
pushd libzbd
sh ./autogen.sh
./configure
make
sudo make install
popd
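Optionally, verify that libzbd is now visible to the dynamic linker. This is a hedged sanity check; the library name libzbd is assumed from the project's defaults.

```shell
# After `sudo make install`, refresh the linker cache and look for libzbd.
# Prints a warning instead of failing hard, since this is only a sanity check.
sudo ldconfig
if ldconfig -p | grep -q libzbd; then
    echo "libzbd found"
else
    echo "warning: libzbd not in linker cache" >&2
fi
```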
Install SZD (use the io_uring branch):
pushd SimpleZNSDevice
# Setup SPDK and DPDK (we will not use them, but SZD needs them for dependencies)
cd spdk
./configure
make -j
sudo make install
cd ..
mkdir -p build && cd build
rm -f CMakeCache.txt
cmake ..
make
sudo make install
ZWALs come with a number of configuration options that are defined as #define directives. These must be set before compilation (see build.sh for examples). Apart from this, the build is no different from ZenFS. ZWALs do require a specific change in RocksDB, hence we ship RocksDB along with ZWALs (see rocksdb-raw).
rm -r rocksdb-raw/plugin/zenfs
cp -r zenfs-appends rocksdb-raw/plugin/zenfs
# Set WAL buffer size (in KB)
BUFFSIZE=4
sed -i "s/#define SPARSE_BUFFER_SIZE_IN_KB.*/#define SPARSE_BUFFER_SIZE_IN_KB ${BUFFSIZE}UL/g" rocksdb-raw/plugin/zenfs/fs/io_zenfs.h
# Set WAL max depth
MAXWALDEPTH=32
sed -i "s/NAMELESS_WAL_DEPTH.*/NAMELESS_WAL_DEPTH ${MAXWALDEPTH}/g" rocksdb-raw/plugin/zenfs/fs/zbd_zenfs.h
# Set WAL barrier size (in KB)
WALBARRIERSIZE=16384
sed -i "s/#define WAL_BARRIER_SIZE_IN_KB.*/#define WAL_BARRIER_SIZE_IN_KB ${WALBARRIERSIZE}UL/g" rocksdb-raw/plugin/zenfs/fs/io_zenfs.h
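The sed substitutions above rewrite whole #define lines in place. A self-contained demonstration of the same pattern on a throwaway file (the temporary file stands in for io_zenfs.h):

```shell
# Demonstrate the in-place #define rewrite used above, on a temp file.
hdr=$(mktemp)
echo "#define SPARSE_BUFFER_SIZE_IN_KB 16UL" > "$hdr"
BUFFSIZE=4
sed -i "s/#define SPARSE_BUFFER_SIZE_IN_KB.*/#define SPARSE_BUFFER_SIZE_IN_KB ${BUFFSIZE}UL/g" "$hdr"
cat "$hdr"   # -> #define SPARSE_BUFFER_SIZE_IN_KB 4UL
rm -f "$hdr"
```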
cd rocksdb-raw
DEBUG_LEVEL=0 ROCKSDB_PLUGINS=zenfs make -j48 db_bench
sudo DEBUG_LEVEL=0 ROCKSDB_PLUGINS=zenfs make install
cd plugin/zenfs/util
make
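After the build, the two artifacts used in the next step should exist. A hedged check, run from the repository root; the paths assume the layout described in this README.

```shell
# Check that the build produced db_bench and the zenfs formatting utility.
for bin in rocksdb-raw/db_bench rocksdb-raw/plugin/zenfs/util/zenfs; do
    if [ -x "$bin" ]; then
        echo "ok: $bin"
    else
        echo "missing: $bin (build may have failed)" >&2
    fi
done
```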
The formatting procedure is the same as for ZenFS. However, we have only evaluated ZWALs extensively under the default configuration:
echo deadline | sudo tee /sys/class/block/<zoned block device>/queue/scheduler
rocksdb-raw/plugin/zenfs/util/zenfs mkfs --zbd=<zoned block device> --aux_path=<path to store LOG and LOCK files>
We provide no guarantees for other ZenFS functionalities.
To reproduce the results of our paper, follow the instructions in AE.md.
- AE.md: Artifact Evaluation. Contains a description of how to reproduce all results from the paper.
- zenfs-appends: ZenFS with ZWALs implemented.
- zenfs-default: Standard ZenFS (included for easy experimentation). We added support for variable buffer sizes (change SPARSE_BUFFER_SIZE_IN_KB in fs/io_zenfs.h).
- rocksdb-raw: RocksDB modified to delete WALs instantly instead of archiving them.
- rocksdb-ycsb: The same as rocksdb-raw, but always forces RocksDB to use ZenFS and forces some options (WAL_size). Necessary for use with YCSB.
- ycsb: The YCSB benchmark, modified to support RocksDB with ZenFS.
- data: Raw data from all our experiments.
- fio-with-appends: fio modified to support appends for io_uring with NVMe passthrough.