Arrow Flight SQL - ADBC vs. JDBC

This repo benchmarks the ADBC and JDBC drivers that a client can use to connect to a running Flight SQL database server.

Setup

Clone the repo

git clone https://github.com/voltrondata/flight-sql-adbc-vs-jdbc
cd flight-sql-adbc-vs-jdbc

Create a new Python 3.9+ virtual environment:

# Create the virtual environment
python3 -m venv ./venv
# Activate the virtual environment
. ./venv/bin/activate
# Update pip
pip install --upgrade pip
# Install requirements
pip install -r ./requirements.txt

Create a local TPC-H Scale Factor 1 (1 GB) DuckDB database (it will be created in your local data directory):

python create_local_duckdb_database.py

Run a Flight SQL server on the TPC-H Scale Factor 1 (1 GB) database with Docker

pushd data
# Run the flight-sql Docker image and bind-mount the host's DuckDB database file (created above) into the container
docker run --name flight-sql \
           --detach \
           --rm \
           --tty \
           --init \
           --publish 31337:31337 \
           --env FLIGHT_PASSWORD="flight_password" \
           --pull missing \
           --mount type=bind,source="$(pwd)",target=/opt/flight_sql/data \
           --env DATABASE_FILE_NAME="tpch_sf1.duckdb" \
           voltrondata/flight-sql:latest

popd

For more details, see this repo for instructions on how to run Flight SQL in Docker.
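
Because the container is started with --detach, the docker run command returns before the server is necessarily ready. A minimal sketch of a readiness check, assuming the server listens on the published port 31337 on localhost; the helper name is_port_open is hypothetical and not part of this repo:

```python
import socket


def is_port_open(host: str = "localhost", port: int = 31337, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Flight SQL port reachable:", is_port_open())
```

This only confirms that something accepts TCP connections on the port. It does not verify credentials or that the database file was mounted correctly.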

Create a .env file in the repo root directory that sets the FLIGHT_PASSWORD environment variable (use the same password you started the Flight SQL server with):

echo "FLIGHT_PASSWORD=flight_password" > ./.env
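
How the benchmark scripts read this file is not shown here; they may well use a library such as python-dotenv. As a rough, standard-library-only sketch of what reading a KEY=VALUE file involves (the helper name load_env_file is hypothetical):

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments.

    Returns an empty dict if the file does not exist.
    """
    env_path = Path(path)
    if not env_path.exists():
        return {}
    values = {}
    for raw in env_path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


if __name__ == "__main__":
    env = load_env_file()
    # An already-exported variable takes precedence over the file
    os.environ.setdefault("FLIGHT_PASSWORD", env.get("FLIGHT_PASSWORD", ""))
    print("FLIGHT_PASSWORD set:", bool(os.environ["FLIGHT_PASSWORD"]))
```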

Option 1 - Run all benchmarks (and generate a graph in the graph_output folder)

python run_benchmarks.py
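
The internals of run_benchmarks.py are not reproduced here. As an illustration of the general idea behind such a comparison, here is a minimal timing sketch; the function names time_calls and summarize are hypothetical and the workload in the example is a stand-in for a real query:

```python
import statistics
import time
from typing import Callable, Dict, List


def time_calls(fn: Callable[[], object], repeats: int = 5) -> List[float]:
    """Call fn `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Reduce a list of durations to min / median / max."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    # Stand-in workload; a real benchmark would run a query through a driver here
    samples = time_calls(lambda: sum(range(1_000_000)))
    print(summarize(samples))
```

Repeating each measurement and reporting the median is one common way to reduce the effect of one-off delays such as connection setup or cold caches.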

Option 2 - Run individual benchmarks

Run ADBC example

python benchmark_adbc.py
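
For orientation, a minimal sketch of what an ADBC connection to the server might look like, using the adbc_driver_flightsql package's DBAPI interface. This is not the content of benchmark_adbc.py. The username flight_username, the grpc+tls URI scheme, and the need to skip TLS verification are assumptions about the container's defaults; adjust them to match your server.

```python
import os

# String value of adbc_driver_flightsql.DatabaseOptions.TLS_SKIP_VERIFY
TLS_SKIP_VERIFY_OPTION = "adbc.flight.sql.client_option.tls_skip_verify"


def flight_sql_db_kwargs(username: str, password: str, skip_tls_verify: bool = True) -> dict:
    """Build the db_kwargs dict passed to the ADBC Flight SQL driver."""
    kwargs = {"username": username, "password": password}
    if skip_tls_verify:
        # Assumption: the server presents a self-signed certificate
        kwargs[TLS_SKIP_VERIFY_OPTION] = "true"
    return kwargs


def run_query(sql: str, uri: str = "grpc+tls://localhost:31337"):
    """Run one query over ADBC and return the result as a pyarrow Table."""
    # Imported lazily so the helper above works without the driver installed
    from adbc_driver_flightsql import dbapi

    db_kwargs = flight_sql_db_kwargs(
        username="flight_username",  # assumed default username
        password=os.environ["FLIGHT_PASSWORD"],
    )
    with dbapi.connect(uri, db_kwargs=db_kwargs) as conn:
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetch_arrow_table()
```

With the server running and FLIGHT_PASSWORD exported, a call such as run_query("SELECT COUNT(*) FROM lineitem") would fetch a single-row Arrow table from the TPC-H data.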

Run JDBC - Py4J example

python benchmark_jdbc_py4j.py

Run JDBC - PyArrow example

python benchmark_jdbc_super_jar.py

Run DuckDB local Database example

python benchmark_duckdb.py