Stable branch: main
The goal of DHSCdatatools is to provide a suite of tools for using data hosted on the DHSC analytical cloud (DAC) platform. For detailed developer documentation click here.
- Local installation of Simba Spark ODBC Driver 32-bit and Simba Spark ODBC Driver 64-bit.
- A new conda environment is recommended for the package. In Git Bash:
conda create -n <your_environment_name> python==3.12 pip
Some dependencies of this package are not yet compatible with Python 3.13. Use any Python version from 3.8 up to, but not including, 3.13; the command above specifies python==3.12 as an example.
- Though not strictly a package dependency, we recommend installing python-dotenv to work with .env files.
In Git Bash, with the relevant environment activated:
pip install python-dotenv
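The supported Python range described above (3.8 up to, but not including, 3.13) can be checked from inside the activated environment. This is a minimal sketch using only the standard library; the function name is_supported_python is illustrative and is not part of dhsc_data_tools.

```python
import sys

MIN_VERSION = (3, 8)   # lowest supported version, inclusive
MAX_VERSION = (3, 13)  # first unsupported version, exclusive


def is_supported_python(version=None):
    """Return True if `version` (major, minor, ...) is within the supported range.

    When no version is given, the running interpreter's version is used.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION
```

Calling is_supported_python() with no arguments tests the interpreter currently running, so it can be used as a quick check before installing the package.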
In Git Bash, with the relevant environment activated, to install dhsc_data_tools:
pip install git+https://github.com/DataS-DHSC/dhsc-data-tools.git
A .env file containing the tenant name and key vault name is required for dhsc_data_tools.dac_odbc.connect() and dhsc_data_tools.keyvault.KVConnection().
Please find the .env file in the Data Science Teams space DAC channel.
Place this file in your working directory.
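Once the .env file has been loaded, it can help to confirm that the expected settings are present before trying to connect. The sketch below uses only the standard library. The variable names TENANT_NAME and KEY_VAULT_NAME are hypothetical placeholders; the real keys are whatever the shared .env file defines.

```python
import os

# Hypothetical names for illustration only: replace these with the keys
# that actually appear in the shared .env file.
REQUIRED_VARS = ("TENANT_NAME", "KEY_VAULT_NAME")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names in `required` that are unset or empty in `environ`.

    `environ` defaults to the current process environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]
```

If load_dotenv(".env") has already been called, an empty list from missing_env_vars() suggests the file was found and read from the working directory.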
IMPORTANT
Ensure that in each project your .gitignore file excludes config, .env, and relevant YAML files.
If you do accidentally commit these files (or any other sensitive data) please get in touch with the Data Science Hub to discuss how best to mitigate the breach.
from dhsc_data_tools import dac_odbc
from dotenv import load_dotenv
load_dotenv(".env")
# Create the connection
conn = dac_odbc.connect()
# Run a SQL query by using the preceding connection.
cursor = conn.cursor()
cursor.execute("SELECT * FROM samples.nyctaxi.trips LIMIT 10")
# Print the rows retrieved from the query.
for row in cursor.fetchall():
    print(row)
# For help, you can run
help(dac_odbc.connect)  # or any other function or module
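Query results are often easier to work with as dictionaries keyed by column name. The sketch below assumes the cursor follows the Python DB-API 2.0 convention of exposing column metadata through cursor.description, which is consistent with the cursor(), execute() and fetchall() calls used above but is not confirmed for dac_odbc here. It uses sqlite3 as a stand-in so it runs without DAC access; the helper name rows_to_dicts and the demo column names are illustrative.

```python
import sqlite3  # stand-in DB-API driver so the sketch runs without DAC access


def rows_to_dicts(cursor):
    """Convert the remaining rows of an executed cursor to a list of dicts."""
    # The first element of each description entry is the column name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Demonstration against an in-memory SQLite database with made-up columns.
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("SELECT 1 AS trip_id, 2.5 AS fare")
records = rows_to_dicts(demo_cursor)
demo_conn.close()
```

With a DAC connection, the same helper would be called on the cursor after cursor.execute(...), provided the assumption about cursor.description holds.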
Please note that the DHSCdatatools project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
Unless stated otherwise, the codebase is released under the MIT License. This covers both the codebase and any sample code in the documentation.
All other content is © Crown copyright and available under the terms of the Open Government Licence v3.0, except where otherwise stated.