Skip to content

atgenomix/seqslab-connector

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

31 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

seqslab-connector

The SeqsLab Connector for Python based on pyhive allows you to create a Python DB API connection to Atgenomix SeqsLab interactive jobs (clusters) and develop Python-based workflow applications. It is a Hive-Thrift-based client with no dependencies on ODBC or JDBC. It also provides a SQLAlchemy dialect and an Apache Superset database engine spec for use with tools to execute DQL.

You are welcome to file an issue for general use cases. You can also contact Atgenomix Support here.

Requirements

Python 3.7 or above is required.

Installation

Install using pip.

pip install seqslab-connector

For Apache Superset integration install with

pip install seqslab-connector[superset]

Usage

DB-API

from seqslab import hive

conn = hive.connect(database='run_name', http_path='job_run_id', username='user', password='pass', host='job_cluster_host')
cursor = conn.cursor()
cursor.execute('SHOW TABLES')
print(cursor.fetchall())
cursor.execute('SELECT * FROM my_workflow_table_name LIMIT 10')
print(cursor.fetchall())
cursor.close()

SQLAlchemy

from sqlalchemy.engine import create_engine

engine = create_engine('seqslab+hive://user:pass@job_cluster_host/run_name?http_path=job_run_id')

Apache Superset

Connecting to Databases

Documentation

For the latest documentation, see SeqsLab.