SeqPandas

Import genomic data into a custom object that combines pandas and Biopython classes, with shortcuts that make machine learning preprocessing easier.

Installation

pip install seqpandas

Usage

import seqpandas as spd

# Direct File Path
df = spd.read_seq('file.fasta', format='fasta')
df = spd.read_seq('file.sam', format='sam')
df = spd.read_vcf('file.vcf', format='vcf')
df = spd.read_bed('file.bed', format='bed')

# Just need Biopython SeqRecords? No problem!
seqrecords = spd.read('file.fasta', format='fasta')

# Already-parsed Biopython SeqRecords
from Bio import SeqIO
seqrecords = SeqIO.parse('file.fasta', format='fasta')
df = spd.BioDataFrame.from_seqrecords(seqrecords)
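
To illustrate the general idea of turning parsed records into a table, the sketch below builds a plain pandas DataFrame from Biopython SeqRecords using only Biopython and pandas. The helper name `seqrecords_to_frame`, the column names, and the in-memory FASTA text are assumptions made for this example. It is not how `BioDataFrame.from_seqrecords` is implemented, and the columns seqpandas produces may differ.

```python
from io import StringIO

import pandas as pd
from Bio import SeqIO

# Two made-up FASTA records so the example does not depend on a file on disk.
FASTA_TEXT = """>seq1 first example record
ACGTACGT
>seq2 second example record
GGCCTTAA
"""


def seqrecords_to_frame(records):
    """Build a plain DataFrame with one row per SeqRecord.

    Hypothetical helper for illustration only; not part of seqpandas.
    """
    rows = [
        {
            "id": record.id,
            "description": record.description,
            "seq": str(record.seq),
            "length": len(record.seq),
        }
        for record in records
    ]
    return pd.DataFrame(rows)


# SeqIO.parse accepts an open text handle as well as a file path.
records = SeqIO.parse(StringIO(FASTA_TEXT), "fasta")
df = seqrecords_to_frame(records)
print(df[["id", "length"]])
```

Because `SeqIO.parse` returns a one-shot iterator, the records are consumed once the DataFrame is built; parse the source again if the SeqRecord objects are needed afterwards.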

Tutorial

For a complete walkthrough, including how to use it in a machine learning pipeline, follow the tutorial notebook.

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.