SeqPandas

Import genomic data into a custom class that combines pandas and Biopython objects, with shortcuts that make machine learning preprocessing easier.

Free software: MIT license
Documentation: https://seqpandas.readthedocs.io.

Installation

pip install seqpandas

Usage

import seqpandas as spd

# Direct File Path
df = spd.read_seq('file.fasta', format='fasta')
df = spd.read_seq('file.sam', format='sam')
df = spd.read_vcf('file.vcf', format='vcf')
df = spd.read_bed('file.bed', format='bed')

# Just need BioPython Seqs? No problem!
seqrecords = spd.read('file.fasta', format='fasta')

# Already have an iterator of Biopython SeqRecords (e.g. from SeqIO.parse)
from Bio import SeqIO
seqrecords = SeqIO.parse('file.fasta', format='fasta')
df = spd.BioDataFrame.from_seqrecords(seqrecords)
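
To make the idea of turning sequence records into a table concrete, here is a minimal standard-library sketch. It is not SeqPandas code and does not reflect how `read_seq` or `BioDataFrame.from_seqrecords` are implemented; the helper names `fasta_to_rows` and `_make_row`, and the column names used, are hypothetical and chosen only for illustration.

```python
def _make_row(header, chunks):
    """Build one table row (a dict) from a FASTA header and its sequence lines."""
    sequence = "".join(chunks)
    # By FASTA convention, the first whitespace-delimited token is the record id.
    record_id = header.split(None, 1)[0] if header else ""
    return {
        "id": record_id,
        "description": header,
        "sequence": sequence,
        "length": len(sequence),
    }


def fasta_to_rows(lines):
    """Convert an iterable of FASTA text lines into a list of row dicts."""
    rows = []
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                rows.append(_make_row(header, chunks))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        rows.append(_make_row(header, chunks))
    return rows


example = """>seq1 first record
ACGT
ACG
>seq2
TTGA
"""
rows = fasta_to_rows(example.splitlines())
```

Each dict in `rows` corresponds to one record, which is the shape a tabular structure such as a DataFrame would be built from.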

Tutorial

For a complete walkthrough, including how to use SeqPandas in a machine learning pipeline, follow the tutorial notebook.
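
As a rough illustration of the kind of preprocessing such a pipeline might involve, the sketch below one-hot encodes a nucleotide sequence using only the standard library. It is a generic technique, not taken from the tutorial notebook or from the SeqPandas API; the function name `one_hot` and the choice to encode unknown bases as all zeros are assumptions made for this example.

```python
ALPHABET = "ACGT"  # assumed nucleotide alphabet for this sketch


def one_hot(sequence, alphabet=ALPHABET):
    """Return one vector per base, with a 1 in the position of that base.

    Bases outside the alphabet (for example 'N') become all-zero vectors.
    """
    index = {base: i for i, base in enumerate(alphabet)}
    encoded = []
    for base in sequence.upper():
        vector = [0] * len(alphabet)
        if base in index:
            vector[index[base]] = 1
        encoded.append(vector)
    return encoded


encoded = one_hot("acgN")
```

The result is a list of fixed-width vectors, one per position, which can be stacked into a numeric array for a model.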

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
