This repository has been archived by the owner on Oct 12, 2023. It is now read-only.

Python interface to libsequence via pybind11

License: GPL-3.0 (see LICENSE.txt and COPYING)

molpopgen/pylibseq


pylibseq: Python bindings for libsequence

This package provides Python bindings for the C++11 library libsequence.

The bindings are implemented using pybind11.

This package serves two roles:

  • It provides a means of using some of the more widely used parts of libsequence from Python.
  • The unit tests of this package also serve as unit tests for libsequence.

What this package does not (currently) do:

  • provide an interface for I/O operations. Python I/O and C++ I/O are fundamentally very different. Bridging the gap would require adding features to pybind11, adding modules to this package that depend on Boost.Python, or both; the latter would add another C++ dependency to this package.

Build status

Master branch:

CircleCI build status (master branch): https://circleci.com/gh/molpopgen/pylibseq/tree/master.svg?style=svg

Development branch:

CircleCI build status (dev branch): https://circleci.com/gh/molpopgen/pylibseq/tree/dev.svg?style=svg

Citation

If you use this software for your research, please cite:

@ARTICLE{Thornton2003-wj,
  title    = "Libsequence: a C++ class library for evolutionary genetic
              analysis",
  author   = "Thornton, Kevin",
  abstract = "UNLABELLED: A C++ class library is available to facilitate the
              implementation of software for genomics and sequence polymorphism
              analysis. The library implements methods for data manipulation
              and the calculation of several statistics commonly used to
              analyze SNP data. The object-oriented design of the library is
              intended to be extensible, allowing users to design custom
              classes for their own needs. In addition, routines are provided
              to process samples generated by a widely used coalescent
              simulation. AVAILABILITY: The source code (in C++) is available
              from http://www.molpopgen.org",
  journal  = "Bioinformatics",
  volume   =  19,
  number   =  17,
  pages    = "2325--2327",
  month    =  nov,
  year     =  2003,
  url      = "https://www.ncbi.nlm.nih.gov/pubmed/14630667",
  language = "en",
  issn     = "1367-4803",
  pmid     = "14630667",
  doi      = "10.1093/bioinformatics/btg316"
}

Requirements:

  • Python 3
  • cmake
  • An up-to-date C++ compiler that is C++11 compatible via the flag -std=c++11. Roughly, this means GCC >= 4.8 and clang >= 3.5.
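The requirements listed above can be checked before attempting a build. The following is a minimal sketch and is not part of pylibseq: the helper name `check_build_requirements` is hypothetical, and it only verifies that the interpreter is Python 3 and that a `cmake` executable is on the `PATH`. It does not check the C++ compiler or its version.

```python
import shutil
import sys


def check_build_requirements(tools=("cmake",)):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration only; not part of pylibseq.
    """
    problems = []
    if sys.version_info[0] < 3:
        problems.append("Python 3 is required")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append("{} was not found on PATH".format(tool))
    return problems


if __name__ == "__main__":
    for problem in check_build_requirements():
        print("missing requirement:", problem)
```

Running the script prints one line per missing requirement and nothing if both checks pass.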

Note

As of version 0.2.2, libsequence is included as a git submodule compiled directly into the Python package.

If you are installing from GitHub, then pybind11 is a dependency. Further, pybind11 must not be installed from a source like PyPI. Rather, it must be installed either from source or via your favorite package manager. The reason is that we use its cmake macros during the build process.

Changelog (rough)

  • 0.2.0: The package has been completely refactored. We now use pybind11 to integrate C++ and Python. Previous versions of this project used Cython. The API now corresponds to libsequence 1.9.2. Python >= 3.4 is required.
  • 0.1.9: Made memory management more robust: more unique_ptr instead of raw pointers. Clean up __dealloc__ functions in extension types. Package now sets __version__. Class names are now "Pythonic" (and identical to the corresponding type names from libsequence) due to aliasing the C++ names from libsequence. Change from distutils to setuptools. Documentation fixes. Expose the haplotype diversity and number of haplotypes statistics. First (very alpha) release of pymsstats.
  • 0.1.8: made sure C++ objects/functions are declared "nogil". Raw pointers replaced with C++'s unique_ptr.
  • 0.1.7: improvements to build system. Add option to build from GitHub.
  • 0.1.6: update to libsequence 1.8.9. Add --use-cython option to setup.py

Installation:

For many users, the best way to install the latest release will be via bioconda:

$ conda install -c bioconda pylibseq

The latest release of the package is available via PyPI, and can be installed with your favorite Python package manager:

$ pip install --upgrade pylibseq

Or, you may install from GitHub:

Note

The GitHub version does not contain the .cpp files generated by pybind11. You need to generate those!

$ git clone http://github.com/molpopgen/pylibseq
$ cd pylibseq
$ git submodule init
$ git submodule update
$ ./configure
$ sudo pip install .

You may also install from GitHub using pip:

$ pip install git+https://github.com/molpopgen/pylibseq

Unit testing:

$ ./configure
$ python setup.py build_ext -i
$ python -m unittest discover tests
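The `unittest discover` command above collects test files from the `tests` directory. As a rough illustration of the kind of file it would pick up, here is a minimal smoke test. It is a sketch and not one of the package's own tests: the file name (for example `tests/test_smoke.py`), the class name, and the helper `module_available` are hypothetical, and the import name `libsequence` is an assumption about how the built package is exposed. The test is skipped, rather than failed, when the module cannot be found.

```python
import importlib
import importlib.util
import unittest

# Assumption: the built package is importable under this name.
MODULE_NAME = "libsequence"


def module_available(name):
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None


class TestInstall(unittest.TestCase):
    @unittest.skipUnless(module_available(MODULE_NAME), "package is not installed")
    def test_import(self):
        # Importing the package loads the compiled extension modules,
        # so a broken build is expected to surface here as an ImportError.
        module = importlib.import_module(MODULE_NAME)
        self.assertIsNotNone(module)


if __name__ == "__main__":
    unittest.main()
```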

Documentation: