`reproman/` is the main Python module where major development happens, with the major submodules being:

- `cmdline/` - helpers for accessing `interface/` functionality from the command line
- `interface/` - high-level interface functions which get exposed via the command line (`cmdline/`) or Python (`reproman.api`)
- `tests/` - some unit- and regression-tests (more can be found under `tests/` of the corresponding submodules); `utils.py` provides convenience helpers used by unit-tests, such as the `@with_tree` and `@serve_path_via_http` decorators
- `ui/` - user-level interactions, such as messages about errors, warnings, and progress reports, and, when supported by the available frontend, interactive dialogs
- `support/` - various support modules, e.g. for git/git-annex interfaces, constraints for the `interface/`, etc.

`docs/` - yet-to-be-heavily-populated documentation; `bash-completions` provides bash and zsh completion setup for reproman (just `source` it)

`tools/` contains helper utilities used during development, testing, and benchmarking of ReproMan, implemented in whichever language is most appropriate (Python, bash, etc.)
The preferred way to contribute to the ReproMan code base is to fork the main repository on GitHub. Here we outline the workflow used by the developers:
- Have a clone of our main project repository as the `origin` remote in your git:

      git clone https://github.com/ReproNim/reproman
- Fork the project repository: click on the 'Fork' button near the top of the page. This creates a copy of the code base under your account on the GitHub server.
- Add your forked clone as a remote to the local clone you already have on your local disk:

      git remote add gh-YourLogin git@github.com:YourLogin/reproman.git
      git fetch gh-YourLogin

  To ease the addition of other GitHub repositories as remotes, here is a little bash function/script to add to your `~/.bashrc`:

      ghremote () {
          url="$1"
          proj=${url##*/}
          url_=${url%/*}
          login=${url_##*/}
          git remote add gh-$login $url
          git fetch gh-$login
      }

  thus you could simply run:

      ghremote git@github.com:YourLogin/reproman.git

  to add the above `gh-YourLogin` remote. Additional handy aliases such as `ghpr` (to fetch an existing PR from someone's remote) and `ghsendpr` can be found in yarikoptic's bash config file.
Create a branch (generally off the
origin/master
) to hold your changes:git checkout -b nf-my-feature
and start making changes. Ideally, use a prefix signaling the purpose of the branch
nf-
for new featuresbf-
for bug fixesrf-
for refactoringdoc-
for documentation contributions (including in the code docstrings). We recommend to not work in themaster
branch!
- Work on this copy on your computer, using Git to do the version control. When you're done editing, do:

      git add modified_files
      git commit

  to record your changes in Git. Ideally, prefix your commit messages with `NF`, `BF`, `RF`, or `DOC`, similar to the branch name prefixes, but you could also use `TST` for commits concerned solely with tests, and `BK` to signal that the commit causes a breakage (e.g. of tests) at that point. Multiple entries can be joined with a `+` (e.g. `rf+doc-`). See `git log` for examples. If a commit closes an existing ReproMan issue, then add `(Closes #ISSUE_NUMBER)` to the end of the message.
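  For instance, a bug-fix commit following these conventions might look as follows (the subject line and issue number are made up for illustration; a scratch repository is created so the snippet is self-contained):

  ```shell
  # Illustrative only: demonstrate the commit-message convention in a scratch repo
  set -e
  repo=$(mktemp -d)
  cd "$repo"
  git init -q
  git -c user.name="Dev" -c user.email="dev@example.com" \
      commit -q --allow-empty -m "BF: handle empty resource list (Closes #123)"
  git log --format=%s -1   # shows the BF-prefixed subject line
  ```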
- Push to GitHub with:

      git push -u gh-YourLogin nf-my-feature

  Finally, go to the web page of your fork of the ReproMan repo, and click 'Pull request' (PR) to send your changes to the maintainers for review. This will send an email to the committers. You can commit new changes to this branch and keep pushing to your remote -- GitHub automagically adds them to your previously opened PR.
(If any of the above seems like magic to you, then look up the Git documentation on the web.)
See README.md:Dependencies for basic information about the installation of reproman itself. On Debian-based systems we recommend enabling NeuroDebian, since we use it to provide backports of recent fixed external modules we depend upon.
apt-get install -y -q eatmydata # to speed up subsequent installations
eatmydata apt-get install -y -q python3-{appdirs,argcomplete,humanize,mock,setuptools,six,yaml,debian,boto3,docker,tqdm,rdflib,dockerpty} libssl-dev libffi-dev
and additionally, for development, we suggest using tox and new versions of dependencies from PyPI:
eatmydata apt-get install -y -q python3-{pip,vcr,tox}
some of which you could also install from PyPI using pip (prior installation of those libraries listed above might be necessary):
pip install -e .[devel]
Note that you might need to get an updated pip if the above `pip install` command fails. You can achieve that by running
pip install --upgrade pip
In case you want a complete set of development tools, e.g. to build documentation or run tests requiring nibabel, etc., first install the necessary core dependencies using apt-get:
eatmydata apt-get install -y -q python3-{numpy,nibabel,sphinx,dev} ipython3
and then run
pip install -e '.[devel]'
to install any other additional pip-provided Python libraries.
We use the NumPy standard for the description of parameters in docstrings. If you are using PyCharm, set your project settings accordingly (`Tools` -> `Python integrated tools` -> `Docstring format`).
In addition, we follow the guidelines of Restructured Text with the additional features and treatments provided by Sphinx.
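As a minimal sketch of this convention (the function and its parameters are made up for illustration and are not part of the reproman API), a NumPy-style docstring with doctest examples might look like:

```python
def count_resources(resources, backend=None):
    """Count resources, optionally restricted to one backend.

    Illustrative helper only -- not part of the reproman API.

    Parameters
    ----------
    resources : list of dict
        Resource descriptions, each with at least a ``"backend"`` key.
    backend : str, optional
        If given, count only resources using this backend.

    Returns
    -------
    int
        Number of matching resources.

    Examples
    --------
    >>> count_resources([{"backend": "docker"}, {"backend": "ssh"}])
    2
    >>> count_resources([{"backend": "docker"}, {"backend": "ssh"}], "ssh")
    1
    """
    if backend is None:
        return len(resources)
    return sum(1 for r in resources if r.get("backend") == backend)
```

The `Examples` section doubles as sample usage presented as doctests, which can then be exercised by the test suite.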
To build the documentation locally, run

    PYTHONPATH=$PWD make -C docs html

in the top directory; the built documentation should then become available under the `docs/build/html/` directory.
- For merge commits to have a more informative description, add the following section to your `.git/config` or `~/.gitconfig`:

      [merge]
          log = true

  and if conflicts occur, provide a short summary of how they were resolved in a "Conflicts" listing within the merge commit (see example).
It is recommended to check that your contribution complies with the following rules before submitting a pull request:
- All public methods should have informative docstrings, with sample usage presented as doctests when appropriate.
- All other tests pass when everything is rebuilt from scratch.
- New code should be accompanied by tests. `reproman/tests` contains tests for the core portion of the project, and more tests are provided under the corresponding submodules' `tests/` subdirectories, to simplify re-running the tests concerning that portion of the codebase. To execute many of the tests, the codebase first needs to be "installed" in order to generate scripts for the entry points. For that, the recommended course of action is to use `venv` (Python virtual environment), e.g.
      python3 -m venv venvs/dev3
      source venvs/dev3/bin/activate
      pip install -r requirements-devel.txt
  Then use that virtual environment to run the tests, via `python -m pytest` or just `pytest`; to later deactivate the venv, simply enter `deactivate`.
  Alternatively, or complementary to that, you can use `tox` -- there is a `tox.ini` file which sets up a few virtual environments for testing locally, which you can later reuse like any other regular venv for troubleshooting. Additionally, the `tools/testing/test_README_in_docker` script can be used to establish a clean docker environment (based on any NeuroDebian-supported release of Debian or Ubuntu) with all dependencies listed in README.md pre-installed.
We rely on https://codecov.io to provide a convenient view of code coverage. Installation of the codecov extension for Firefox/Iceweasel or Chromium is strongly advised, since it provides coverage annotation of pull requests.
We are not (yet) fully PEP8 compliant, so please use these tools as guidelines for your contributions, but do not PEP8 the entire code base.
Sidenote: watch Raymond Hettinger - Beyond PEP 8
- No pyflakes warnings; check with:

      pip install pyflakes
      pyflakes path/to/module.py
- No PEP8 warnings; check with:

      pip install pep8
      pep8 path/to/module.py
- AutoPEP8 can help you fix some of the easy redundant errors:

      pip install autopep8
      autopep8 path/to/pep8.py
Also, some team developers use PyCharm community edition, which provides a built-in PEP8 checker and handy tools such as smart splits/joins, making it easier to maintain code following the PEP8 recommendations. NeuroDebian provides the pycharm-community-sloppy package to ease pycharm installation even further.
A great way to start contributing to ReproMan is to pick an item from the list of Easy issues in the issue tracker. Resolving these issues allows you to start contributing to the project without much prior knowledge. Your assistance in this area will be greatly appreciated by the more experienced developers as it helps free up their time to concentrate on other issues.
- While performing IO/net-heavy operations, use dstat for quick logging of various health stats in a separate terminal window:

      dstat -c --top-cpu -d --top-bio --top-latency --net
- To monitor the speed of any data pipelining, pv is really handy; just plug it into the middle of your pipe.
- For remote debugging, epdb (available via pip) can be used: place `import epdb; epdb.serve()` in the Python code and then connect to it with `python -c "import epdb; epdb.connect()"`.
- We are using codecov, which has extensions for the popular browsers (Firefox, Chrome) that annotate pull requests on GitHub regarding changed coverage.
Refer to `reproman/config.py` for information on how to add these environment variables to the config file, and for their naming convention.
- REPROMAN_LOGLEVEL: Used to control the verbosity of logs printed to stdout while running reproman commands/debugging
- REPROMAN_TESTS_KEEPTEMP: If this flag is set, the rmtemp function will not remove temporary files/directories created for testing
- REPROMAN_EXC_STR_TBLIMIT: Used by the reproman extract_tb function, which extracts and formats stack traces; it caps the number of pre-processed traceback entries to REPROMAN_EXC_STR_TBLIMIT
- REPROMAN_TESTS_TEMPDIR: Create a temporary directory at the location specified by this flag; it is used by tests to create a temporary git directory while testing git-annex archives, etc.
- REPROMAN_TESTS_NONETWORK: Skips network tests completely if this flag is set; examples include tests for s3, git_repositories, openfmri, etc.
- REPROMAN_TESTS_SSH: Skips SSH tests if this flag is not set
- REPROMAN_LOGTRACEBACK: If this flag is set to 'collide', runs the TraceBack function with collide set to True; this replaces any common prefix between the current traceback log and that of the previous invocation with "..."
- REPROMAN_TESTS_ASSUME_SSP: Set this to indicate that tests can assume that Python system site packages are exposed in the testing environment (i.e., an environment created with `virtualenv --system-site-packages ...` would include system packages) and that there is at least one system package present. This is currently only relevant for the virtualenv distribution tests.
- REPROMAN_TESTS_NOTEARDOWN: If this flag is set, does not execute teardown_package, which cleans up temp files and directories created by tests
- REPROMAN_USECASSETTE: Specifies the location of the file in which the VCR module records network transactions; currently used when testing custom special remotes
- REPROMAN_CMD_PROTOCOL: Specifies the protocol used by the Runner to note shell command or Python function call times, and allows for dry runs: 'externals-time' for ExecutionTimeExternalsProtocol, 'time' for ExecutionTimeProtocol, and 'null' for NullProtocol. Any new REPROMAN_CMD_PROTOCOL has to implement reproman.support.protocol.ProtocolInterface
- REPROMAN_CMD_PROTOCOL_PREFIX: Sets a prefix to add before the command call times are noted by REPROMAN_CMD_PROTOCOL.
- REPROMAN_PROTOCOL_REMOTE: Binary flag specifying whether to test protocol interactions of the custom remote with annex
- REPROMAN_LOG_TIMESTAMP: Used to add timestamp to reproman logs
- REPROMAN_RUN_CMDLINE_TESTS: Binary flag specifying whether shell testing using shunit2 should be carried out
- REPROMAN_TEMP_FS: Specifies the temporary file system to use as a loop device when testing REPROMAN_TESTS_TEMPDIR creation
- REPROMAN_TEMP_FS_SIZE: Specifies the size of the temporary file system to use as a loop device when testing REPROMAN_TESTS_TEMPDIR creation
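These are ordinary environment variables, so they can be exported per-invocation from the shell or inspected from Python. As a hedged sketch (the helper below is purely illustrative; the actual parsing of these variables lives in `reproman/config.py`):

```python
import os

def tests_nonetwork_enabled():
    """Illustrative helper: report whether network tests should be skipped.

    The real handling of REPROMAN_* variables lives in reproman/config.py;
    this only demonstrates that the flags are plain environment variables.
    """
    return bool(os.environ.get("REPROMAN_TESTS_NONETWORK"))

# Setting the flag (e.g. `REPROMAN_TESTS_NONETWORK=1 pytest` from the shell)
# makes the check come back True:
os.environ["REPROMAN_TESTS_NONETWORK"] = "1"
print(tests_nonetwork_enabled())  # → True
```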