RELI (Regulatory Element Locus Intersection) is an algorithm for discovering transcription factors (TFs) that bind a significant number of loci associated with a given disease or phenotype (e.g., loci identified through a genome-wide association study, or GWAS).
The major data components are:

- An input set of disease- or phenotype-associated genetic variants (RS IDs)
- An internal "library" consisting of many ChIP-seq dataset peaks (in the form of .bed files)
- An internal file containing information on genetic variant allele frequencies, etc.

To assess the significance of the intersection between the input disease variants and a given TF ChIP-seq dataset, RELI performs simulations, generating a null distribution used for P-value calculations.
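The simulation idea can be sketched generically as follows. This is a minimal illustration of an empirical P-value computed from a simulated null distribution, not RELI's actual implementation: the function names, the toy null model, the overlap counts, and the add-one estimator are all assumptions made for the example.

```python
import random


def empirical_p_value(observed, null_counts):
    """Fraction of simulated overlap counts at least as large as the observed one.

    Uses the add-one estimator so the result is never exactly zero.
    """
    at_least = sum(1 for count in null_counts if count >= observed)
    return (at_least + 1) / (len(null_counts) + 1)


def simulate_null(n_loci, overlap_prob, n_reps, seed=0):
    """Toy null model (assumption): each locus overlaps a peak independently."""
    rng = random.Random(seed)
    return [
        sum(1 for _ in range(n_loci) if rng.random() < overlap_prob)
        for _ in range(n_reps)
    ]


# Hypothetical numbers: 50 loci, a 10% chance of overlap, 2000 simulations
null_counts = simulate_null(n_loci=50, overlap_prob=0.10, n_reps=2000)
p = empirical_p_value(observed=15, null_counts=null_counts)
print(f"empirical P-value: {p:.4g}")
```

A real null model would draw matched random variants rather than independent coin flips; the sketch only shows how an observed overlap count is compared against many simulated ones.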
The output of RELI is a series of statistics based upon the significance of the overlap between the input genetic variants and the selected ChIP-seq dataset.
Additional details on RELI and the associated findings can be found in its accompanying publication.
If you have the Common Workflow Language's `cwltool` already installed and Docker available on your system, this is the most straightforward way to run RELI on some sample data:
# clone the public repository and check out the 'cwl-docker-workflow' branch
git clone https://github.com/WeirauchLab/RELI.git
cd RELI
git checkout cwl-docker-workflow
# retrieve sample data and use CWL to run with example input parameters
make fetchdata
cwltool workflow/reli-docker.cwl workflow/reli-example-eu-ancestry.yaml
This will run RELI on a small input set of ChIP-seq data of European ancestry using a set of lupus (SLE)-associated SNPs.
RELI requires a C++11 compiler (e.g., GNU CC 4.7 or higher) and `libgsl` and `libgslcblas` from the GNU Scientific Library.
You may download the latest release as a compressed archive from GitHub, or clone the repository with Git:
# GitHub
git clone https://github.com/WeirauchLab/RELI.git
# Weirauch Lab GitLab
git clone https://tfwebdev.research.cchmc.org/gitlab/ches2d/RELI_public.git
This is the recommended method if you have some familiarity with Docker, as it does not require you to download a compiler or any of the dependencies necessary to build RELI from source.
You will need to install the appropriate Docker client for your OS. Please see the official docs for help with that.
The most straightforward way to get started is to simply use the pre-built image available on Docker Hub:
docker run -it --rm weirauchlab/reli RELI --help
Or, if you've cloned the `RELI_public` repository (see above), you can locally build the CentOS 7-based Docker container and compile RELI from source as follows:
cd /path/to/cloned/repo
docker build -t reli .
# test to see if it works
docker run -it --rm reli RELI --help
A GNU-style `Makefile` is provided in the repository. With GSL installed system-wide, you can build the RELI binary with just `make`, then run `./RELI` with no arguments to verify that you have a working binary (you should get a help screen).
To run a test analysis, either download the sample data manually (see the next section) or just type `make test`, which downloads and validates the sample datasets automatically, then calls `example/example_run.sh` to run RELI on the sample data.
This test analysis requires around 10 GB of RAM to finish successfully; 16 GB is recommended.
The included `Makefile` will respect `CFLAGS` and `LDFLAGS` if set in the environment. This is useful if, for example, you have a locally built GSL installed in a non-standard place (such as your home directory):
CFLAGS=-I/path/to/include LDFLAGS=-L/path/to/lib make
If `g++` is not available in your `PATH` (or it has a different name), you will likely want to modify the Makefile directly, beginning around line 33 with the `CC` variable.
RELI has also been verified to build and run on the following platforms (in addition to GNU/Linux):

- Windows with Cygwin and GCC 5.4.0 (ensure the `gcc-g++`, `make`, `gsl`, `libgsl-devel`, and `curl` packages are installed, at a minimum)
- macOS 10.14.6 (Mojave) with LLVM 10.0.1 (clang-1001.0.46.4) from the Xcode Command Line Tools and GSL installed from MacPorts
  - you need to specify paths to MacPorts' includes/libs like this, before running `make`:

        export CFLAGS=-I/opt/local/include LDFLAGS=-L/opt/local/lib

- On Windows, make sure you run `make` (or the `example/example_run.sh` script) from within the Cygwin shell, not the Windows Command Prompt or PowerShell. You may need to lightly modify the CDT build toolchain settings if your installation of Cygwin is not at `C:\Cygwin64`.

Eclipse CDT project settings files are also included for both of the above toolchains. Just create a copy (or symlink) of the appropriate one called `.cproject`, then choose File → Import... → Existing Projects into Workspace and browse to where you cloned the repository.
If you have problems with `make test` (perhaps you don't have `curl` available), you can manually download the sample datasets and extract them so that the decompressed data is inside a `data` subdirectory within the `RELI_public` repository you cloned above. A `.zip`-format archive is also provided, in case you don't have `bzip2` available.
You can run the sample analysis by changing into the `example` directory and running `example_run.sh` in a terminal like so:
user@[/path/to/repo]$ cd example
user@[/path/to/repo]$ ./example_run.sh
Required options are in bold text.

Option | Explanation |
---|---|
**`-snp FILE`** | Phenotype SNP file in 4-column BED format |
`-ld FILE` | (optional) Phenotype linkage disequilibrium structure for SNPs, default: no LD file |
**`-index FILE`** | ChIP-seq index file |
**`-data DIR`** | Directory where ChIP-seq data are stored |
**`-target STRING`** | Target label of the ChIP-seq experiment to be tested, from the index file |
**`-build FILE`** | Genome build file |
**`-null FILE`** | Null model file |
**`-dbsnp FILE`** | dbSNP table file |
**`-out DIR`** | Output directory name under the current working folder |
`-match` | (optional) Boolean switch to turn on minor allele frequency-based matching, default: off |
`-rep NUMBER` | (optional) Number of permutations/simulations to be performed, default: 2000 |
`-corr NUMBER` | (optional) Bonferroni correction multiplier for multiple testing, default: 1 |
`-phenotype STRING` | (optional) User-provided phenotype name, default: "." |
`-ancestry STRING` | (optional) User-provided ancestry name, default: "." |

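As a hedged illustration of how the options in the table fit together, the following sketch assembles a RELI argument list and fails early if an option is missing. It treats every option not marked "(optional)" as required, which is an assumption of this sketch; the helper name `build_reli_command` and all paths and labels are placeholders, not files shipped with RELI.

```python
import shlex

# Options not marked "(optional)" in the table above; treating them all as
# required is an assumption of this sketch.
REQUIRED = ["snp", "index", "data", "target", "build", "null", "dbsnp", "out"]


def build_reli_command(options, binary="./RELI"):
    """Return an argument list for RELI, failing early on missing options."""
    missing = [name for name in REQUIRED if name not in options]
    if missing:
        raise ValueError("missing required options: " + ", ".join(missing))
    args = [binary]
    for name, value in options.items():
        if value is True:  # boolean switch such as -match takes no value
            args.append(f"-{name}")
        else:
            args.extend([f"-{name}", str(value)])
    return args


# Every path and the target label below are hypothetical placeholders.
example = {
    "snp": "path/to/phenotype.snp",
    "index": "path/to/ChIPseq.index",
    "data": "path/to/ChIP-seq",
    "target": "example_target_label",
    "build": "path/to/genome_build_file",
    "null": "path/to/null_model_file",
    "dbsnp": "path/to/dbsnp_table_file",
    "out": "output_dir",
    "rep": 2000,
    "match": True,
}
print(shlex.join(build_reli_command(example)))
```

The resulting list can be passed directly to `subprocess.run` once the placeholders are replaced with real files.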
To add an additional ChIP-seq dataset, create an entry in the ChIP-seq index file (`data/ChIPseq.index`) with the following tab-delimited format:
label ⇥ source ⇥ Cell ⇥ TF ⇥ Cell label ⇥ PMID ⇥ Group ⇥ EBV Status ⇥ Species
where `label` corresponds to the filename of the dataset, which you should place in the `data/ChIP-seq` directory (in 4-column BED format).
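A small helper can keep the nine fields in the right order. This is only a sketch: the function name `format_index_entry` and every field value in the example are hypothetical, and the column names are taken from the format line above.

```python
# Column order as given in the format line above.
INDEX_COLUMNS = [
    "label", "source", "Cell", "TF", "Cell label",
    "PMID", "Group", "EBV Status", "Species",
]


def format_index_entry(entry):
    """Serialize one ChIP-seq index entry as a single tab-delimited line."""
    missing = [col for col in INDEX_COLUMNS if col not in entry]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    values = [str(entry[col]) for col in INDEX_COLUMNS]
    if any("\t" in v or "\n" in v for v in values):
        raise ValueError("fields must not contain tabs or newlines")
    return "\t".join(values)


# Hypothetical entry; none of these values refer to a real dataset.
entry = {
    "label": "example_TF_dataset",
    "source": "example_source",
    "Cell": "example_cell",
    "TF": "EXAMPLE_TF",
    "Cell label": "example_cell_label",
    "PMID": "00000000",
    "Group": "example_group",
    "EBV Status": "unknown",
    "Species": "Homo sapiens",
}
line = format_index_entry(entry)
print(line)
```

The printed line can then be appended to the index file, with the matching BED file named after the `label` value.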
To use a different genome build, use the UCSC `fetchChromSizes` utility (usage information here) to download chromosome information for that build. You may wish to prune lines representing unmapped chromosome information (e.g., `chrN_glXXXXXX_random` and `chrUn_glXXXXXX`) from the downloaded data file.
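One way to do the pruning is a short filter over the downloaded file. The sketch below assumes the usual two-column layout (chromosome name, then size) and only drops names matching the two patterns mentioned above; the function name `prune_chrom_sizes` and the sample lines are illustrative.

```python
import re

# Matches names such as chr1_gl000191_random and chrUn_gl000211.
UNPLACED = re.compile(r"chr[^_\s]+_gl\d+_random|chrUn_gl\d+")


def prune_chrom_sizes(lines):
    """Keep only lines whose chromosome name is not an unplaced/random contig."""
    kept = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if UNPLACED.fullmatch(fields[0]):
            continue  # drop unmapped chromosome entries
        kept.append(line.rstrip("\n"))
    return kept


# Illustrative sample in "name<TAB>size" layout.
sample = [
    "chr1\t249250621",
    "chr1_gl000191_random\t106433",
    "chrUn_gl000211\t166566",
    "chrX\t155270560",
]
pruned = prune_chrom_sizes(sample)
for line in pruned:
    print(line)
```

To process a real file, pass an open file handle instead of `sample` and write the returned lines to the genome build file.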
Be advised, however, that the null model included with the data was generated for Homo sapiens at build hg19; using a later "hg" build may invalidate this model.
If you need support for a different organism, please contact us via email for additional details (see "Feedback" section, below), or file an issue against the public GitHub repository.
Transcription factors operate across disease loci, with EBNA2 implicated in autoimmunity.
Harley JB, Chen X, Pujato M, Miller D, Maddox A, Forney C, Magnusen AF, Lynch A, Chetal K, Yukawa M, Barski A, Salomonis N, Kaufman KM, Kottyan LC, Weirauch MT.
Nat Genet. 2018 Apr 16. doi: 10.1038/s41588-018-0102-3. Epub 2018 Apr 16.
PMID: 29662164
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 3.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See `LICENSE.txt` for more details.
Please report any issues with RELI (or feature suggestions) in our GitHub issue tracker.
With other questions, you may contact Dr. Chen (the primary author of RELI) or Dr. Weirauch via email.
Name | Institution | Remarks |
---|---|---|
Dr. Xiaoting Chen | Cincinnati Children's Hospital | primary author |
Project avatar based on Wikimedia Commons Chromosome_18.svg