- Programs installed/being installed
Property | value |
---|---|
prog_name | samtools |
publication | https://www.ncbi.nlm.nih.gov/pubmed/19505943 |
citations_num | 18873 (2019.05.07) |
first_release_year | 2009? |
www | http://www.htslib.org/ |
repo | https://github.com/samtools/samtools |
lang | C |
obtained_from | https://github.com/samtools/samtools/releases/download/1.9/samtools-1.9.tar.bz2 |
installed_version | 1.9 |
installed_version_date | 2018.07.18 |
newest_version | 1.12 |
newest_version_date | 2021.03.17 |
last_ver_check | 2021.03.18 |
requirements_1 | cc/gcc |
install_1 | foo-server |
install_1_dir | /usr/local/bin/samtools |
Install libdeflate first and build htslib against it. Link: https://github.com/ebiggers/libdeflate
- view/sort SAM/BAM/CRAM files
- index fasta
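The two tasks listed above can be sketched as below. This is a minimal example, not a record of the commands used on foo-server: the helper names, the file names (`sample.bam`, `genome.fa`) and the thread count are placeholders, and `samtools` is assumed to be on the PATH.

```shell
# Hypothetical helper: coordinate-sort a BAM file and index the result.
# $1 = input BAM, $2 = number of threads (default 4).
# The output name is derived from the input: sample.bam -> sample.sorted.bam
sort_and_index_bam() {
    in_bam=$1
    threads=${2:-4}
    out_bam="${in_bam%.bam}.sorted.bam"
    samtools sort -@ "$threads" -o "$out_bam" "$in_bam" &&
        samtools index "$out_bam"
}

# Hypothetical helper: index a FASTA file (writes <fasta>.fai next to it).
index_fasta() {
    samtools faidx "$1"
}

# Example calls (placeholder file names):
# sort_and_index_bam sample.bam 8
# index_fasta genome.fa
```

Chaining the two steps with `&&` means the index is only built when the sort succeeded.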
Property | value |
---|---|
prog_name | picard |
publication | nope |
citations_num | ??? |
first_release_year | ??? |
www | http://broadinstitute.github.io/picard/ |
repo | https://github.com/broadinstitute/picard |
lang | java |
obtained_from | https://github.com/broadinstitute/picard/releases/download/2.21.2/picard.jar |
installed_version | 2.21.2 |
installed_version_date | 2019.10.29 |
newest_version | 2.25.5 |
newest_version_date | 2021.05.18 |
last_ver_check | 2021.05.18 |
requirements_1 | java 1.8 |
documentation | http://broadinstitute.github.io/picard/ |
install_1 | foo-server |
install_1_dir | /opt/soft/picard_current/ |
install_1_admin | darked |
install_2 | bar-server |
install_2_dir | /opt/soft/picard_current/ (not updated recently) |
install_2_admin | darked |
- view/sort SAM/BAM/CRAM files
- index fasta
Property | value |
---|---|
prog_name | sambamba |
publication | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765878/ |
citations_num | 185 (2019.05.07) |
first_release_year | 2012? |
www | http://lomereiter.github.io/sambamba/ |
repo | https://github.com/biod/sambamba |
lang | Dlang |
obtained_from | https://github.com/biod/sambamba/releases/download/v0.7.0/sambamba-0.7.0-linux-static.gz |
installed_version | 0.7.0 |
installed_version_date | 2019.05.29 |
newest_version | 0.8.0 |
newest_version_date | 2020.11.30 |
last_ver_check | 2020.12.29 |
requirements_1 | none (precompiled binary) |
install_1 | foo-server |
install_1_dir | /usr/local/bin/sambamba_0.7.0 |
- view/sort SAM/BAM/CRAM files
Comment: as of 2020.12, sambamba is not faster than samtools in at least some common tasks when both run with the same number of threads.
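A rough way to re-check the comment above is to time the same sort in both tools with an identical thread count. The sketch below assumes both binaries are on the PATH under the names `samtools` and `sambamba` (the installed sambamba binary may be named differently, e.g. `sambamba_0.7.0`). The helper name and `sample.bam` are placeholders, and whole-second timing is only good enough for larger files.

```shell
# Hypothetical benchmark helper: sort the same BAM with both tools using the
# same thread count and print the elapsed wall-clock seconds for each.
# $1 = input BAM (placeholder), $2 = threads (default 8)
compare_sort_speed() {
    in_bam=$1
    threads=${2:-8}

    start=$(date +%s)
    samtools sort -@ "$threads" -o "${in_bam%.bam}.samtools_sorted.bam" "$in_bam"
    end=$(date +%s)
    echo "samtools sort: $((end - start)) s"

    start=$(date +%s)
    sambamba sort -t "$threads" -o "${in_bam%.bam}.sambamba_sorted.bam" "$in_bam"
    end=$(date +%s)
    echo "sambamba sort: $((end - start)) s"
}

# Example call:
# compare_sort_speed sample.bam 8
```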
Property | value |
---|---|
prog_name | BBMap |
publication | conference: https://www.osti.gov/biblio/1241166 |
citations_num | 74 (2019.06.25) |
first_release_year | earlier than 2014 |
www_1 | https://sourceforge.net/projects/bbmap/ |
www_2 | https://jgi.doe.gov/data-and-tools/bbtools/ |
repo | ?? |
lang | java/shell/? |
obtained_from | https://sourceforge.net/projects/bbmap/files/ |
installed_version | 38.71 |
installed_version_date | 2019.10.30 |
newest_version | 38.90 |
newest_version_date | 2021.02.03 |
last_ver_check | 2021.02.07 |
requirements_1 | java |
install_1 | foo-server |
install_1_dir | /opt/soft/bbmap_38.70 |
install_1_admin | darked |
install_2 | bar-server |
install_2_dir | /opt/soft/bbmap_38.67 |
install_2_admin | darked |
#download on a command line:
curl -sSL "https://sourceforge.net/projects/bbmap/files/BBMap_38.71.tar.gz/download" > BBMap_38.71.tar.gz
tar xfv BBMap_38.71.tar.gz
mv bbmap bbmap_38.71
mv -i bbmap_38.71/ /opt/soft/
cd /opt/soft/
ln -s bbmap_38.71 bbmap_current
- cluster reads and simplify fastq read names (clumpify.sh)
#example command
/DATA/darked89/soft/bbmap_current/clumpify.sh \
in=idsc-13p_merged_r1.fq \
in2=idsc-13p_merged_r2.fq \
out=idsc-13p_merged_r1.fq.gz \
out2=idsc-13p_merged_r2.fq.gz \
reorder shortname=shrink
#fish shell
for fn in frombam.r1.fq
/opt/soft/bbmap_38.59/clumpify.sh \
in=$fn \
in2=(basename $fn r1.fq)r2.fq \
out=(basename $fn r1.fq)clump.r1.fq.gz \
out2=(basename $fn r1.fq)clump.r2.fq.gz \
reorder shortname=shrink
end
- count kmers in fastq file(s)
/opt/soft/bbmap_current/kmercountexact.sh \
in=06a_S2_L001_r1.fq \
out=06a_S2_L001_r1.kmercount_bbmap \
mincount=10000 \
k=8
# output: 06a_S2_L001_r1.kmercount_bbmap
<snip>
>11009
GAGTTGGT
>19055
GATCTGCT
>10025
GCACTCTT
>45528
GCAGCCTG
<snip>
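The output above is FASTA-like: a `>count` header line followed by the k-mer on the next line. A minimal sketch for turning it into a two-column table sorted by descending count, assuming that two-line layout holds for the whole file (the function name is made up for this example):

```shell
# Hypothetical helper: convert kmercountexact.sh output (">count" line, then
# the k-mer on the next line) into "kmer<TAB>count", highest counts first.
# $1 = k-mer count file
kmer_table() {
    # Header lines carry the count; the following non-empty line is the k-mer.
    awk '/^>/ { count = substr($0, 2); next }
         NF   { print $1 "\t" count }' "$1" |
        sort -t "$(printf '\t')" -k2,2nr
}

# Example call:
# kmer_table 06a_S2_L001_r1.kmercount_bbmap | head
```

For the four records shown above, this would list GCAGCCTG (45528) first and GCACTCTT (10025) last.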
Property | value |
---|---|
prog_name | IGV |
publication | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3346182/ |
citations_num | |
first_release_year | 2010?? |
www | http://software.broadinstitute.org/software/igv/home |
repo | https://github.com/igvteam/igv |
lang | java |
obtained_from | https://data.broadinstitute.org/igv/projects/downloads/2.6/IGV_Linux_2.6.3.zip |
installed_version | 2.6.3 |
installed_version_date | 2019.08.23 |
newest_version | 2.9.4 |
newest_version_date | 2021.03.17 |
last_ver_check | 2021.03.18 |
requirements_1 | java |
install_1 | foo-server |
install_1_dir | /opt/soft/igv_2.6.3 |
install_2 | bar-server |
install_2_dir | /opt/soft/igv_2.6.3 |
Property | value |
---|---|
prog_name | FastQC |
publication | in press(??) |
citations_num | |
first_release_year | ??? |
www | http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ |
repo | https://github.com/s-andrews/FastQC |
lang | java/ |
obtained_from | http://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.8.zip |
installed_version | 0.11.8 |
installed_version_date | 2018.10.04 |
newest_version | 0.11.9 |
newest_version_date | 2020.01.08 |
last_ver_check | 2020.12.29 |
requirements_1 | java |
install_1 | foo-server |
install_1_dir | /opt/soft/fastqc_0.11.8 |
- fastq quality check
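A typical quality-check run might look like the sketch below. It assumes the `fastqc` launcher script is on the PATH. The wrapper name, the thread count and the file names are placeholders.

```shell
# Hypothetical wrapper: run FastQC on one or more FASTQ files.
# $1 = output directory (created here, since FastQC expects it to exist),
# remaining arguments = FASTQ files (plain or gzipped).
run_fastqc() {
    out_dir=$1
    shift
    mkdir -p "$out_dir" &&
        fastqc --threads 4 --outdir "$out_dir" "$@"
}

# Example call (placeholder file names):
# run_fastqc fastqc_reports sample_r1.fq.gz sample_r2.fq.gz
```

Keeping all reports in one directory makes it easy to summarize them afterwards with multiqc.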
Property | value |
---|---|
prog_name | QoRTs |
publication | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4506620/ |
citations_num | 80 |
first_release_year | 2015 |
www | http://hartleys.github.io/QoRTs/ |
repo | https://github.com/hartleys/QoRTs |
lang_1 | java |
lang_2 | R |
obtained_from | https://github.com/hartleys/QoRTs/archive/v1.3.6.tar.gz |
installed_version | 1.3.6 |
installed_version_date | 2019.03.26 |
newest_version | 1.3.6 |
newest_version_date | see above |
last_ver_check | 2020.12.29 |
requirements_1 | java (put versions) |
requirements_2 | R (put versions) |
install_1 | foo-server |
install_1_dir | /opt/soft/qorts_1.3.6/ |
install_1_admin | darked |
install_2 | bar-server |
install_2_dir | /opt/soft/qorts_1.3.6/ |
install_2_admin | darked |
#usage
java -jar /opt/soft/qorts_1.3.6/QoRTs.jar QC \
cbrako-fix039_PT_0_2.rna.mrgd_4.clump.r12.star_hg38p13.bam \
/opt/genome/hg38/gencode.v31.annotation.gtf \
tmp_qual_data/
#more options:
java -jar /opt/soft/qorts_1.3.6/QoRTs.jar --man QC
Property | value |
---|---|
prog_name | multiqc |
publication | https://academic.oup.com/bioinformatics/article/32/19/3047/2196507 |
citations_num | 317 (2019.06.24) |
first_release_year | 2016? |
www | https://multiqc.info/ |
repo | https://github.com/ewels/MultiQC |
lang | python |
obtained_from | pip |
installed_version | 1.7 |
installed_version_date | 2018.12.21 |
newest_version | 1.10 |
newest_version_date | 2021.03.08 |
last_ver_check | 2021.03.18 |
requirements_1 | pip / python 3 for 1.9 |
install_1 | foo-server |
install_1_dir | /usr/local/bin/multiqc |
# create a summary report from e.g. FastQC data for multiple files
multiqc .
Property | value |
---|---|
prog_name | Bedtools |
publication | in press(??) |
citations_num | |
first_release_year | ??? |
www | ??? |
repo | https://github.com/arq5x/bedtools2 |
lang | C++ |
obtained_from | https://github.com/arq5x/bedtools2/releases/download/v2.28.0/bedtools-2.28.0.tar.gz |
installed_version | 2.29.0 |
installed_version_date | 2019.09.03 |
newest_version | 2.30.0 |
newest_version_date | 2021.01.23 |
last_ver_check | 2021.02.07 |
requirements_1 | g++ /?? |
docs | https://bedtools.readthedocs.io/en/latest/ |
tutorial | http://quinlanlab.org/tutorials/bedtools/bedtools.html |
install_1 | foo-server |
install_1_dir | /opt/soft/bedtools_2.29.0 |
install_1_admin | darked |
install_2 | bar-server |
install_2_dir | /opt/soft/bedtools_2.29.0 |
install_2_admin | darked |
Property | value |
---|---|
prog_name | bam-readcount |
publication | ??? |
citations_num | ??? |
first_release_year | 2011 |
www | ??? |
repo | https://github.com/genome/bam-readcount |
lang_1 | C++ |
obtained_from | https://github.com/genome/bam-readcount/archive/v0.8.0.tar.gz |
version | 0.8.0 |
version_date | 2016.10.22 |
last_ver_check | 2019.06.24 |
requirements_1 | cmake |
install_1 | foo-server |
install_1_dir | /opt/soft/bam-readcount_0.8.0/ |
- stats at a single base resolution for the selected positions
# install
cd bam-readcount_0.8.0
mkdir build
cd build
cmake ..
make
make test
Not developed since 2016
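Getting per-base stats for selected positions could look like the sketch below. The helper name, the file names, the example coordinate and the quality cut-offs (20) are assumptions for illustration. The binary path can be overridden through the `BAM_READCOUNT` variable, which is also made up for this example.

```shell
# Binary location; override via the environment if it is not on the PATH.
BAM_READCOUNT=${BAM_READCOUNT:-bam-readcount}

# Hypothetical helper: per-base counts for the positions in a site list.
# $1 = reference FASTA
# $2 = site list, one region per line: chrom<TAB>start<TAB>end (1-based)
# $3 = indexed BAM file
# -q / -b set the minimum mapping quality and minimum base quality.
readcount_sites() {
    "$BAM_READCOUNT" -q 20 -b 20 -f "$1" -l "$2" "$3"
}

# Example (placeholder names and coordinates):
# printf 'chr1\t1000000\t1000000\n' > sites.txt
# readcount_sites genome.fa sites.txt sample.bam > sample.readcount.tsv
```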
An approximate sequence pattern matcher for FASTQ/FASTA files.
Property | value |
---|---|
prog_name | fqgrep |
publication | none / https://zenodo.org/record/45105 |
citations_num | 24? (2019.06.25) |
first_release_year | 2011 |
www | |
repo | https://github.com/indraniel/fqgrep |
lang | C |
obtained_from | https://github.com/indraniel/fqgrep/archive/v0.4.4.tar.gz |
version | 0.4.4 |
version_date | 2016.01.22 |
last_ver_check | 2019.06.25 |
requirements_1 | libtre-dev |
install_1 | foo-server |
install_1_dir | /opt/soft/fqgrep_0.4.4/ |
# prerequisites (on Debian)
sudo apt install libtre5 libtre-dev
# 'make' creates the executable.
make
mkdir ./bin
mv -i fqgrep ./bin
# simple search for a given pattern
# searches for TGAAGAGA anywhere in the read, no mismatches; colored output is paged through 'most'
fqgrep -c -p 'TGAAGAGA' 06a_S2_L001_r1.fq | most
# search with reporting start/end positions of the pattern, sequence etc.
# the `grep TGAAGAGA | awk '{print $7}' | sort -n | uniq -c` part shows the distribution of start positions (count per position)
fqgrep -r -p 'TGAAGAGA' 06a_S2_L001_r1.fq | grep TGAAGAGA | awk '{print $7}' | sort -n | uniq -c
# with '-m 2' => two mismatches allowed
fqgrep -r -m2 -p 'TGAAGAGA' 06a_S2_L001_r2.fq | grep TGAAGAGA | most
https://github.com/ngsutils/ngsutils
Property | value |
---|---|
prog_name | bedops |
publication | https://academic.oup.com/bioinformatics/article/28/14/1919/218826 |
citations_num | 334 (2019.09.12) |
first_release_year | 2012? |
www | https://bedops.readthedocs.io/en/latest/ |
repo | https://github.com/bedops/bedops |
lang | C++ |
obtained_from | https://github.com/bedops/bedops/releases/download/v2.4.36/bedops_linux_x86_64-v2.4.36.tar.bz2 |
installed_version | 2.4.37 |
installed_version_date | 2019.05.02 |
newest_version | 2.4.39 |
newest_version_date | 2020.04.07 |
last_ver_check | 2020.12.31 |
install_1 | foo-server |
install_1_dir | /opt/soft/bedops_2.4.37/ |
install_1_admin | darked |
install_2 | bar-server |
install_2_dir | /opt/soft/bedops_2.4.36/ |
install_2_admin | darked |
# distributed as precompiled binaries
# caution: the tarball unpacks into ./bin
#example usage
awk '{ if ($0 ~ "transcript_id") print $0; else print $0" transcript_id \"\";"; }' gencode.v31.annotation.no_head.gtf | gtf2bed - > \
gencode.v31.annotation.no_head.bed
Property | value |
---|---|
prog_name | bam |
publication | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4448687/ |
citations_num | 56 (2019.06.17) |
first_release_year | 2015? |
www | https://genome.sph.umich.edu/wiki/BamUtil |
repo | https://github.com/statgen/bamUtil |
lang | C++ |
obtained_from | see repo |
version | 0.8.0 (??) |
version_date | 2019.04.20 |
last_ver_check | 2019.09.04 |
requirements_1 | libStatGen |
requirements_1_repo | https://github.com/statgen/libStatGen |
install_1 | foo-server |
install_1_dir | /opt/soft/bamutil_20190617/ |
install_2 | bar-server |
install_2_dir | /opt/soft/bamutil_20190904/ |
git clone https://github.com/statgen/bamUtil.git
git clone https://github.com/statgen/libStatGen.git
mv bamUtil/ bamutil_20190617
cd bamutil_20190617
make all
#not very informative:
make test
# it runs but reports neither passing tests nor errors
bin/bam stats --in /mnt/vdb1/darked89/proj/mongra_20190506/BWA_bam/6_S1_L001_r12.bwa.bam --basic
Number of records read = 6172826
TotalReads(e6) 6.17
MappedReads(e6) 6.15
PairedReads(e6) 6.17
ProperPair(e6) 6.09
DuplicateReads(e6) 0.00
QCFailureReads(e6) 0.00
MappingRate(%) 99.67
PairedReads(%) 100.00
ProperPair(%) 98.65
DupRate(%) 0.00
QCFailRate(%) 0.00
TotalBases(e6) 468.77
BasesInMappedReads(e6) 467.23
#!/usr/bin/fish
for fn in *bam
/opt/soft/bamutil_current/bin/bam dedup --in $fn --out (basename $fn .bam).md.bam --force --oneChrom --verbose
end
Property | value |
---|---|
prog_name | vcfanno |
publication | ??? |
citations_num | ??? |
first_release_year | ??? |
www | ??? |
repo | https://github.com/brentp/vcfanno |
lang | Go |
obtained_from | https://github.com/brentp/vcfanno/releases/download/v0.3.1/vcfanno_linux64 |
installed_version | 0.3.1 |
installed_version_date | 2018.10.29 |
newest_version | 0.3.2 |
newest_version_date | 2019.07.30 |
last_ver_check | 2020.12.31 |
requirements_1 | ?? Lua ?? |
install_1 | foo-server |
install_1_dir | /opt/soft/vcfanno_0.3.1 |
- primary use:
vcfanno allows you to quickly annotate your VCF with any number of INFO fields from any number of VCFs or BED files.
- status: not tested
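Since the tool has not been tested here, the following is only a sketch of what a first run might look like. vcfanno reads a TOML config with one `[[annotation]]` block per annotation file. All file names, field names and the helper names below are placeholders, and the annotation files are assumed to be bgzipped and tabix-indexed. The `VCFANNO` variable is made up for this example so the downloaded binary can be pointed to.

```shell
# Binary location, e.g. the downloaded vcfanno_linux64 file.
VCFANNO=${VCFANNO:-vcfanno}

# Hypothetical helper: write a minimal config with one VCF and one BED source.
# $1 = path of the config file to create
write_vcfanno_conf() {
    cat > "$1" <<'EOF'
[[annotation]]
file = "annotations.vcf.gz"
fields = ["AF"]
names = ["anno_AF"]
ops = ["self"]

[[annotation]]
file = "regions.bed.gz"
columns = [4]
names = ["region_name"]
ops = ["uniq"]
EOF
}

# Hypothetical helper: annotate a query VCF using 4 processes; the annotated
# VCF is written to stdout.
# $1 = config file, $2 = query VCF
annotate_vcf() {
    "$VCFANNO" -p 4 "$1" "$2"
}

# Example calls (placeholder file names):
# write_vcfanno_conf conf.toml
# annotate_vcf conf.toml query.vcf.gz > query.annotated.vcf
```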
Property | value |
---|---|
prog_name | jellyfish |
publication | https://academic.oup.com/bioinformatics/article/27/6/764/234905 |
citations_num | 999 (2019.06.25) |
first_release_year | 2011 |
www | http://www.genome.umd.edu/jellyfish.html |
repo | https://github.com/gmarcais/Jellyfish |
lang | C++ |
obtained_from | https://github.com/gmarcais/Jellyfish/releases/download/v2.2.10/jellyfish-2.2.10.tar.gz |
installed_version | 2.2.10 |
installed_version_date | 2018.05.01 |
newest_version | 2.3.0 |
newest_version_date | 2019.07.13 |
last_ver_check | 2020.12.31 |
requirements_1 | ?? |
install_1 | foo-server |
install_1_dir | /opt/soft/jellyfish_2.2.10 |
# install from source
autoreconf -i
./configure --prefix=/opt/soft
make
make check
make install
# test run
jellyfish bc -m 8 -s 10G -t 16 -o 06a_S2_L001_r12.bc 06a_S2_L001_r1.fq 06a_S2_L001_r2.fq
jellyfish count -m 8 -s 3G -t 16 --bc 06a_S2_L001_r12.bc 06a_S2_L001_r1.fq 06a_S2_L001_r2.fq
# this creates a default mer_counts.jf file
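To look inside the resulting `mer_counts.jf`, the database can be summarized as a histogram and dumped as text. A minimal sketch, assuming `jellyfish` is on the PATH; the helper name and the output file names are made up for this example.

```shell
# Hypothetical helper: summarize a jellyfish database.
# $1 = database file (default: mer_counts.jf)
# Writes <name>.histo (how many k-mers occur N times) and
# <name>.counts.txt (one "kmer count" pair per line).
summarize_jf() {
    jf=${1:-mer_counts.jf}
    jellyfish histo "$jf" > "${jf%.jf}.histo" &&
        jellyfish dump -c "$jf" > "${jf%.jf}.counts.txt"
}

# Example call:
# summarize_jf mer_counts.jf
```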
NCBI SRA Tools: https://github.com/ncbi/sra-tools
Property | value |
---|---|
prog_name | sratoolkit |
publication | ? |
citations_num | ? |
first_release_year | 2011 |
wiki | https://github.com/ncbi/sra-tools/wiki |
repo | https://github.com/ncbi/sra-tools |
lang | C++ |
obtained_from | https://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/2.10.7/sratoolkit.2.10.7-ubuntu64.tar.gz |
installed_version | 2.10.7 |
installed_version_date | 2020.05.27 |
newest_version | 2.10.9 |
newest_version_date | 2020.12.16 |
last_ver_check | 2020.12.31 |
requirements_1 | ?? |
install_1 | vagrant_deb_buster |
install_1_dir |
Downloads are faster with the Aspera client (or rather the Aspera CLI software). Get it from: https://downloads.asperasoft.com/
# to get the particular run:
prefetch SRR5272532
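After `prefetch`, the run still has to be converted to FASTQ. A sketch of both steps together, assuming the sratoolkit binaries are on the PATH; the helper name, the thread count and the default output directory are placeholders, and the `FASTERQ_DUMP` variable is made up for this example.

```shell
# Binary location; override via the environment if it is not on the PATH.
FASTERQ_DUMP=${FASTERQ_DUMP:-fasterq-dump}

# Hypothetical helper: download one run and convert it to FASTQ files,
# writing paired reads into separate files.
# $1 = run accession, $2 = output directory (default: fastq)
fetch_run_fastq() {
    run=$1
    out_dir=${2:-fastq}
    prefetch "$run" &&
        "$FASTERQ_DUMP" --split-files --threads 4 --outdir "$out_dir" "$run"
}

# Example call:
# fetch_run_fastq SRR5272532
```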