Installation of the kmer_tools conda environment
These instructions assume you already have conda installed (e.g. miniconda, anaconda, mamba). To check that your conda is functional, you can execute
conda --help
If you don't see the conda help page, do not proceed further.
Set up a new conda environment and activate it
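The check above can also be scripted, for example at the top of a setup script. This is only a minimal sketch: `have_command` is a hypothetical helper name, not part of conda, and it only tests that a command is on your PATH.

```shell
# Hypothetical helper: succeeds if the given command is available on PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command conda; then
    echo "conda found: $(conda --version)"
else
    echo "conda not found on PATH; install it before continuing"
fi
```

Finding conda on the PATH does not guarantee that it works properly, so printing the help page as described above remains the more reliable check.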
conda create -n kmer_tools
conda activate kmer_tools
Note that you can store the environment in an alternative location by specifying something like --prefix=/full/path/to/conda/env/kmer_tools instead of the -n parameter.
Now we can install the individual k-mer tools. When you execute the following line, conda will take a moment to fetch data from the repositories and will then prompt you to confirm that you want to download about 400MB of packages (the size is relatively large because they include both Python and R)
conda install -c bioconda numpy smudgeplot kmc kat
Finally, install GenomeScope 2.0 (which is unfortunately not on conda). First download it, then enter the directory and start an R session
git clone https://github.com/tbenavi1/genomescope2.0
cd genomescope2.0
R
Now, within R, run the following commands to install the two dependencies and the genomescope package
install.packages('minpack.lm', repos = "http://cran.us.r-project.org")
install.packages('argparse', repos = "http://cran.us.r-project.org")
install.packages('.', repos=NULL, type="source")
q() # quit R
The last step is to copy the genomescope execution script into your conda environment (back in your shell)
cp genomescope.R "$CONDA_PREFIX"/bin
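The copy step relies on `CONDA_PREFIX`, which conda sets when an environment is activated. If no environment is active, the variable is empty and the destination would resolve to `/bin`. A minimal sketch of a guard, assuming a hypothetical helper named `conda_bin_dir`:

```shell
# Hypothetical helper: print the bin directory of the active conda
# environment, or fail with a message if no environment is active.
conda_bin_dir() {
    if [ -z "${CONDA_PREFIX:-}" ]; then
        echo "no active conda environment (CONDA_PREFIX is empty)" >&2
        return 1
    fi
    echo "$CONDA_PREFIX/bin"
}

# Only report the destination; the actual copy is the cp command above.
if bin_dir=$(conda_bin_dir); then
    echo "genomescope.R would be copied to: $bin_dir"
fi
```

If the helper reports that no environment is active, run `conda activate kmer_tools` first and then repeat the copy.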
Now you are done, so you can delete the downloaded repository
cd .. && rm -rf genomescope2.0
Check that all the tools run smoothly by printing their help pages
smudgeplot.py -h
genomescope.R -h
kat --help # note: -h produces a rather cryptic error, "Throw in function Mode parseMode"
kmc -h
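As a quicker first pass, you can check whether the four executables are on your PATH at all. This is a sketch under stated assumptions: `report_missing` is a hypothetical helper, and being on the PATH is a weaker test than the help page actually printing, so it complements the commands above rather than replacing them.

```shell
# Hypothetical helper: print each listed command that is not on PATH.
# Returns 0 only if every command was found.
report_missing() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

report_missing smudgeplot.py genomescope.R kat kmc \
    || echo "some tools are not available in this environment"
```

A tool reported as missing usually means that the kmer_tools environment is not activated or that the corresponding install step did not finish.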
If any of the tools does not work as expected (e.g. it shows an error message instead of the help page), first try googling the error. If that does not help, contact your local computational support: the problem is most likely specific to your cluster or system, and they will be able to help you faster than anyone else. After that, you can ask on our Slack whether anyone has had the same issue, and if not, your only remaining option is to contact the developers of the individual tools.