Major changes are:
Support for analysis of PacBio RNA-seq data, especially single cell data,
easily usable via the new preset scywalker_pacbio
cg process_project and cg process_sample check a lot more parameters, and
return an error if they are wrong (e.g. given file does not exist), before
starting the actual run. (issue derijkp/scywalker#12)
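
The idea of checking parameters before the run starts can be illustrated with a minimal sketch. This is not genomecomb's own code: the function name `check_parameters` and the parameter names in the usage example are hypothetical, and only the "given file does not exist" case mentioned above is covered.

```python
import os


def check_parameters(params):
    """Return a list of error messages; an empty list means all checks passed.

    `params` maps a (hypothetical) parameter name to a file path.
    """
    errors = []
    for name, path in params.items():
        # Only existence is checked here; a real pre-flight check would do more.
        if not os.path.exists(path):
            errors.append(f"{name}: file does not exist: {path}")
    return errors


if __name__ == "__main__":
    # Hypothetical parameter names and paths, for illustration only.
    problems = check_parameters({"refseq": "/path/to/refseq", "reads": "/path/to/reads"})
    if problems:
        raise SystemExit("\n".join(problems))
```

Collecting all problems first and failing once means a long run is never started with a parameter that is already known to be wrong.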
Various memory/storage optimizations (to avoid slowdown due to swapping, or killed jobs for memory overuse)
- The option -dmaxmem was added to limit the total memory used (requested) when running local distribution.
- Memory reservations were adjusted, e.g. minimap2 requests memory based on the
index size, the realign_gatk memory request was increased, and bam_clean requests (enough) memory
- Sorting (using gnusort8) compresses temporary files using zstd (to avoid filling up /tmp when sorting huge files).
It also uses a larger block size for speed
- The -scratchdir option was added for when /tmp is too small on your system
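
For readers who want the same effect in their own scripts, compressed temporary files and an alternative scratch directory can be sketched with standard GNU sort options. This is an illustration, not the command genomecomb runs internally: the helper `build_sort_command`, the file names, and the buffer size are assumptions, and it presumes that GNU `sort` and `zstd` are available on the PATH.

```python
import shlex


def build_sort_command(input_path, output_path, scratchdir="/tmp", buffer_size="2G"):
    """Build a GNU sort command line that compresses its temporary files with zstd.

    All options used are standard GNU sort options; the values are illustrative.
    """
    return [
        "sort",
        f"--temporary-directory={scratchdir}",  # where temporary files are written
        "--compress-program=zstd",              # compress temporary files with zstd
        f"--buffer-size={buffer_size}",         # main memory buffer used for sorting
        f"--output={output_path}",
        input_path,
    ]


if __name__ == "__main__":
    # Hypothetical paths; pass the list to subprocess.run(cmd, check=True) to execute it.
    cmd = build_sort_command("huge.tsv", "huge.sorted.tsv", scratchdir="/scratch/tmp")
    print(shlex.join(cmd))
```

Building the command as a list rather than a string avoids quoting problems when a path contains spaces.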
Reference databases
- mm10 and mm39 have been updated, adding e.g. evaSNP
- makedbs/makedbs_tair10.sh was added to create the Arabidopsis reference
- The groupchromosomes option allows grouping of (smaller) chromosomes in a way that differs from the default (chrX_XXX grouped under chrX_).
This is useful for genomes where naming follows a different pattern, e.g. the dual Hs and Mm genome
- minimap2 now checks the size of the genome when creating an index
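
The grouping rule described for groupchromosomes can be pictured with a short sketch. It is not the genomecomb implementation and does not show the option's real syntax: the regular expressions, the function `group_chromosomes`, and the `Hs_`/`Mm_` prefixed names are assumptions chosen to mirror the default described above (chrX_XXX grouped under chrX_).

```python
import re
from collections import defaultdict

# Assumed default rule: a name of the form chrX_XXX is grouped under "chrX_".
DEFAULT_PATTERN = re.compile(r"^(chr[^_]+_)")


def group_chromosomes(names, pattern=DEFAULT_PATTERN):
    """Group sequence names by the prefix captured by `pattern`.

    Names that do not match the pattern form a group of their own.
    """
    groups = defaultdict(list)
    for name in names:
        match = pattern.match(name)
        key = match.group(1) if match else name
        groups[key].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Default-style names
    print(group_chromosomes(["chr1", "chr1_KI270706v1_random", "chrUn_KI270302v1"]))
    # Hypothetical naming for a dual genome, with a species prefix before "chr"
    dual = re.compile(r"^((?:Hs|Mm)_chr[^_]+_)")
    print(group_chromosomes(["Hs_chr1", "Hs_chrUn_A", "Mm_chrUn_A"], dual))
```

Passing the pattern as a parameter is what lets one function serve both the default naming and a genome whose names follow a different scheme.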
Several smaller new features
- added cg bam2cram and cg_brams2crams (for bulk conversion of bam to cram)
- added cg samcat
- new options for cg qsub: -memlimit, -lang, -cores, -mem, -time, -priority, -submitoptions
- new options for cg qjobs: -s (-summary)
- cg tsv210x options: -round (to create integer matrix), -remdups (to remove duplicate lines)
- added cg download_bigbed
Many other fixes and improvements
- look for alternative fieldnames for gene in sc_filter_default and tsv210x (issue derijkp/scywalker#10)
- aligner_preset is added to analysisinfo (cg map)
- gzfiles and jobgzfile optimizations: use less globs/filesystem accesses, return values bsorted (per pattern)
- ...
Full Changelog: 0.109.0...0.110.0