0.110.0

@derijkp derijkp released this 14 Jul 09:23
· 53 commits to master since this release

Major changes are:

Support for analysis of PacBio RNA-seq data, especially single cell data,
easily usable via the new preset scywalker_pacbio

cg process_project and cg process_sample now check many more parameters and
return an error if any are wrong (e.g. a given file does not exist) before
starting the actual run. (issue derijkp/scywalker#12)
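
The idea of checking parameters before the run starts can be illustrated with a minimal Python sketch. This is not the genomecomb implementation; the function name `check_parameters`, the option name, and the path are hypothetical and chosen only for illustration.

```python
import os

def check_parameters(file_params):
    """Return a list of error messages; an empty list means all checks passed.

    file_params maps an option name to the file path given for it.
    (Hypothetical helper, not part of genomecomb.)
    """
    errors = []
    for option, path in file_params.items():
        if not os.path.exists(path):
            errors.append(f"{option}: file does not exist: {path}")
    return errors

# Hypothetical option name and path, used only to trigger the error case.
errors = check_parameters({"hypothetical_file_option": "/nonexistent/hypothetical/genome.fa"})
for message in errors:
    print(message)
```

Collecting all errors first and reporting them together means a long run is never started only to fail later on a missing file.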

Various memory/storage optimizations (to avoid slowdown due to swapping, or jobs being killed for memory overuse)

  • The option -dmaxmem was added to limit the total memory used (requested) when running local distribution.
  • memory reservations were adjusted, e.g. minimap2 requests memory based on the
    index size, the realign_gatk memory request was increased, and bam_clean now requests enough memory
  • sorting (using gnusort8) compresses temporary files using zstd (to avoid filling up /tmp when sorting huge files).
    It also uses a larger block size for speed
  • The -scratchdir option was added for when /tmp is too small on your system
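
As a rough illustration of what limiting the total requested memory means for local distribution, the following Python sketch starts a pending job only if the sum of requested memory stays within the limit. It is a simplified model under stated assumptions, not how genomecomb implements -dmaxmem; the function name, job names, and memory figures are hypothetical.

```python
def pick_startable(pending, running_mem, max_mem):
    """Return names of pending jobs that can start without exceeding max_mem.

    pending: list of (job_name, requested_mem) in submission order
    running_mem: memory already requested by running jobs
    max_mem: limit on total requested memory (same unit as the requests)
    """
    started = []
    used = running_mem
    for name, mem in pending:
        # A job starts only if its request still fits under the limit.
        if used + mem <= max_mem:
            started.append(name)
            used += mem
    return started

# Hypothetical jobs with memory requests in GB.
pending = [("align", 8), ("sort", 6), ("call", 4)]
startable = pick_startable(pending, running_mem=4, max_mem=16)
print(startable)
```

Jobs that do not fit stay pending until running jobs finish and free up their reservation.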

Reference databases

  • mm10 and mm39 have been updated, adding e.g. evaSNP
  • makedbs/makedbs_tair10.sh was added to create the Arabidopsis reference
  • The groupchromosomes option allows grouping of (smaller) chromosomes differently from the default (chrX_XXX grouped under chrX_).
    This is useful for genomes where naming follows a different pattern, e.g. the dual Hs and Mm genome
  • minimap2 now checks the size of the genome when creating an index
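
The default grouping described above (chrX_XXX grouped under chrX_) can be sketched in Python as follows. This is only an illustration of the naming pattern, assuming the group key is everything up to and including the first underscore; it is not genomecomb code, and the function name and example chromosome names are hypothetical.

```python
from collections import defaultdict

def group_chromosomes(names):
    """Group names such as chr1_KI270706v1_random under the key 'chr1_'.

    Names without an underscore are kept as their own group.
    """
    groups = defaultdict(list)
    for name in names:
        prefix, sep, _rest = name.partition("_")
        key = prefix + "_" if sep else name
        groups[key].append(name)
    return dict(groups)

example = group_chromosomes([
    "chr1",
    "chr1_KI270706v1_random",
    "chr1_KI270707v1_random",
    "chrUn_GL000195v1",
])
print(example)
```

A genome whose sequence names do not follow this prefix pattern (such as a combined two-species reference) would need a different grouping rule, which is the case the groupchromosomes option addresses.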

Several smaller new features

  • added cg bam2cram and cg_brams2crams (for bulk conversion of bam to cram)
  • added cg samcat
  • new options for cg qsub: -memlimit, -lang, -cores, -mem, -time, -priority, -submitoptions
  • new options for cg qjobs: -s (-summary)
  • cg tsv210x options: -round (to create integer matrix), -remdups (to remove duplicate lines)
  • added cg download_bigbed

Many other fixes and improvements

  • look for alternative fieldnames for gene in sc_filter_default and tsv210x (issue derijkp/scywalker#10)
  • aligner_preset is added to analysisinfo (cg map)
  • gzfiles and jobgzfile optimizations: use less globs/filesystem accesses, return values bsorted (per pattern)
  • ...

Full Changelog: 0.109.0...0.110.0