Releases: tfwillems/HipSTR

HipSTR v0.7

17 Apr 16:42

This release provides a few enhancements to the HipSTR algorithm:

  1. Added --lib-field option and removed --lib-from-samp
  2. Automatically trim Illumina adapters (TruSeq and Nextera) from input alignments
  3. Fixed a bug in which AB and FS were automatically set to 0 for homozygous genotypes even if the haplotypes were heterozygous
  4. Added functionality to output additional haplotype information about STR calls. In addition to the STR sequence, HipSTR can now output the flanking sequence genotypes
  5. Updated the usage instructions in the README
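
As a rough illustration of the adapter trimming mentioned in item 2, the sketch below removes a 3' adapter from a single read sequence. It is not HipSTR's implementation: the function name, the `min_overlap` parameter and the adapter string used in the usage note are all hypothetical, and a real trimmer working on alignments would also need to adjust base qualities and CIGAR strings.

```python
def trim_adapter(read, adapter, min_overlap=5):
    """Trim a 3' adapter from a read sequence (illustrative sketch only).

    read        -- read sequence as a string
    adapter     -- adapter sequence supplied by the caller
    min_overlap -- hypothetical minimum length of a partial adapter match
    """
    # Case 1: the whole adapter occurs somewhere inside the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]

    # Case 2: the read ends with a prefix of the adapter.
    # Try the longest candidate prefix first.
    max_len = min(len(adapter) - 1, len(read))
    for n in range(max_len, max(min_overlap, 1) - 1, -1):
        if read.endswith(adapter[:n]):
            return read[:-n]

    # No adapter detected; return the read unchanged.
    return read
```

For example, with a made-up adapter `GGCCAATTCC`, the read `ACGTACGTTTGGCCAA` ends with the first six adapter bases, so the function would return `ACGTACGTTT`.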

HipSTR v0.6.2

22 Jul 18:07

This release introduces a handful of minor code updates:

  1. The internal htslib library has been upgraded from v1.5 to v1.8 to improve reliability
  2. The maximum allowed stutter artifact size was doubled. This results in improved genotyping accuracy with only a slight increase in run time
  3. A handful of minor changes to the README to clarify usage and suggested filtering options

HipSTR v0.6.1

19 Dec 01:39

This release patches a simple bug introduced in v0.6, in which an incorrect absolute value function was used.

HipSTR v0.6

19 Dec 00:47

This version of HipSTR contains a handful of minor bug fixes as well as some substantial speed improvements in terms of CRAM IO. Here's a summary of the changes:

  1. Modified filter_vcf.py so that it doesn't remove alleles if genotype likelihoods are present. Previously, this script would've generated invalid genotype likelihoods if any alleles were removed
  2. Massively sped up CRAM IO. Analyzing CRAM files should now take 5-30x less IO time due to these changes. This greatly improves overall HipSTR performance for CRAMs, where this was previously the bottleneck. See issue #24
  3. Fixed a bug in which the stutter model failed to converge, even though its parameters did not change over many iterations. See issue #46
  4. Modified the genotyping process so that it no longer skips a locus if a very short STR allele is observed. See issue #45
  5. Added additional descriptions of some of the default filters applied by HipSTR to the README
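
To see why removing alleles can invalidate genotype likelihoods (item 1), recall that the VCF specification stores the likelihood of diploid genotype j/k (with j <= k) at index k(k+1)/2 + j. Dropping an allele therefore changes both the number of values and their positions. The sketch below shows the re-indexing that would be needed; the helper names are hypothetical and are not part of filter_vcf.py.

```python
def gl_index(j, k):
    """Index of diploid genotype j/k in a VCF likelihood array (VCF spec ordering)."""
    if j > k:
        j, k = k, j
    return k * (k + 1) // 2 + j


def subset_likelihoods(gls, kept_alleles):
    """Return the likelihoods for genotypes built only from kept_alleles.

    gls          -- likelihoods for all n*(n+1)/2 diploid genotypes
    kept_alleles -- sorted original allele indices to keep (0 = reference)
    """
    new_gls = []
    # Walk the new genotypes in VCF order: for each new k, every new j <= k.
    for new_k, old_k in enumerate(kept_alleles):
        for old_j in kept_alleles[: new_k + 1]:
            new_gls.append(gls[gl_index(old_j, old_k)])
    return new_gls
```

With three alleles, the six likelihoods are ordered 0/0, 0/1, 1/1, 0/2, 1/2, 2/2. Keeping alleles 0 and 2 selects the values at original indices 0, 3 and 5; simply truncating the array would pair the wrong likelihoods with the remaining genotypes.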

HipSTR v0.5

05 Aug 21:14

The latest version of HipSTR contains a handful of minor bug fixes but mainly focuses on adding functionality that makes the tool more robust and easier to run. Here's a summary of the changes in the latest release:

  • Added an FS FORMAT field to the VCF. This field can be used to detect genotyping errors when strand bias is unusually high
  • Added --quiet and --silent command line options that can be used to control the level of detail in the log
  • Rigorously check that all input files (STR region BED, BAM/CRAMs and SNP VCFs) are consistent in terms of contig names
  • Resolved a handful of Python 3 incompatibilities in the VizAln and VizAlnPDF scripts, which now work with both Python 2 and Python 3
  • All contigs in the input FASTA file are now written to the VCF header. This prevents downstream errors when validating HipSTR's VCFs using Picard/GATK
  • Added command line options --max-hap-flanks and --min-flank-freq to control the candidate haplotype sequences that flank the STR that are considered during genotyping
  • Automatically filter low-frequency flanking haplotype sequences (freq <1%) to improve runtime and reduce memory usage
  • Added a --output-filters command line option that adds a FILTER FORMAT field to the VCF. For samples with missing genotypes, it describes why the sample was skipped, while for other samples it merely contains PASS
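
The low-frequency filter described above can be pictured with a minimal sketch like the following. It assumes each read has already been assigned a flanking haplotype sequence; the function name and the `min_freq` parameter are hypothetical and do not reflect HipSTR's internal code.

```python
from collections import Counter


def filter_low_frequency(haplotypes, min_freq=0.01):
    """Keep candidate flanking sequences observed at or above min_freq.

    haplotypes -- one flanking sequence per supporting read
    min_freq   -- hypothetical frequency threshold (1% by default)
    """
    counts = Counter(haplotypes)
    total = sum(counts.values())
    if total == 0:
        return []
    # Retain each distinct sequence whose share of reads meets the threshold.
    return [seq for seq, n in counts.items() if n / total >= min_freq]
```

Fewer candidate sequences means fewer haplotypes to align each read against, which is where the runtime and memory savings would come from.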

HipSTR v0.4

24 Apr 11:32

A lot has changed since the last official release of HipSTR (v0.2). We've extensively improved HipSTR's genotyping accuracy, simplified the tool's usage and added new features. Here's a short synopsis of some of what's changed:

  • We removed the PhasedBEAGLE component of the tool, as it's no longer relevant
  • We removed all dependencies on bamtools and vcflib to simplify compilation
  • HipSTR now uses htslib and wrapper files to read BAM and VCF files
  • HipSTR now supports alignments in CRAM format in addition to BAM format
  • The --use-all-reads option is deprecated, as HipSTR now always uses all reads to boost accuracy
  • A new FORMAT field AB is output to the VCF. This field can be used to filter genotypes with highly biased read counts that are likely genotyping errors
  • The --min-mapq, --len-genotyper, --hide-allreads, --hide-mallreads, --output-pallreads and --no-pool-seqs options are no longer available
  • de Bruijn graphs are used to assemble sequences flanking the STR region. These sequences are incorporated into the genotyping process, resulting in improved genotyping accuracy
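
For readers unfamiliar with de Bruijn graphs, here is a minimal, generic sketch of how one can be built from read sequences. It only illustrates the data structure; the function name and the choice of k are assumptions, and HipSTR's actual assembly of flanking sequences is more involved.

```python
from collections import Counter, defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph from reads (illustrative sketch only).

    Nodes are (k-1)-mers; each k-mer in a read adds an edge from its
    prefix to its suffix. Edge weights count how often the k-mer occurs.
    """
    graph = defaultdict(Counter)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]][kmer[1:]] += 1
    return graph
```

Paths through such a graph that are well supported by edge weights correspond to candidate sequences, which is what makes the structure useful for discovering variation in the regions flanking an STR.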

HipSTR v0.2

29 Mar 04:56

Fixed a wide range of minor bugs and substantially improved genotyping accuracy

HipSTR v0.1

17 Oct 05:26

I'm very excited to announce the first official release of HipSTR, a haplotype-based caller for short tandem repeats. Over the past year, I've worked extensively on this tool and feel that it's finally ready for widespread use. HipSTR is a substantial improvement over existing STR genotypers as it explicitly learns stutter models for each STR locus and utilizes a customized hidden Markov model to align reads while accounting for these artifacts. The result is a tool with unprecedented speed and accuracy for genotyping STRs. As this is the first release, we expect that there will be a handful of bugs and would appreciate it if you could report these as issues in the GitHub repo.

Thanks for using HipSTR!!

P.S. GitHub currently has a bug in which submodule code is not included in releases. As a result, the compressed packages below won't compile correctly. Please use either the precompiled binaries or follow the install instructions on the HipSTR main page.

Thomas