Releases: huishenlab/biscuit

Version 1.6.0

16 Dec 17:17

Beginning with Version 1.4.0, BISCUIT no longer needs submodules. Therefore, the GitHub-generated Source code (zip) and Source code (tar.gz) links may contain all necessary files. If preferred, release-source.zip is guaranteed to contain all BISCUIT source files.

Important Note: This version is not backwards compatible with BISCUIT Version 0.3.16 and earlier.

Date Created:

  • 16 December 2024

Changes affecting specific subcommands/scripts:

  • bsconv: Added -x option for maximum CpY retention in a read. PRs #53 and #54 from njspix.
  • vcf2bed: Changed default for the -t option from 3 to 1.
    • The original default (3) was set before the explosion of single-cell protocols. With more single-cell data being processed, the default has been loosened to 1 to catch all locations covered by at least 1 read.
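
To illustrate the effect of the new -t default, here is a minimal sketch that applies a minimum-coverage cutoff to a few made-up sites. The Site record and the filter_by_coverage helper are hypothetical and only mimic the idea of a coverage threshold; they are not BISCUIT's implementation or its output format.

```python
from typing import NamedTuple


class Site(NamedTuple):
    """Hypothetical per-site record (not BISCUIT's actual output layout)."""
    chrom: str
    pos: int
    coverage: int


def filter_by_coverage(sites, min_cov=1):
    """Keep only sites covered by at least min_cov reads."""
    return [s for s in sites if s.coverage >= min_cov]


# Made-up sites with sparse coverage, as is common in single-cell data.
sites = [Site("chr1", 100, 1), Site("chr1", 200, 2), Site("chr1", 300, 5)]

# A cutoff of 3 (the old default) keeps one site; a cutoff of 1 keeps all three.
kept_old = filter_by_coverage(sites, min_cov=3)
kept_new = filter_by_coverage(sites, min_cov=1)
print(len(kept_old), len(kept_new))
```

With sparse data, a cutoff of 3 discards most sites, which is the motivation given above for loosening the default.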

Version 1.5.0

06 May 15:24

Beginning with Version 1.4.0, BISCUIT no longer needs submodules. Therefore, the GitHub-generated Source code (zip) and Source code (tar.gz) links may contain all necessary files. If preferred, release-source.zip is guaranteed to contain all BISCUIT source files.

Important Note: This version is not backwards compatible with BISCUIT Version 0.3.16 and earlier.

Date Created:

  • 6 May 2024

General changes:

  • Add citation to README.

Changes affecting specific subcommands/scripts:

  • Added extended CIGAR operations (= and X) for pileup and epiread.
  • epiread:
    • Added ability to work with modification SAM tags (MM/ML) with the -M flag and the -y FLOAT option, which adjusts the minimum probability that a modification is correct.
    • Updated the -L option from using hard-coded read lengths to now accept an integer input (-L INT) specifying the maximum read length.
    • Added error messages when reads longer than the specified max length are found.
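
As a rough illustration of a minimum-probability cutoff on modification calls, the sketch below converts ML tag values to probabilities and filters them. In the SAM specification, an ML value N (0 to 255) represents the probability range N/256 to (N+1)/256. Using the midpoint of that range, and the helper names ml_to_probability and passes_threshold, are assumptions made for this example; how biscuit epiread compares ML values against -y is not described in these notes.

```python
def ml_to_probability(ml_value: int) -> float:
    """Map an ML tag value (0-255) to the midpoint of its probability bin.

    Assumption: the midpoint is used; the SAM specification only defines
    the bin as N/256 to (N + 1)/256.
    """
    if not 0 <= ml_value <= 255:
        raise ValueError(f"ML value out of range: {ml_value}")
    return (ml_value + 0.5) / 256.0


def passes_threshold(ml_value: int, min_prob: float) -> bool:
    """Return True if the modification call meets the minimum probability."""
    return ml_to_probability(ml_value) >= min_prob


# Made-up ML values for four modification calls on one read.
ml_values = [0, 128, 240, 255]
kept = [v for v in ml_values if passes_threshold(v, min_prob=0.9)]
print(kept)
```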

Bug Fixes:

  • Issues #49 and #51 fixed (biscuit epiread wouldn't accept gzipped or bgzipped SNP BED files). Reported by njspix and fei0810.
  • Issue #50 fixed (NM tag was being incorrectly set for C>T conversions on OB/CTOB reads). Reported by njspix.
  • Add warning message and skip processing read when a negative start position is found when creating epiBED files.

Version 1.4.0

08 Jan 18:19

Beginning with Version 1.4.0, BISCUIT no longer needs submodules. Therefore, the GitHub-generated Source code (zip) and Source code (tar.gz) links may contain all necessary files. If preferred, release-source.zip is guaranteed to contain all BISCUIT source files.

Important Note: This version is not backwards compatible with BISCUIT Version 0.3.16 and earlier.

Date Created:

  • 8 January 2024

General changes:

  • Moved to CMake-based build system
    • CMake (min. version 3.21) must be installed to build BISCUIT
    • See documentation for more details about building with CMake
  • Removed submodule dependencies
    • Libraries that previously were submodules are now downloaded and built via CMake build process
    • --recursive option no longer necessary for cloning BISCUIT
  • klib has been removed as a full dependency
    • Specific klib utilities necessary for build are available either in lib/aln or htslib
  • Updated to htslib version 1.18
  • Release process has now been partially automated

New subcommand:

  • help: Same as just running biscuit, but allows for a clear entry point to BISCUIT's usage

Changes affecting specific subcommands/scripts:

  • bc:
    • Cell barcode and artificial UMI now output to read name (name_barcode_umi) for UMI-tools compatibility
    • Cell barcode no longer output to read comment
  • align:
    • memchain.c B-Tree traversal method updated
    • -9 option extracts both cell barcode and UMI from read name (previously only the cell barcode from the read comment) for UMI-tools compatibility
  • pileup (and others):
    • Decoupled pileup source code from other subcommands

Bug Fixes:

  • Fixed memory leak in biscuit epiread
  • Corrected version info in biscuit version
  • Check if output SAM file can be written to in biscuit bsconv
  • Typo in --keep-tmp-files option of QC.sh. Fixed by kew24 (PR #46).

Version 1.3.0

27 Oct 14:27

Source code (zip) and Source code (tar.gz) are generated by GitHub and may not contain all required submodules. Use these at your own risk! It is suggested to use the release-source.zip if downloading the source code.

Important Note: This version is not backwards compatible with BISCUIT Version 0.3.16 and earlier.

Date Created:

  • 27 October 2023

General changes:

  • Clean up Makefile

New subcommand:

  • bc: Extracts cell barcode from reads. Useful for single-cell experiments that utilize cell barcodes.
    • For usage, run biscuit bc -h

Changes affecting specific subcommands/scripts:

  • align:
    • FASTA/FASTQ read comments are now appended to read name when retained (biscuit align -C)
    • Add -9 option for extracting barcodes from read comment and placing in CR SAM tag.
  • vcf2bed:
    • Add -c option to create Bismark .cov-like format (i.e., Beta-M-U columns)
  • mergecg:
    • Add -c option to create Bismark .cov-like format (i.e., Beta-M-U columns)
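
To make the Beta-M-U columns concrete, the sketch below builds one tab-separated row from methylated (M) and unmethylated (U) read counts. The cov_like_row helper, the leading chromosome/start/end columns, and reporting beta as a fraction with three decimals are assumptions for illustration (Bismark's own .cov files report a percentage); check the BISCUIT documentation for the exact output.

```python
def cov_like_row(chrom: str, start: int, end: int, n_meth: int, n_unmeth: int) -> str:
    """Build a tab-separated row ending in Beta, M, and U columns.

    Beta is the methylated fraction M / (M + U). The coordinate columns
    and the number formatting are assumptions for this sketch.
    """
    total = n_meth + n_unmeth
    if total == 0:
        raise ValueError("site has no coverage, beta is undefined")
    beta = n_meth / total
    return "\t".join(
        [chrom, str(start), str(end), f"{beta:.3f}", str(n_meth), str(n_unmeth)]
    )


# Made-up site with 3 methylated and 1 unmethylated reads.
row = cov_like_row("chr1", 100, 101, n_meth=3, n_unmeth=1)
print(row)
```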

Bug Fixes:

  • Issue #38 fixed (unhelpful error message when FASTQ cannot be found). Reported by KChadwick78.
  • Issue #41 fixed (SAM header not output to stdout). Reported by njspix.
  • cinread workaround for long reads. Now only looks at the first 300 bases.
  • Fix divide by zero error in QC.sh.
  • Fix issue in mergecg when CG is at the end of the chromosome.
  • Resolve segfault when passing empty SNP BED file to epiread.
  • Fix sign error when calculating insert size distribution bounds.
  • Fix incorrect formatting of bsstrand table output.

Version 1.2.1

01 Jun 15:10

Source code (zip) and Source code (tar.gz) are generated by GitHub and may not contain all required submodules. Use these at your own risk! It is suggested to use the release-source.zip if downloading the source code.

Important Note: This version is not backwards compatible with BISCUIT Version 0.3.16 and earlier.

Date Created:

  • 1 June 2023

General changes:

  • GitHub workflows set up.
  • Travis CI removed.

Changes affecting specific subcommands/scripts:

  • QC.sh:
    • Remove GNU parallel dependency (courtesy of Nick Semenkovich, PR #33).
    • Set LC_ALL=C globally (courtesy of Nick Semenkovich, PR #34)
  • epiread:
    • Option to include secondary reads in processing was hidden.

Bug Fixes:

  • Issue #37 fixed (incorrect MD tag). Reported by Nick Semenkovich.

Version 1.2.0

30 Jan 17:09

Source code (zip) and Source code (tar.gz) are generated by GitHub and may not contain all required submodules. Use these at your own risk! It is suggested to use the release-source.zip if downloading the source code.

Important Note: This version is not backwards compatible with BISCUIT Version 0.3.16 and earlier.

Date Created:

  • 30 January 2023

General changes:

  • Copyright years updated.

Changes affecting specific subcommands/scripts:

  • align:
    • MC (mate CIGAR) and MQ (mate quality) SAM auxiliary tags added to alignment output
  • pileup:
    • Read 1 and Read 2 overlap is now determined by the MC tag. If MC tag is missing, read lengths are assumed to be the same.
  • epiread:
    • Read 1 and Read 2 overlap is now determined by the MC tag. If MC tag is missing, read lengths are assumed to be the same.
    • Print statement for filtered epiBED lines (only F/x/P entries) is now behind the verbose (-v) flag.
    • epiBED and pileup output now match when using an unfiltered BISCUIT SNP BED as input (-B option) to biscuit epiread.
      • BISCUIT SNP BED file must be used in order to account for when methylation could be called in the case of an ambiguous C>T or G>A SNP. If not used, methylation levels might not match up between pileup and epiread.
      • Important note: The epiBED file required an overhaul to align the pileup and epiread output:
        • There are now always nine columns: 1) chromosome, 2) start, 3) end, 4) read name, 5) read number, 6) bisulfite strand, 7) CpG methylation RLE string, 8) GpC methylation RLE string ("." if BS-seq), and 9) short structural variants (SNPs/indels) RLE string.
        • SNPs/indels follow the same conventions as before in the variant RLE string. For the CpG and GpC RLE strings, "I" is used for insertions, "d" for deletions, and "x" for SNPs with no methylation status.
        • All methylation now occurs in the position it is found in the reference. Previously, methylation always was situated relative to the C position in the CG. Now, methylation will be noted at the C position for OT/CTOT (bisulfite strand +) and the G position for OB/CTOB (bisulfite strand -). This lines up with how methylation is accounted for in pileup.
      • Special thanks to the Brendel group at Indiana University for bringing up the difference between the epiBED results and pileup and providing additional testing.
    • Results for the old epiread format (-O) and the pairwise format (-P) remain the same as before. Therefore, results matching pileup are not guaranteed. These formats remain for legacy analyses and it is suggested that analyses move to the epiBED format.
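
As a reading aid for the nine-column epiBED layout listed above, here is a minimal parsing sketch. The EpiBedRecord class and parse_epibed_line helper are hypothetical, the file is assumed to be tab-separated, and the example line (including its RLE strings) is made up rather than real BISCUIT output.

```python
from dataclasses import dataclass


@dataclass
class EpiBedRecord:
    """One epiBED line, with fields in the column order listed above."""
    chrom: str
    start: int
    end: int
    read_name: str
    read_number: str
    bs_strand: str
    cpg_rle: str
    gpc_rle: str       # "." for BS-seq data
    variant_rle: str


def parse_epibed_line(line: str) -> EpiBedRecord:
    """Parse one tab-separated epiBED line into an EpiBedRecord."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, found {len(fields)}")
    return EpiBedRecord(fields[0], int(fields[1]), int(fields[2]), *fields[3:])


# Made-up line; the RLE strings are placeholders, not real output.
line = "chr1\t100\t105\tread_a\t1\t+\tF2xF2\t.\tF5\n"
rec = parse_epibed_line(line)
print(rec.chrom, rec.start, rec.end, rec.bs_strand, rec.gpc_rle)
```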

Bug Fixes:

  • Issue #26 fixed (incorrect number of degrees of freedom in biscuit asm). Reported by Volker Brendel.
  • Issue #28 fixed (segfault when G in a CG is first base of a OB/CTOB read pair). Reported by Volker Brendel.
  • Issue #32 fixed (incorrect number of threads requested for GNU parallel). Reported by Nick Semenkovich. Fixed by Nick Semenkovich.

Version 1.1.0

07 Jul 16:04

Source code (zip) and Source code (tar.gz) are generated by GitHub and may not contain all required submodules. Use these at your own risk! It is suggested to use the release-source.zip if downloading the source code.

Important Note: This version is not backwards compatible with BISCUIT Version 0.3.16 and earlier.

Date Created:

  • 7 July 2022

General changes:

  • Copyright years updated.
  • Implicit fallthrough warnings fixed for more recent versions of GCC.
  • Makefile has no explicit calls to gcc. All calls happen through $CC.

Changes affecting specific subcommands/scripts:

  • epiread:
    • Gzipped BED files are now able to be used as inputs to -B.
    • Empty reads (those with only F, x, and/or P in them) are removed by default. Use the -E option to keep these reads in the output file.
    • Add long read functionality (-L option).
    • Add "-" in old epiread format for C's that are filtered to maintain alignment with C's occurring in the reference.
  • asm:
    • Fisher's exact and chi-squared p-values are now output in scientific notation, rather than as decimals.

Bug Fixes:

  • Issue #19 fixed (memory leak in -p option of biscuit align).
  • Issue #21 fixed (implicit reliance on alphabetical ordering of SQ tags in BAM).
  • Issue #22 fixed (incorrect NOMe output in pileup).
  • Fix reading Bismark BS strand flags from BAM.

Version 1.0.2

13 Jan 21:19

Source code (zip) and Source code (tar.gz) are generated by GitHub and may not contain all required submodules. Use these at your own risk! It is suggested to use the release-source.zip if downloading the source code.

Important Note: This version is not backwards compatible with BISCUIT Version 0.3.16 and earlier.

Date Created:

  • 13 January 2022

Changes affecting specific subcommands/scripts:

  • epiread:
    • The end column in the epiBED file is now adjusted for insertions occurring in the RLE string. This allows the RLE string to properly line up with the aligned location in the reference genome.
    • 1's have been removed from the RLE strings (i.e., A1B2C1 is now AB2C). While this could have been done initially, it was easier downstream to leave the 1's in when reconstructing the original string from the RLE string. The downstream code is now able to handle the 1's not being there, so they can now be removed from biscuit.
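
For downstream code that reconstructs the original string, a minimal sketch of decoding both the old (A1B2C1) and new (AB2C) forms could look like this. The symbol-then-count layout is inferred from the example above, and the rle_decode and rle_encode helpers are hypothetical; they are not the downstream code referred to in this note.

```python
import re
from itertools import groupby


def rle_decode(rle: str) -> str:
    """Expand an RLE string; a symbol with no count is repeated once."""
    if not re.fullmatch(r"(?:\D\d*)*", rle):
        raise ValueError(f"malformed RLE string: {rle!r}")
    out = []
    for match in re.finditer(r"(\D)(\d*)", rle):
        symbol, count = match.group(1), match.group(2)
        out.append(symbol * (int(count) if count else 1))
    return "".join(out)


def rle_encode(text: str) -> str:
    """Encode runs of identical symbols, omitting counts of 1."""
    pieces = []
    for symbol, run in groupby(text):
        n = len(list(run))
        pieces.append(symbol if n == 1 else f"{symbol}{n}")
    return "".join(pieces)


# Both the old and the new form expand to the same string.
print(rle_decode("A1B2C1"), rle_decode("AB2C"), rle_encode("ABBC"))
```

Accepting an optional count lets the same decoder read files written before and after this change.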

Bug Fixes:

  • The asset creator is now able to handle gzipped references. This had been available via download in version 1.0.1, but is now accessible in the scripts/ directory of the source zip file.
  • The number of deletions was not being properly accounted for previously in biscuit epiread.

Version 1.0.1

18 Oct 18:47

Source code (zip) and Source code (tar.gz) are generated by GitHub and may not contain all required submodules. Use these at your own risk! It is suggested to use the release-source.zip if downloading the source code.

Important Note: This version is not backwards compatible with BISCUIT Version 0.3.16 and earlier.

Date Created:

  • 18 October 2021

Bug Fixes:

  • Fixed bug where retention/conversion counts for read-averaged cytosine conversion could overflow when running biscuit qc on large BAMs
  • Fixed bug in biscuit qc where the sample name could be too long, causing a segfault

Note, as of 5 November 2021, the build_biscuit_QC_assets.pl script was updated to fix a bug where it wouldn't work with gzipped FASTA files.

Version 1.0.0

17 Sep 17:52

Source code (zip) and Source code (tar.gz) are generated by GitHub and may not contain all required submodules. Use these at your own risk! It is suggested to use the release-source.zip if downloading the source code.

Important Note: This version is not backwards compatible with BISCUIT Version 0.3.16 and earlier.

Date Created:

  • 17 September 2021

General Changes:

  • Contact person for BISCUIT updated
  • biscuit markdup removed from API
  • Unnecessary files removed from code base
  • All subcommands now have a help option (-h)
  • General consistency applied for help output
  • Small changes to wording of option descriptions (default values included if not obvious)
  • Error messages added for missing command line arguments
  • Flag/option characters changed for consistency across subcommands (see next section for specific changes)
  • Perl script added to generate QC asset files from a reference FASTA file (build_biscuit_QC_assets.pl)
    • For help in running: perl build_biscuit_QC_assets.pl -h
  • Bash script added to flip PBAT reads in silico which makes viewing PBAT data in IGV easier (flip_pbat_reads.sh)
    • For help in running: bash flip_pbat_reads.sh -h

New Subcommands:

  • qc: Generate QC files from input BAM (does not include coverage of base-averaged methylation, use QC.sh for full QC)
  • version: Prints BISCUIT version

Changes affecting specific subcommands/scripts:

  • asm:
    • Include check to see if input is in the correct format
  • bsconv:
    • Flag change: -b -> -p for printing output in TSV format
    • Split into header and implementation files
    • CpY filter added (functionality is similar to CpH filter, but restricted to CpC and CpT)
  • bsstrand:
    • Split into header and implementation files
  • cinread:
    • Split into header and implementation files
  • epiread:
    • Flag change: -q -> -@ for number of threads to use
    • Filters added to bring pileup and epiread filters into alignment
      • Includes filters for alignment score (-a), minimum base quality (-b), minimum distance to 5' and 3' ends of read (-5 and -3), and double counting overlapping bases (double counting avoided by default, -d allows double counting)
      • Default values are the same for the filters in both subcommands
    • Updated default to the epiBED format (see the documentation site for more details on the epiBED format)
    • Can output the old epiread format using the -O option
      • Can optionally print all CpG/GpC/SNP locations using the -A option in conjunction with -O (note, the output when running with -A is not compatible with rectangle)
  • mergecg:
    • Flag change: -n -> -N for NOMe-seq mode
  • tview:
    • Flag change: -f was previously not being read; it is now read
  • pileup:
    • Flag change: -q -> -@ for number of threads
    • Flag change: added in -s flag for step of window dispatching
    • Flag change: -e flag separated into two flags, -5 and -3
      • -5 is the minimum distance from the 5' end of the read (default: 3)
      • -3 is the minimum distance from the 3' end of the read (default: 3)
  • align:
    • Flag change: -t -> -@ for number of threads to use
    • Flag change: -h -> -g for max number of hits output in XA tag
    • Flag change: -h now used for help
  • QC.sh:
    • Updated script to reflect changes made to flag/option characters
    • Updated script to include the qc subcommand
    • Added the [-n,--no-cov-qc] option to skip generating the coverage QC files
      • Greatly decreases the runtime when running with this option (especially for large datasets)

Bug Fixes:

  • Fixed incorrect rounding of methylation fractions (Issue #7)
  • Fixed bug where the minimum distance to end of read filter in pileup and epiread filtered one less base from the 5' end than requested (i.e., INT-1 bases were filtered for -e INT or -5 INT)
  • CX INFO header line in pileup now prints correct CX tags when running in NOMe-seq mode
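
To illustrate the end-of-read filter whose off-by-one bug is fixed above, the sketch below marks which read positions survive a minimum distance of 3 from each end. The passes_end_filter helper is hypothetical, positions are assumed to be 0-based offsets from the 5' end of the read, and the sketch assumes the intended behaviour is that exactly INT bases are filtered at each end; it is not BISCUIT's code.

```python
def passes_end_filter(pos_in_read: int, read_len: int,
                      min_dist_5p: int = 3, min_dist_3p: int = 3) -> bool:
    """Return True if a base is far enough from both ends of the read.

    pos_in_read is a 0-based offset from the 5' end. Exactly min_dist_5p
    bases are rejected at the 5' end and min_dist_3p at the 3' end.
    """
    far_from_5p = pos_in_read >= min_dist_5p
    far_from_3p = pos_in_read < read_len - min_dist_3p
    return far_from_5p and far_from_3p


# For a made-up 10-base read, three bases are dropped at each end.
kept = [i for i in range(10) if passes_end_filter(i, read_len=10)]
print(kept)
```

Writing the 5' test as pos_in_read >= min_dist_5p - 1 instead would filter only INT-1 bases, which is the kind of off-by-one error described in the bug fix.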