
A problem of version 0.12.2 #2

Closed

zengxi-hada opened this issue Aug 25, 2024 · 4 comments

Comments

@zengxi-hada

zengxi-hada commented Aug 25, 2024

I am running sawfish v0.12.2 on this BAM file https://s3-us-west-2.amazonaws.com/human-pangenomics/working/HPRC_PLUS/HG002/analysis/aligned_reads/hifi/GRCh38/HG002_aligned_GRCh38_winnowmap.sorted.bam using the command line:
sawfish discover --threads 16 --ref Homo_sapiens_assembly38.fasta --bam HG002_aligned_GRCh38_winnowmap.sorted.bam --output-dir HG002_discover_dir

The following error happens:
thread '<unnamed>' panicked at src/refine_sv/trimmed_reads.rs:197:5: assertion failed: !is_hard_clipped(&record.cigar())
Below is the full running log:

[2024-08-25][09:31:28][sawfish][INFO] Starting sawfish
[2024-08-25][09:31:28][sawfish][INFO] cmdline: /home/xiz978/bin/BCH/sawfish-v0.12.2-x86_64-unknown-linux-gnu/bin/sawfish discover --threads 16 --ref /n/data1/bch/genetics/lee/reference/hg38/Homo_sapiens_assembly38.fasta --bam /n/storage_test/xiz978/Wangziying/HG002/hifi/HG002_aligned_GRCh38_winnowmap.sorted.bam --output-dir HG002_discover_dir
[2024-08-25][09:31:28][sawfish][INFO] Running on 16 threads
[2024-08-25][09:31:29][sawfish][INFO] Writing discover settings to file: 'HG002_discover_dir/discover.settings.json'
[2024-08-25][09:31:29][sawfish][INFO] Reading reference genome from file '/n/data1/bch/genetics/lee/reference/hg38/Homo_sapiens_assembly38.fasta'
[2024-08-25][09:31:58][sawfish][INFO] Processing alignment file '/n/storage_test/xiz978/Wangziying/HG002/hifi/HG002_aligned_GRCh38_winnowmap.sorted.bam'
[2024-08-25][09:36:58][sawfish][INFO] Processed alignments on 1,181,630 of 3,099,922 ref genome kb (38%)
[2024-08-25][09:42:05][sawfish][INFO] Processed alignments on 2,595,202 of 3,099,922 ref genome kb (83%)
[2024-08-25][09:43:24][sawfish][INFO] Finished processing all alignments
[2024-08-25][09:43:24][sawfish][INFO] Writing max sv depth genome regions to bed file: 'HG002_discover_dir/max.depth.bed'
[2024-08-25][09:43:24][sawfish][INFO] Getting depth bin GC content
[2024-08-25][09:43:27][sawfish][INFO] Getting GC depth correction levels
[2024-08-25][09:43:27][sawfish][INFO] Completing breakpoint observation clustering
[2024-08-25][09:43:28][sawfish][INFO] Finished breakpoint observation clustering
[2024-08-25][09:43:28][sawfish][INFO] Consolidating adjacent indel candidates
[2024-08-25][09:43:28][sawfish][INFO] Finding large insertion candidates
[2024-08-25][09:43:28][sawfish][INFO] Writing breakpoint_cluster debug bed file: 'HG002_discover_dir/debug.breakpoint_clusters.bed'
[2024-08-25][09:43:28][sawfish][INFO] Refining SV Candidates
[2024-08-25][09:43:28][sawfish][INFO] Writing debug assembly region bed file: 'HG002_discover_dir/assembly.regions.bed'
[2024-08-25][09:43:28][sawfish][INFO] Starting refinement of breakpoint evidence clusters
thread '<unnamed>' panicked at src/refine_sv/trimmed_reads.rs:197:5:
assertion failed: !is_hard_clipped(&record.cigar())
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
/var/spool/slurmd/job45032830/slurm_script: line 9: 17052 Aborted                 /home/xiz978/bin/BCH/sawfish-v0.12.2-x86_64-unknown-linux-gnu/bin/sawfish discover --threads 16 --ref /n/data1/bch/genetics/lee/reference/hg38/Homo_sapiens_assembly38.fasta --bam /n/storage_test/xiz978/Wangziying/HG002/hifi/HG002_aligned_GRCh38_winnowmap.sorted.bam --output-dir HG002_discover_dir
[2024-08-25][09:43:41][sawfish][INFO] Starting sawfish
[2024-08-25][09:43:41][sawfish][INFO] cmdline: /home/xiz978/bin/BCH/sawfish-v0.12.2-x86_64-unknown-linux-gnu/bin/sawfish joint-call --threads 16 --sample HG002_discover_dir --output-dir HG002_joint_call_dir
[2024-08-25][09:43:41][sawfish][INFO] Running on 16 threads
[2024-08-25][09:43:41][sawfish][INFO] Reading reference genome from file '/n/data1/bch/genetics/lee/reference/hg38/Homo_sapiens_assembly38.fasta'
[2024-08-25][09:44:10][sawfish][INFO] Reading all sample discovery input
thread '<unnamed>' panicked at /var/lib/bamboo/.cargo/registry/src/index.crates.io-6f17d22bba15001f/unwrap-1.2.1/src/lib.rs:55:25:

I am wondering whether a newer version could bypass this issue.

Thank you.
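
One way to check whether a BAM contains the hard-clipped alignment records that this assertion flags, assuming samtools is available, is to scan the CIGAR strings (column 6) for H operations, for example:

samtools view HG002_aligned_GRCh38_winnowmap.sorted.bam | awk '$6 ~ /H/ {n++} END {print n+0 " records with hard clips"}'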

@ctsa

ctsa commented Aug 26, 2024

Thanks, I'll refer you to the answer for this error on the previous issue:

#1 (comment)

...I will leave this issue open to track an improved error message update for you.

@jamesc99

Hi,

After running sawfish v0.12.2, the previous error from #1 is fixed. However, I got another error during the 'discover' step.

[2024-08-25][11:24:45][sawfish][INFO] Starting sawfish
[2024-08-25][11:24:45][sawfish][INFO] cmdline: sawfish discover --bam /users/u250191/ryan_scratch_ln/benchmark_inv/rawdata/hg002/pacbio/minimap2/sorted_PacBio_HG002_m54238_180916_191625.Q20.fastq.bam --ref /users/u250191/ryan_scratch_ln/reference/human-grch38.fasta
[2024-08-25][11:24:45][sawfish][INFO] Running on 8 threads
[2024-08-25][11:24:45][sawfish][INFO] Writing discover settings to file: 'sawfish_discover_output/discover.settings.json'
[2024-08-25][11:24:45][sawfish][INFO] Reading reference genome from file '/users/u250191/ryan_scratch_ln/reference/human-grch38.fasta'
[2024-08-25][11:25:21][sawfish][INFO] Processing alignment file '/users/u250191/ryan_scratch_ln/benchmark_inv/rawdata/hg002/pacbio/minimap2/sorted_PacBio_HG002_m54238_180916_191625.Q20.fastq.bam'
[2024-08-25][11:30:21][sawfish][INFO] Processed alignments on 3,037,641 of 3,088,286 ref genome kb (98%)
[2024-08-25][11:30:25][sawfish][INFO] Finished processing all alignments
[2024-08-25][11:30:25][sawfish][INFO] Writing max sv depth genome regions to bed file: 'sawfish_discover_output/max.depth.bed'
[2024-08-25][11:30:25][sawfish][INFO] Getting depth bin GC content
[2024-08-25][11:30:29][sawfish][INFO] Getting GC depth correction levels
[2024-08-25][11:30:29][sawfish][INFO] Completing breakpoint observation clustering
[2024-08-25][11:30:30][sawfish][INFO] Finished breakpoint observation clustering
[2024-08-25][11:30:30][sawfish][INFO] Consolidating adjacent indel candidates
[2024-08-25][11:30:30][sawfish][INFO] Finding large insertion candidates
[2024-08-25][11:30:30][sawfish][INFO] Writing breakpoint_cluster debug bed file: 'sawfish_discover_output/debug.breakpoint_clusters.bed'
[2024-08-25][11:30:31][sawfish][INFO] Refining SV Candidates
[2024-08-25][11:30:31][sawfish][INFO] Writing debug assembly region bed file: 'sawfish_discover_output/assembly.regions.bed'
[2024-08-25][11:30:31][sawfish][INFO] Starting refinement of breakpoint evidence clusters 
/var/spool/slurm/d/job540156/slurm_script: line 34: 2827442 Illegal instruction     (core dumped) sawfish discover --bam $input_bam --ref $ref38

All BAM files were aligned with minimap2 using the command: $minimap -Y -ax map-hifi -R "@RG\tSM:${INFOR}\tID:${RG_ID}" $ref38 $FASTQ_FILE --MD -t 8. Some of them worked, while others failed with the error above.

My script to run sawfish is attached below:

#!/bin/bash
#SBATCH --job-name=sawfish
#SBATCH --output=%x_%A_%a.out
#SBATCH --error=%x_%A_%a.err
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=64gb
#SBATCH --time=72:00:00
#SBATCH --partition=medium
#SBATCH -A proj-fs0002

# Load necessary modules or source conda
source /users/u250191/.bashrc
mamba activate sawfish

# Set reference and work directory
ref38="/users/u250191/ryan_scratch_ln/reference/human-grch38.fasta"
WORK_DIR=$(pwd)

# Set input parameters
input_bam=$1

# Ensure all parameters are provided
if [ -z "$input_bam" ] ; then
    echo "Usage: sbatch script.sh <input_bam>"
    exit 1
fi

# Step 1: sawfish discover
sawfish discover --bam "$input_bam" --ref "$ref38"

# Step 2: sawfish joint-call
sawfish joint-call --sample sawfish_discover_output

I initially thought this was a memory issue, so I increased the memory to 144 GB for a 30x coverage BAM file, but sawfish failed again. Since you mentioned that memory per core should be less than 8 GB, I may need instructions from your side on how to configure this.
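
For illustration, a sketch of how the SLURM allocation and the sawfish thread count could be kept in line with an 8 GB per-core budget; the specific values here are assumptions for the example, not a recommendation from this thread:

#SBATCH --cpus-per-task=16
#SBATCH --mem-per-cpu=8gb   # 16 cores x 8 GB per core = 128 GB total

sawfish discover --threads "$SLURM_CPUS_PER_TASK" --bam "$input_bam" --ref "$ref38"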

Thanks again for your time,
Best!

@ctsa

ctsa commented Aug 27, 2024

Copied this report into a new issue #3

@ctsa

ctsa commented Sep 3, 2024

Related to the original issue on this thread, the v0.12.3 release changes this error from a panic to a (non-panic) message explaining that hard-clipped input is not currently accepted.
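
If the input BAM was produced by an aligner that hard-clips supplementary alignments, one workaround sketch (assuming minimap2 and samtools are available; the read and output file names below are placeholders) is to realign with minimap2's -Y option, which keeps supplementary alignments soft-clipped:

minimap2 -Y -ax map-hifi Homo_sapiens_assembly38.fasta reads.fastq.gz | samtools sort -@ 8 -o realigned.bam -
samtools index realigned.bam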

ctsa closed this as completed Sep 3, 2024