Adjust input length on nextclade runs
Only run nextclade on sequences at least 1400 nt long, approximately the
length of the dengue E gene. This avoids misclassifying short sequences.
j23414 committed Jan 2, 2024
1 parent 90ad5a9 commit 5205fd0
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions ingest/workflow/snakemake_rules/nextclade.smk
@@ -8,13 +8,16 @@ rule nextclade_all:
     output:
         "data/nextclade_results/nextclade_all.tsv",
     threads: 4
+    params:
+        min_length=1400,  # approximately E gene length
     shell:
         """
         nextclade run \
             --input-dataset {input.dataset} \
             -j {threads} \
             --output-tsv {output} \
             --min-match-rate 0.01 \
+            --min-length {params.min_length} \
             --silent \
             {input.sequences}
         """
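
To preview which records the new threshold would affect before running the workflow, a rough sketch along these lines may help. It is not part of the commit: `read_fasta` and `split_by_length` are hypothetical helper names, and the sketch assumes that a sequence counts as too short when its raw character count is below `min_length`, which may differ in detail from how nextclade measures length.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

MIN_LENGTH = 1400  # mirrors params.min_length in the rule above


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # Use the first whitespace-delimited token as the record name.
            name = header[0] if header else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_by_length(
    records: Iterable[Tuple[str, str]], min_length: int = MIN_LENGTH
) -> Dict[str, List[str]]:
    """Partition record names by whether they meet the length threshold."""
    result: Dict[str, List[str]] = {"kept": [], "too_short": []}
    for name, seq in records:
        # Assumption: length is the raw character count of the sequence.
        key = "kept" if len(seq) >= min_length else "too_short"
        result[key].append(name)
    return result


if __name__ == "__main__":
    # Hypothetical records: one near-full-length E gene, one 400 nt fragment.
    demo = [">full_E", "A" * 1485, ">fragment", "ACGT" * 100]
    print(split_by_length(read_fasta(demo)))
```

To run it on a real file, pass an open file handle to `read_fasta`, for example `split_by_length(read_fasta(open(path)))`, where `path` points at the FASTA file given to the rule as `input.sequences`.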