no transcript id #27

Closed
nick-youngblut opened this issue Apr 4, 2024 · 3 comments

Comments

@nick-youngblut

I'm getting the following error:

#### Aligning reads to genome using minimap2
02:20:30 AM Thu Apr 04 2024 minimap2_align
Warning: error in running command/home/rstudio/miniconda3/bin/paftools.js:1714: Error: No transcript_id
		if (id == null) throw Error("No transcript_id");
                        ^
Error: No transcript_id
    at Error (<anonymous>)
    at paf_gff2bed (/home/rstudio/miniconda3/bin/paftools.js:1714:25)
    at main (/home/rstudio/miniconda3/bin/paftools.js:3695:29)
    at /home/rstudio/miniconda3/bin/paftools.js:3721:1

My relevant code:

config_file = FLAMES::create_config(
  work_dir, type = "sc_3end", do_barcode_demultiplex = FALSE
)

sce = sc_long_pipeline(
    fastq = fastq_input,
    genome_fa = ref_fna_input,
    annotation = ref_annot_input,
    outdir = work_dir,
    config = config_file,
    minimap2_dir = minimap2_path,
    expect_cell_number = 8000
)

For the reference/annotation files, I'm using:

  • GCF_000001405.40_GRCh38.p14_genomic.fna.gz
  • GCF_000001405.40_GRCh38.p14_genomic.gff.gz

The full input parameters:

#### Input parameters:
{
  "pipeline_parameters": {
    "seed": [2022],
    "threads": [1],
    "do_barcode_demultiplex": [false],
    "do_gene_quantification": [true],
    "do_genome_alignment": [true],
    "do_isoform_identification": [true],
    "bambu_isoform_identification": [false],
    "do_read_realignment": [true],
    "do_transcript_quantification": [true]
  },
  "barcode_parameters": {
    "max_bc_editdistance": [2],
    "max_flank_editdistance": [8],
    "pattern": {
      "primer": ["CTACACGACGCTCTTCCGATCT"],
      "BC": ["NNNNNNNNNNNNNNNN"],
      "UMI": ["NNNNNNNNNNNN"],
      "polyT": ["TTTTTTTTT"]
    },
    "TSO_seq": ["CCCATGTACTCTGCGTTGATACCACTGCTT"],
    "TSO_prime": [3],
    "full_length_only": [false]
  },
  "isoform_parameters": {
    "generate_raw_isoform": [false],
    "max_dist": [10],
    "max_ts_dist": [100],
    "max_splice_match_dist": [10],
    "min_fl_exon_len": [40],
    "max_site_per_splice": [3],
    "min_sup_cnt": [5],
    "min_cnt_pct": [0.001],
    "min_sup_pct": [0.2],
    "bambu_trust_reference": [true],
    "strand_specific": [0],
    "remove_incomp_reads": [4],
    "downsample_ratio": [1]
  },
  "alignment_parameters": {
    "use_junctions": [true],
    "no_flank": [false]
  },
  "realign_parameters": {
    "use_annotation": [true]
  },
  "transcript_counting": {
    "min_tr_coverage": [0.4],
    "min_read_coverage": [0.4]
  }
} 

I'm using FLAMES 1.9.2.

Any idea on what is causing the error with paftools?

@ChangqingW
Collaborator

ChangqingW commented Apr 4, 2024

I have run into this a couple of times before. I had to use the latest script from minimap2's repo (https://raw.githubusercontent.com/lh3/minimap2/master/misc/paftools.js) and use GTF instead of GFF.
I am working on including the js script in FLAMES and addressing the minimap2 folder issue you brought up, but for now you might want to either A. run the alignment manually (saving it as align2genome.bam under the output folder will make FLAMES skip alignment) or B. make a folder that contains the (softlinked) minimap2 binary and the latest js script.
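
For option B, a minimal sketch of setting up such a folder from R might look like the following. The helper name `make_minimap2_dir`, its arguments, and the file names inside the folder are illustrative assumptions, not part of FLAMES; only the paftools.js URL comes from the comment above. It assumes a Unix-like system where symlinks are available.

```r
# Hypothetical helper (not part of FLAMES): build a folder holding a symlink
# to an existing minimap2 binary plus a fresh copy of paftools.js.
make_minimap2_dir <- function(
    minimap2_bin,
    target_dir,
    paftools_url = "https://raw.githubusercontent.com/lh3/minimap2/master/misc/paftools.js",
    fetch = utils::download.file) {
  stopifnot(file.exists(minimap2_bin))
  dir.create(target_dir, recursive = TRUE, showWarnings = FALSE)

  # Soft-link the existing binary rather than copying it.
  link <- file.path(target_dir, "minimap2")
  if (!file.exists(link)) {
    file.symlink(normalizePath(minimap2_bin), link)
  }

  # Fetch the latest script and make it executable.
  js <- file.path(target_dir, "paftools.js")
  fetch(paftools_url, js)
  Sys.chmod(js, mode = "0755")

  target_dir
}
```

The returned path could then be passed as `minimap2_dir` to `sc_long_pipeline()`, for example `minimap2_dir = make_minimap2_dir("/path/to/minimap2", file.path(work_dir, "minimap2_bin"))`, where `/path/to/minimap2` is a placeholder for the local binary.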

@nick-youngblut
Author

Thanks @ChangqingW for the info!

> and use GTF instead of GFF

Why was the GTF needed instead of the GFF? Is this info in the docs? Maybe it would be good to include such info in the error message for if (id == null) throw Error("No transcript_id"); (actually catch the error and write out a useful message)?

I should note that although the if (id == null) throw Error("No transcript_id"); message occurs rather early in the pipeline, the pipeline continues to run for quite a while before it actually fails. It would help if the pipeline failed fast.
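
As a stopgap on the user side, one way to get fail-fast behaviour is to escalate the relevant warning to an error. The sketch below assumes the failure reaches the calling R session as a warning whose message contains "error in running command", as the log above suggests; `fail_fast` is a hypothetical wrapper, not a FLAMES function, and it will not help if the warning is handled before it reaches the caller.

```r
# Hypothetical wrapper: turn the "error in running command" warning into an
# error so the run stops at the failing step instead of continuing.
fail_fast <- function(expr, pattern = "error in running command") {
  withCallingHandlers(
    expr,  # evaluated lazily here, so the handler below is already active
    warning = function(w) {
      if (grepl(pattern, conditionMessage(w), fixed = TRUE)) {
        stop("External command failed: ", conditionMessage(w), call. = FALSE)
      }
      # Other warnings fall through and are reported as usual.
    }
  )
}
```

The pipeline call would then be wrapped as `sce = fail_fast(sc_long_pipeline(...))`. A blunter alternative is `options(warn = 2)`, which makes R treat every warning as an error.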

@ChangqingW
Collaborator

> Thanks @ChangqingW for the info!
>
> > and use GTF instead of GFF
>
> Why was the GTF needed instead of the GFF? Is this info in the docs? Maybe it would be good to include such info in the error message for if (id == null) throw Error("No transcript_id"); (actually catch the error and write out a useful message)?
>
> I should note that although the if (id == null) throw Error("No transcript_id"); message occurs rather early in the pipeline, the pipeline continues to run for quite a while before it actually fails. It would help if the pipeline failed fast.

This is a known issue in minimap2: lh3/minimap2#422 (comment)
I have no plans to modify the js scripts from minimap2, but yes, maybe I can catch the error in R and suggest using GTF.
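
A check of this kind could also run before alignment starts. The following is a minimal sketch, not FLAMES code: `check_transcript_ids` is a hypothetical name, and restricting the check to exon and CDS records is an assumption based on the error text rather than on a reading of paftools.js. It loads the whole annotation into memory, which may be slow for a full human annotation.

```r
# Hypothetical pre-flight check: count exon/CDS records whose attribute
# column has no transcript_id, and stop with a hint if any are found.
check_transcript_ids <- function(annotation) {
  con <- gzfile(annotation, open = "rt")  # also reads uncompressed files
  on.exit(close(con))
  lines <- readLines(con)
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]

  # GFF/GTF records have nine tab-separated columns; skip anything else.
  fields <- strsplit(lines, "\t", fixed = TRUE)
  fields <- fields[lengths(fields) >= 9]
  feature <- vapply(fields, function(x) x[3], character(1))
  attrs <- vapply(fields, function(x) x[9], character(1))

  is_target <- feature %in% c("exon", "CDS")
  missing <- is_target & !grepl("transcript_id", attrs, fixed = TRUE)
  if (any(missing)) {
    stop(sprintf(
      "%d exon/CDS record(s) in '%s' have no transcript_id; consider using a GTF annotation instead.",
      sum(missing), annotation
    ), call. = FALSE)
  }
  invisible(TRUE)
}
```

Calling `check_transcript_ids(ref_annot_input)` before `sc_long_pipeline()` would then stop immediately with a readable message instead of surfacing the paftools.js stack trace later.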
