GFFParser error with example data in sqanti_qc #247

Closed
chaesee1 opened this issue Jan 22, 2024 · 2 comments

@chaesee1

Hello,

I got an error when running sqanti3_qc.py with the example data. The error is as follows:

(SQANTI3.env) [SQANTI3-5.2]$ sqanti3_qc.py example/UHR_chr22.gtf example/gencode.v38.basic_chr22.gtf example/GRCh38.p13_chruman.refTSS_v3.1.hg38.bed --polyA_motif_list data/polyA_motifs/mouse_and_human.polyA_motif.txt -o UHR_chr22 -d example/SQANTI3_QC_output2 -fl exUHR_chr22_short_reads.fofn --cpus 4 --report both
Rscript (R) version 4.3.1 (2023-06-16)
WARNING: output directory /ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/example/SQANTI3_QC_output2 already exists. Overwriting!
Write arguments to /ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/example/SQANTI3_QC_output2/UHR_chr22.params.txt...
**** Running SQANTI3...
**** Parsing provided files....
Reading genome fasta /ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/example/GRCh38.p13_chr22.fasta....
Error corrected FASTA /ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/example/SQANTI3_QC_output2/UHR_chr22_corrected.fasta already exists. Using it.
**** Predicting ORF sequences...
ORF file /ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/example/SQANTI3_QC_output2/UHR_chr22_corrected.faa already exists. Using it....
**** Parsing Reference Transcriptome....
/ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/example/SQANTI3_QC_output2/refAnnotation_UHR_chr22.genePred already exists. Using it.
**** Parsing Isoforms....
**** Running STAR for calculating Short-Read Coverage.
START running STAR...
Index identified. Proceeding to mapping.
Mapping for UHR_Rep1_chr22 : in progress...
Mapping for UHR_Rep1_chr22 : done.
/home/denovo/anaconda3/envs/SQANTI3.env/bin/STAR-avx2 --runThreadN 4 --genomeDir /ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/example/SQA/dlstibm/Application/IsoSeq/SQANTI3-5.2/example/UHR_Rep1_chr22.R1.fastq.gz example/UHR_Rep1_chr22.R2.fastq.gz --outFileNamePrefix /ess/dlstibm/AQC_output2/STAR_mapping/UHR_Rep1_chr22 --alignSJoverhangMin 8 --alignSJDBoverhangMin 1 --outFilterType BySJout --outSAMunmapped Within --outFilt 0.04 --outFilterMismatchNmax 999 --alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 1000000 --sjdbScore 1 --genomeLoad NoSharedMemFilesCommand zcat --twopassMode Basic
STAR version: 2.7.11a compiled: 2023-09-15T02:58:53+0000 :/opt/conda/conda-bld/star_1694746407721/work/source
Jan 22 17:15:03 ..... started STAR run
Jan 22 17:15:03 ..... loading genome
Jan 22 17:15:04 ..... started 1st pass mapping
Jan 22 17:15:46 ..... finished 1st pass mapping
Jan 22 17:15:46 ..... inserting junctions into the genome indices
Jan 22 17:15:55 ..... started mapping
Jan 22 17:16:39 ..... finished mapping
Jan 22 17:16:39 ..... started sorting BAM
Jan 22 17:16:40 ..... finished successfully
Mapping for UHR_Rep2_chr22 : in progress...
Mapping for UHR_Rep2_chr22 : done.
/home/denovo/anaconda3/envs/SQANTI3.env/bin/STAR-avx2 --runThreadN 4 --genomeDir /ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/example/SQA/dlstibm/Application/IsoSeq/SQANTI3-5.2/example/UHR_Rep2_chr22.R1.fastq.gz example/UHR_Rep2_chr22.R2.fastq.gz --outFileNamePrefix /ess/dlstibm/AQC_output2/STAR_mapping/UHR_Rep2_chr22 --alignSJoverhangMin 8 --alignSJDBoverhangMin 1 --outFilterType BySJout --outSAMunmapped Within --outFilt 0.04 --outFilterMismatchNmax 999 --alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 1000000 --sjdbScore 1 --genomeLoad NoSharedMemFilesCommand zcat --twopassMode Basic
STAR version: 2.7.11a compiled: 2023-09-15T02:58:53+0000 :/opt/conda/conda-bld/star_1694746407721/work/source
Jan 22 17:16:40 ..... started STAR run
Jan 22 17:16:40 ..... loading genome
Jan 22 17:16:41 ..... started 1st pass mapping
Jan 22 17:17:15 ..... finished 1st pass mapping
Jan 22 17:17:15 ..... inserting junctions into the genome indices
Jan 22 17:17:24 ..... started mapping
Jan 22 17:17:57 ..... finished mapping
Jan 22 17:17:57 ..... started sorting BAM
Jan 22 17:17:57 ..... finished successfully
Input pattern: /ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/example/SQANTI3_QC_output2/STAR_mapping/.
The following files found and to be read as junctions:
/ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/example/SQANTI3_QC_output2/STAR_mapping/UHR_Rep2_chr22SJ.out.tab
/ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/example/SQANTI3_QC_output2/STAR_mapping/UHR_Rep1_chr22SJ.out.tab
6762 junctions read. 2 junctions added to both strands because no strand information from STAR.
Running calculation of TSS ratio
Traceback (most recent call last):
  File "/ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/sqanti3_qc.py", line 2542, in <module>
    main()
  File "/ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/sqanti3_qc.py", line 2525, in main
    run(args)
  File "/ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/sqanti3_qc.py", line 1888, in run
    isoforms_info, ratio_TSS_dict = isoformClassification(args, isoforms_by_chr, refs_1exon_by_chr, refs_exons_by_chr, junctions_by_chr, junctions_by_gene, start_ends_by_gene, genome_dict, indelsJunc, orfDict, corrGTF)
  File "/ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/sqanti3_qc.py", line 1558, in isoformClassification
    inside_bed, outside_bed = get_TSS_bed(corrGTF, chr_order)
  File "/ess/dlstibm/Application/IsoSeq/SQANTI3-5.2/utilities/short_reads.py", line 122, in get_TSS_bed
    for rec in BCBio_GFF.parse(in_handle, limit_info=limit_info, target_lines=1):
  File "/home/denovo/anaconda3/envs/SQANTI3.env/lib/python3.10/site-packages/BCBio/GFF/GFFParser.py", line 793, in parse
    for rec in parser.parse_in_parts(gff_files, base_dict, limit_info,
  File "/home/denovo/anaconda3/envs/SQANTI3.env/lib/python3.10/site-packages/BCBio/GFF/GFFParser.py", line 337, in parse_in_parts
    cur_dict = self._results_to_features(cur_dict, results)
  File "/home/denovo/anaconda3/envs/SQANTI3.env/lib/python3.10/site-packages/BCBio/GFF/GFFParser.py", line 376, in _results_to_features
    base = self._add_parent_child_features(base, results.get('parent', []),
  File "/home/denovo/anaconda3/envs/SQANTI3.env/lib/python3.10/site-packages/BCBio/GFF/GFFParser.py", line 448, in _add_parent_child_features
    child_feature = self._get_feature(child_dict)
  File "/home/denovo/anaconda3/envs/SQANTI3.env/lib/python3.10/site-packages/BCBio/GFF/GFFParser.py", line 591, in _get_feature
    new_feature = SeqFeature.SeqFeature(location, feature_dict['type'],
TypeError: SeqFeature.__init__() got an unexpected keyword argument 'strand'

Thanks,

- chaehwa.
@chaesee1 chaesee1 changed the title GFFParser error in sqanti_qc GFFParser error with Test data in sqanti_qc Jan 23, 2024
@chaesee1 chaesee1 changed the title GFFParser error with Test data in sqanti_qc GFFParser error with example data in sqanti_qc Jan 23, 2024

xiuru commented Jan 26, 2024

I encountered the same issue. Did you manage to resolve it?

@SamGallaher

This appears to be a compatibility issue between the GFFParser.py file in BCBio and Biopython (see here).
They suggest installing biopython<=1.81 as a workaround.
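
To check whether a given environment is affected before rerunning, something like the following may help. It is only a rough sketch: the helper names (parse_version, exceeds_workaround_pin, accepts_keyword) are made up for illustration and are not part of SQANTI3, BCBio, or Biopython. It compares the installed Biopython version against the 1.81 pin mentioned above and inspects whether SeqFeature still accepts the strand keyword that the traceback complains about. The pip command in the message assumes pip is the installer in use; a conda environment may need the conda equivalent.

```python
import inspect
import re
from importlib import metadata


def parse_version(text):
    """Turn a version string such as '1.83' into a tuple of ints, e.g. (1, 83)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def exceeds_workaround_pin(version_text, pin=(1, 81)):
    """True when major.minor is newer than the suggested biopython<=1.81 pin."""
    return parse_version(version_text)[:2] > pin


def accepts_keyword(func, name):
    """True when `func` declares a parameter called `name` or takes **kwargs."""
    params = inspect.signature(func).parameters
    takes_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    return name in params or takes_kwargs


if __name__ == "__main__":
    try:
        installed = metadata.version("biopython")
    except metadata.PackageNotFoundError:
        print("biopython is not installed in this environment")
    else:
        print("biopython", installed)
        if exceeds_workaround_pin(installed):
            # Assumption: pip manages this environment.
            print('Newer than 1.81; workaround: pip install "biopython<=1.81"')
        from Bio.SeqFeature import SeqFeature

        print(
            "SeqFeature accepts strand=:",
            accepts_keyword(SeqFeature.__init__, "strand"),
        )
```

If the last line reports False, the GFFParser.py call shown in the traceback would be expected to fail in that environment.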
