This page contains brief information and files from the consortium analysis.
We calculated sequencing statistics with marginStats on alignments of native RNA and cDNA reads to gencode.v27.transcripts.fa, generated with minimap2 (-ax map-ont mode). We created a summary of the unique genes and isoforms detected in the native RNA sequence data from alignments to the GENCODE v27 reference sequence set. Additionally, we calculated 5-mers in the sequence data relative to FLAIR high-confidence reference isoforms.
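As an illustration of the 5-mer comparison mentioned above, the following is a minimal sketch and not the consortium's actual script. It assumes that "relative to" means the ratio of a 5-mer's frequency in the reads to its frequency in the reference isoforms; the function names `count_kmers` and `kmer_ratio` are hypothetical.

```python
from collections import Counter


def count_kmers(sequences, k=5):
    """Count every overlapping k-mer across an iterable of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows that contain ambiguous bases such as N.
            if set(kmer) <= set("ACGTU"):
                counts[kmer] += 1
    return counts


def kmer_ratio(read_counts, ref_counts):
    """Read frequency divided by reference frequency, per reference k-mer.

    Assumption: this ratio is one plausible reading of "relative to";
    the source does not define the exact statistic.
    """
    read_total = sum(read_counts.values())
    ref_total = sum(ref_counts.values())
    if read_total == 0 or ref_total == 0:
        return {}
    return {
        kmer: (read_counts[kmer] / read_total) / (n / ref_total)
        for kmer, n in ref_counts.items()
    }
```

In practice the two inputs would be the sequences of the aligned reads and of the FLAIR isoforms, for example as parsed from FASTA or FASTQ files.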
We used GENCODE v24 to align the pass native RNA and pass cDNA reads.
Isoforms defined with FLAIR v1.1 are provided in PSL format. From the native RNA data, we generated two sets of isoforms: set A and set B. Set A (71,899 isoforms) contains all isoforms from the default FLAIR output with a minimum of 5 supporting reads, as described in the Online Methods. Set B (50,039 isoforms) is a more stringent set of native RNA isoforms than set A. To generate set B, we removed set A isoforms that are subsets of longer set A isoforms and retained only the longest isoform for each unique splice junction chain. We also provide a set of FLAIR isoforms defined from the cDNA data (99,574 isoforms).
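The set B filtering described above can be sketched roughly as follows. This is an illustrative approximation, not the code used to produce set B. It assumes that "subset" means an isoform's splice junction chain is a contiguous run within a longer chain on the same chromosome and strand, and that single-exon isoforms (no junctions) are left untouched. The dictionary keys and function names are hypothetical; `junction_chain` derives junctions from the standard PSL `blockSizes` and `tStarts` fields.

```python
def junction_chain(block_sizes, t_starts):
    """Return splice junctions as (intron start, intron end) pairs
    computed from PSL blockSizes and tStarts."""
    return tuple(
        (t_starts[i] + block_sizes[i], t_starts[i + 1])
        for i in range(len(t_starts) - 1)
    )


def is_contiguous_subchain(short, long):
    """True if `short` is a non-empty, strictly shorter contiguous run of `long`."""
    n = len(short)
    return 0 < n < len(long) and any(
        long[i:i + n] == short for i in range(len(long) - n + 1)
    )


def filter_to_stringent_set(isoforms):
    """isoforms: dicts with keys name, chrom, strand, start, end, chain."""
    # Step 1: keep only the longest isoform for each unique junction chain.
    longest = {}
    for iso in isoforms:
        key = (iso["chrom"], iso["strand"], iso["chain"])
        if not iso["chain"]:
            # Assumption: single-exon isoforms are never collapsed together.
            key += (iso["name"],)
        current = longest.get(key)
        if current is None or iso["end"] - iso["start"] > current["end"] - current["start"]:
            longest[key] = iso
    kept = list(longest.values())
    # Step 2: drop isoforms whose chain is contained in a longer kept chain.
    return [
        iso for iso in kept
        if not any(
            other["chrom"] == iso["chrom"]
            and other["strand"] == iso["strand"]
            and is_contiguous_subchain(iso["chain"], other["chain"])
            for other in kept
        )
    ]
```

The pairwise comparison in step 2 is quadratic in the number of isoforms; grouping by chromosome and strand first would be a natural optimization for sets of this size.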
- poly-A all
- poly-A primary
- A reproducible pipeline for generating all poly(A) calls from the fast5 data associated with the Oxford Nanopore RNA standards is available at this repository.
To create the index files needed to run nanopolish eventalign, and then to parse the alignments by k-mer position and associate ionic current information with each position, we used the following commands:
nanopolish index -d /path/to/raw_fast5s/ -s sequencing_summary.txt reads.fastq
nanopolish eventalign --scale-events -n -t 8 --reads reads.fastq --bam reads.bam --genome GRCh38.fasta
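Once the eventalign output has been saved to a file, the ionic current values can be grouped by k-mer position. The following is a minimal sketch of that step, not the consortium's parsing code. It assumes the tab-separated output carries a header row with the columns `contig`, `position`, `reference_kmer` and `event_level_mean`, as in nanopolish's documented eventalign format; the function names and the file name `eventalign.tsv` are hypothetical.

```python
import csv
from collections import defaultdict
from statistics import mean


def current_by_kmer_position(lines):
    """Group event_level_mean values by (contig, position, reference_kmer).

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    levels = defaultdict(list)
    for row in reader:
        key = (row["contig"], int(row["position"]), row["reference_kmer"])
        levels[key].append(float(row["event_level_mean"]))
    return levels


def summarize(levels):
    """Return (event count, mean current) for each k-mer position."""
    return {key: (len(values), mean(values)) for key, values in levels.items()}


# Example usage, assuming the eventalign output was redirected to a file:
# with open("eventalign.tsv", newline="") as handle:
#     summary = summarize(current_by_kmer_position(handle))
```

Grouping on the reference position rather than on the k-mer alone keeps events from different sites with the same sequence separate, which matters when comparing native and IVT signal at a specific candidate modification site.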
Here is the zip file containing nanopolish indexes for the native RNA and IVT RNA data. These were used for ionic current-level analyses of m6A and inosine modifications. Once downloaded, edit the paths appropriately for your usage.