Skip to content

Pipeline to process 16S (V6) sequence data coming from Ion-Torrent sequencer

Notifications You must be signed in to change notification settings

QTB-HHU/16SV6-Sequence-Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 

Repository files navigation

16SV6-Sequence-Analysis

Pipeline to process 16S (V6) sequence data coming from Ion-Torrent sequencer Welcome to the 16SV6-Sequence-Analysis wiki!

The following steps describe the software and commands that were used in order to analyze the bacterial 16S V6 region obtain from a non-axenic culture of Phaeodactylum tricornutum.

Corresponding article: Dynamics of the bacterial community associated with Phaeodactylum tricornutum cultures Fiona Wanjiku Moejes, Ovidiu Popa, Antonella Succurro, Julie Maguireand Oliver Ebenhöh

Input:

Ion-Torrent FASTQ files containing raw reads without barcodes and adapter sequence. Primer are still attached and must be removed. Fasta file containing the primer pattern 'mapping' file containing sample information, needed for data evaluation.

Pipeline is adapted and modified from http://www.brmicrobiome.org/#!16s-profiling-ion-torrent-new/cpdg Data analysis for 16S microbial profiling from different benchtop sequencing platforms. J Microbiol Methods. 2014. doi: 10.1016/j.mimet.2014.08.018.

Software:

fastq-mcf Version: 1.04.807 (ref 1)

QIIMEfastaFormatter.pl (George Watts, University of Arizona 2013)

usearch v8.0.1517 (32Bit – opensource) (ref 2a,b)

Qiime v. 1.9.0 (ref 3)

Processing Steps:

Primer trimming fastq-mcf -0 -f -l 20 -o $outfile primer.fa $input

Length trimming (based on the distribution of V6 from VAMPS (ref 4)) fastq-mcf -0 -l 52 -L 61 -o $outfile /dev/null $input

Quality filtering -USING USEARCH v8- usearch -fastq_filter $PWD/reads2.fastq -fastq_maxee 0.5 -fastqout reads.fa

Dereplication -USING USEARCH v8- usearch -derep_fulllength $PWD/reads.fa -output derep.fa -sizeout

Abundance sort and discard singletons -USING USEARCH v8- usearch -sortbysize $PWD/derep.fa -output sorted.fa -minsize 2

OTU clustering using UPARSE method -USING USEARCH v8- usearch -cluster_otus $PWD/sorted.fa -otus otus1.fa

Chimera filtering using reference rdp.gold.fa database -USING USEARCH v8- usearch -uchime_ref $PWD/otus1.fa -db $PWD/rdp_gold.fa -strand plus -nonchimeras otus.fa

Map reads back to OTU database -USEARCH v8 script- usearch -usearch_global $PWD/reads.fa -db $PWD/otus.fa -strand plus -id 0.97 -uc map.uc

Assign taxonomy to OTUS using uclust method on QIIME (use the file “otus.fa” from UPARSE as input file) to speed up use parallel_assign_taxonomy_uclust.py from QIIME

assign_taxonomy.py -i $PWD/otus.fa -o output -r $PWD/rep_set/97_otus.fasta -t $PWD/taxonomy/97_otu_taxonomy.txt

Align sequences on QIIME, using SILVA reference sequences (use the file “otus.fa” from UPARSE as input file) align_seqs.py -i $PWD/otus.fa -o rep_set_align -t $PWD/rep_set_aligned/97_otus.fasta

Filter alignments on QIIME filter_alignment.py -i $PWD/otus_aligned.fasta -o filtered_alignment

Make the reference tree on QIIME make_phylogeny.py -i $PWD/otus_aligned_pfiltered.fasta -o rep_set.tre

Convert UC to otu-table.txt -UPARSE PYTHON SCRIPT- python $PWD/uc2otutab.py $PWD/map.uc > otu_table.txt

Convert otu_table.txt to otu-table.biom, used by QIIME -BIOM SCRIPT- biom convert -i $PWD/otu_table.txt -o otu_table.biom --table-type="otu table"

Add metadata (taxonomy) to OTU table biom add-metadata -i $PWD/otu_table.biom -o otu_table_tax.biom --observation-metadata-fp $PWD/otus_tax_assignments.txt --observation-header OTUID,taxonomy,confidence --sc-separated taxonomy --float-fields confidence

Check OTU Table on QIIME biom summarize-table -i $PWD/otu_table_tax.biom -o results_biom_table

Run diversity analyses on QIIME (or any other analysis of your choice). The parameter “-e” is the sequencing depth to use for even sub-sampling and maximum rarefaction depth. You should review the output of the ‘biom summarize-table’ (step 15) command to decide on this value. core_diversity_analyses.py -i $PWD/otu_table_tax.biom -m $PWD/mapping_file.txt -t $PWD/rep_set.tre -e xxxx -o $PWD/core_output

References

Erik Aronesty (2011). ea-utils : "Command-line tools for processing biological sequencing data"; http://code.google.com/p/ea-utils 2a) Edgar,RC (2010) Search and clustering orders of magnitude faster than BLAST, Bioinformatics 26(19), 2460-2461. doi: 10.1093/bioinformatics/btq461 http://drive5.com/usearch 2b) Edgar, R.C. (2013) UPARSE: Highly accurate OTU sequences from microbial amplicon reads, Nature Methods [Pubmed:23955772, dx.doi.org/10.1038/nmeth.2604] http://drive5.com/usearch J Gregory Caporaso, Justin Kuczynski, Jesse Stombaugh, Kyle Bittinger, Frederic D Bushman, Elizabeth K Costello, Noah Fierer, Antonio Gonzalez Pena, Julia K Goodrich, Jeffrey I Gordon, Gavin A Huttley, Scott T Kelley, Dan Knights, Jeremy E Koenig, Ruth E Ley, Catherine A Lozupone, Daniel McDonald, Brian D Muegge, Meg Pirrung, Jens Reeder, Joel R Sevinsky, Peter J Turnbaugh, William A Walters, Jeremy Widmann, Tanya Yatsunenko, Jesse Zaneveld and Rob Knight; Nature Methods, 2010; doi:10.1038/nmeth.f.303 http://qiime.org/index.html Huse SM, Mark Welch DB, Voorhis A, Shipunova A, Morrison HG, et al. (2014) VAMPS: a website for visualization and analysis of microbial population structures. BMC Bioinformatics 15: 41. https://vamps.mbl.edu/resources/databases.php

About

Pipeline to process 16S (V6) sequence data coming from Ion-Torrent sequencer

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published