xyz1396/Meta-proteomics-analysis-pipeline-based-on-Proteome-Discovery-output

A Tale of Two Databases 📚

Yi Xiong 2020/11/25

  • The Meta DB and the Public DB can both perform well in soil proteomics 😊

  • Metaproteomics analysis pipeline based on Proteome Discoverer output 💡

  • Xiong Yi, Zheng Lu, Meng Xiangxiang, et al. Protein sequence databases generated from metagenomics and public database produced similar soil metaproteomic results of microbial taxonomic and functional changes. Pedosphere, 2021 ✌️

    • Reproduce this study
mkdir Proteomics
cd Proteomics
git clone https://github.com/xyz1396/Meta-proteomics-analysis-pipeline-based-on-Proteome-Discovery-output
# Run the Rmd files in the script folder in RStudio
  • data:

    • input datasets used in this study (large files such as the FASTA databases and BLAST XML files are ignored)
  • figure:

    • output pictures used in the paper
  • script:

    • rmd files storing the code to analyze the data
  • table:

    • output tables used in the paper
  • temp:

    • temporary files generated in this study
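After cloning, it can help to confirm that the folders listed above are present before opening the Rmd files. The following is only a sketch: `check_layout` is a hypothetical helper that is not part of this repository, and the folder names are taken from the list above. Folders that hold only ignored or generated files may be absent from a fresh clone, since git does not track empty directories.

```shell
# Hypothetical helper (not part of the repository): report which of the
# folders named in the README exist under a checkout directory.
check_layout() {
  root=${1:-.}
  missing=0
  for d in data figure script table temp; do
    if [ -d "$root/$d" ]; then
      echo "ok       $d"
    else
      echo "missing  $d"
      missing=$((missing + 1))
    fi
  done
  # Exit status is the number of missing folders (0 means all present)
  return "$missing"
}

# Demonstration on a throwaway directory with only two of the folders
demo=$(mktemp -d)
mkdir "$demo/data" "$demo/script"
check_layout "$demo" || echo "some folders are missing"
rm -rf "$demo"
```

To check a real checkout, pass the path of the cloned repository as the first argument.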
  1. Download mass spectral data and build databases

  2. Calculate the protein sequence length of the two databases in detail

  3. Get identified proteins, coverage, length, UpSetR, Venn

     [Figure: Comparison of the lengths of identified proteins]

  4. Volcano plot

     [Figure: Meta DB volcano plot]

  5. Statistical comparison of microbial species identified by the two databases

     [Figure: Correlation of abundance identified by the two databases in LP, with histogram]

  6. KEGG level 3 annotation statistics of the two databases

     [Figure: Correlation of KEGG level 3 identified by the two databases in HP, with histogram]

  7. Summary statistics of annotations

     [Figure: Summary statistics of protein annotations]

  8. Enrichment analysis

     [Figure: Enrichment analysis results]

  9. Draw phosphatase and phosphatase evolutionary tree and add species annotation

  10. Heatmap of all identified proteins

     [Figure: Heatmap of the abundance of proteins identified by the Meta DB]

  11. Blast the protein sequences identified by the two databases against each other

     [Figure: Percentage of identical matches]

  12. Statistical comparison of PSMs of microbial species identified by the two databases

     [Figure: PSMs of genera identified only by the Meta DB and by both DBs]

  13. GO annotations of proteins with significantly differential abundance identified by the two DBs (for Fig. S7)

     [Figure: The number of proteins with significantly differential abundance identified by the Public DB]
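
The sequence-length step in the list above is done in the Rmd scripts of this repository. As a rough illustration of the same idea outside R, the sketch below prints the length of every record in a FASTA file with `awk`. The function name `fasta_lengths` and the two demo records are made up for this example and are not taken from the study's databases.

```shell
# Hypothetical helper (not part of the repository): print "<id><TAB><length>"
# for every record in a FASTA file given as an argument or on stdin.
fasta_lengths() {
  awk '
    /^>/ {
      if (id != "") print id "\t" len   # flush the previous record
      id = substr($1, 2)                # header up to the first space, without ">"
      len = 0
      next
    }
    {
      gsub(/[ \t\r]/, "")               # drop stray whitespace and CR characters
      len += length($0)                 # a sequence may wrap over several lines
    }
    END { if (id != "") print id "\t" len }
  ' "$@"
}

# Made-up two-record example; sequences may span multiple lines
printf '>protA demo\nMKT\nAY\n>protB\nMSSG\n' | fasta_lengths
```

For a real database, pass the FASTA path as the argument. The resulting two-column table can then be summarised or plotted in R to compare the length distributions of the Meta DB and the Public DB.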

About

The Impact of Protein Sequence Databases on Soil Metaproteomic Results
