This repository contains materials and code for the manuscript:
Genome-microbiome interplay provides insight into the determinants of the human blood metabolome
Christian Diener#, Chengzhen L. Dai#, Tomasz Wilmanski, Priyanka Baloni, Brett Smith, Noa Rappaport, Leroy Hood, Andrew T. Magis and Sean M. Gibbons
https://doi.org/10.1101/2022.02.04.479172
root
> notebook.ipynb # Analysis steps in Jupyter notebooks
> workflow.nf # Nextflow workflow for larger analyses
> figures
+ fig1.png # Figure 1 of the manuscript
+ fig1.svg # SVG source file for multi-panel figures
[...]
> data
+ data.csv # Intermediate data files or results tables
[...]
[...]
All dependencies can be installed with the provided conda environment.
conda env create -f conda.yml
This will create an environment named gxe2021 that can be used to run all analyses.
conda activate gxe2021
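After activating the environment, it can help to confirm that the command-line tools used by the later steps are actually on the PATH. The following is a minimal sketch, not part of the repository: the helper name check_tools is hypothetical, and it assumes (without confirmation from conda.yml) that the environment provides the jupyter and nextflow executables.

```shell
# Sketch: report which of the given tools are available on PATH.
# ASSUMPTION: check_tools is a hypothetical helper, not shipped with this repository.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Return 0 only if every tool was found.
    return "$missing"
}

# ASSUMPTION: the gxe2021 environment is expected to provide these two tools.
check_tools jupyter nextflow || echo "Some tools are missing; is the gxe2021 environment active?"
```

If a tool is reported as missing, re-running conda activate gxe2021 in the current shell is the first thing to check.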
The analyses from the manuscript can be reproduced with the following steps. This requires obtaining the data from the Arivale cohort first.
Qualified researchers can access the full Arivale deidentified dataset supporting the findings in this study for research purposes through signing a Data Use Agreement (DUA). Inquiries to access the data can be made at data-access{at}isbscience.org and will be responded to within 7 business days.
Raw sequencing data for the 16S amplicon sequencing have been deposited in the SRA. A full list of run accessions can be found in the data directory.
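One possible way to retrieve the raw reads from a list of run accessions is sketched below using sra-tools. This is an illustration under stated assumptions, not the authors' procedure: the helper name fetch_runs, the accession file path, and the output directory are hypothetical, the accession list is assumed to contain one run accession per line, and sra-tools (prefetch and fasterq-dump) is assumed to be installed separately.

```shell
# Sketch: download every run listed in an accession file with sra-tools.
# ASSUMPTIONS: one accession per line; prefetch and fasterq-dump are installed;
# fetch_runs and both paths are hypothetical names used for illustration.
# Set DRY_RUN=1 to print the commands instead of running them.
fetch_runs() {
    accessions="$1"   # hypothetical path to the accession list
    outdir="$2"       # hypothetical output directory for FASTQ files
    while IFS= read -r acc || [ -n "$acc" ]; do
        # Skip blank lines and comment lines.
        case "$acc" in
            ""|\#*) continue ;;
        esac
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "prefetch $acc"
            echo "fasterq-dump $acc --outdir $outdir"
        else
            # Keep the tools from consuming the accession list on stdin.
            prefetch "$acc" </dev/null &&
                fasterq-dump "$acc" --outdir "$outdir" </dev/null
        fi
    done < "$accessions"
}
```

A dry run such as DRY_RUN=1 followed by fetch_runs accessions.txt fastq (both names hypothetical) prints the commands so they can be reviewed before any data is downloaded. If the accession list is a table rather than a plain list, the accession column would need to be extracted first.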
- Assembly of the cohort | notebook
- Confounder adjustment | notebook
- Microbiome-metabolite associations for the training cohort | notebook
- mGWAS for the training cohort | workflow
  Run this with: nextflow run -resume gwas.nf
- Inspect results from the mGWAS | notebook
- Fit models and obtain R2 for training cohort | notebook
- Obtain out-of-sample R2 for validation cohort | notebook
- Detailed analysis of bile acid and sphingolipid R2 | notebook
- Fit genome-microbiome-metabolome interactions | notebook
- Analyze the obtained interactions | notebook