generated from nextstrain/pathogen-repo-guide
Merge pull request #3 from nextstrain/make-divergence-phylogeny
Make divergence phylogeny
Showing 13 changed files with 724 additions and 90 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,4 @@ | ||
# CHANGELOG | ||
|
||
We use this CHANGELOG to document breaking changes, new features, bug fixes, | ||
and config value changes that may affect both the usage of the workflows and | ||
the outputs of the workflows. See the [changelog for the ncov | ||
repository](https://github.com/nextstrain/ncov/blob/HEAD/docs/src/reference/change_log.md) | ||
for an example of formatting. | ||
* 12 August 2024: Create a full genome phylogeny for rabies [PR#3](https://github.com/nextstrain/rabies/pull/3) | ||
* 25 July 2024: Add CI GH Action workflow to test the ingest workflow [PR#6](https://github.com/nextstrain/rabies/pull/6) | ||
* 15 July 2024: Make rabies-specific modifications to the ingest directory (which originated from the pathogen-repo-guide) [PR#2](https://github.com/nextstrain/rabies/pull/2) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,25 @@ | ||
# Nextstrain repository for rabies virus | ||
This repository contains two workflows for the analysis of rabies virus data: | ||
|
||
This repo is under development. | ||
- [`ingest/`](./ingest) - Download data from GenBank, clean and curate it, and upload it to S3 | ||
- [`phylogenetic/`](./phylogenetic) - Filter sequences, align, construct phylogeny and export for visualization | ||
|
||
Each folder contains a README.md with more information. The results of running both workflows are publicly visible at [nextstrain.org/rabies](https://nextstrain.org/rabies). | ||
|
||
## Installation | ||
|
||
Follow the [standard installation instructions](https://docs.nextstrain.org/en/latest/install.html) for Nextstrain's suite of software tools. | ||
|
||
## Quickstart | ||
|
||
Run the default phylogenetic workflow via: | ||
``` | ||
cd phylogenetic/ | ||
nextstrain build . | ||
nextstrain view . | ||
``` | ||
|
||
## Documentation | ||
|
||
- [Running a pathogen workflow](https://docs.nextstrain.org/en/latest/tutorials/running-a-workflow.html) | ||
- [Contributor documentation](./CONTRIBUTING.md) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
{ | ||
"title": "Real-time tracking of rabies full genome virus evolution", | ||
"maintainers": [ | ||
{"name": "Kim Andrews", "url": "https://bedford.io/team/kim-andrews/"}, | ||
{"name": "the Nextstrain team", "url": "https://nextstrain.org/team"} | ||
], | ||
"data_provenance": [ | ||
{ | ||
"name": "GenBank", | ||
"url": "https://www.ncbi.nlm.nih.gov/genbank/" | ||
} | ||
], | ||
"build_url": "https://github.com/nextstrain/rabies", | ||
"colorings": [ | ||
{ | ||
"key": "gt", | ||
"title": "Genotype", | ||
"type": "categorical" | ||
}, | ||
{ | ||
"key": "region", | ||
"title": "Region", | ||
"type": "categorical" | ||
}, | ||
{ | ||
"key": "country", | ||
"title": "Country", | ||
"type": "categorical" | ||
}, | ||
{ | ||
"key": "host", | ||
"title": "Host", | ||
"type": "categorical" | ||
} | ||
], | ||
"geo_resolutions": [ | ||
"country", | ||
"region" | ||
], | ||
"display_defaults": { | ||
"map_triplicate": true, | ||
"color_by": "region" | ||
}, | ||
"filters": [ | ||
"region", | ||
"country", | ||
"author" | ||
], | ||
"metadata_columns": [ | ||
"author", | ||
"strain", | ||
"division", | ||
"location" | ||
] | ||
} |
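
The Auspice config above sets `display_defaults.color_by` to `region`, which only makes sense if a coloring with that key is defined. A minimal Python sketch of that consistency check follows. The helper `check_default_coloring` is hypothetical and is not part of the workflow or of Augur; the JSON is a trimmed copy of the file shown above.

```python
import json

# A trimmed copy of the config shown above.
config_text = """
{
  "colorings": [
    {"key": "gt", "title": "Genotype", "type": "categorical"},
    {"key": "region", "title": "Region", "type": "categorical"},
    {"key": "country", "title": "Country", "type": "categorical"},
    {"key": "host", "title": "Host", "type": "categorical"}
  ],
  "geo_resolutions": ["country", "region"],
  "display_defaults": {"map_triplicate": true, "color_by": "region"}
}
"""


def check_default_coloring(config):
    """Return True if display_defaults.color_by names a defined coloring."""
    coloring_keys = {c["key"] for c in config.get("colorings", [])}
    default = config.get("display_defaults", {}).get("color_by")
    # A missing default is acceptable; a default must match a coloring key.
    return default is None or default in coloring_keys


auspice_config = json.loads(config_text)
print(check_default_coloring(auspice_config))
```

A check like this could run before export to catch a default coloring that was renamed or removed.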
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
# Regions | ||
region Asia #447CCD | ||
region Oceania #5EA9A1 | ||
region Africa #8ABB6A | ||
region Europe #BEBB48 | ||
region South America #E29E39 | ||
region North America #E2562B |
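
The color file above appears to be a three-column table of field, value, and hex color. The separators are rendered as spaces in this view, so the sketch below assumes the file is tab-delimited and that lines starting with `#` are comments. `parse_colors` is a hypothetical helper, not part of the workflow.

```python
def parse_colors(text):
    """Parse a colors.tsv-style table into {field: {value: hex_color}}."""
    colors = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines such as "# Regions".
        if not line or line.startswith("#"):
            continue
        field, value, hex_color = line.split("\t")
        colors.setdefault(field, {})[value] = hex_color
    return colors


# Two rows taken from the file above, with assumed tab separators.
example = "# Regions\nregion\tAsia\t#447CCD\nregion\tSouth America\t#E29E39\n"
region_colors = parse_colors(example)["region"]
print(region_colors["South America"])
```

Splitting on tabs rather than whitespace keeps multi-word values such as `South America` intact.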
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,14 @@ | ||
# This configuration file should contain all required configuration parameters | ||
# for the phylogenetic workflow to run to completion. | ||
# | ||
# Define optional config parameters with their default values here so that users | ||
# do not have to dig through the workflows to figure out the default values | ||
strain_id_field: "accession" | ||
files: | ||
exclude: "defaults/dropped_strains.txt" | ||
reference: "defaults/rabies_reference.gb" | ||
colors: "defaults/colors.tsv" | ||
auspice_config: "defaults/auspice_config.json" | ||
description: "defaults/description.md" | ||
filter: | ||
group_by: "country year" | ||
sequences_per_group: 20 | ||
min_date: 1950 | ||
min_length: 5000 | ||
ancestral: | ||
inference: "joint" |
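
The `filter` block above most plausibly feeds `augur filter`. The Snakemake rule that consumes it is not shown in this view, so the following Python sketch is an assumption about how the values could map onto command-line arguments, not the workflow's actual rule. The input and output paths are hypothetical.

```python
import shlex

# Values copied from the config shown above.
config = {
    "strain_id_field": "accession",
    "files": {"exclude": "defaults/dropped_strains.txt"},
    "filter": {
        "group_by": "country year",
        "sequences_per_group": 20,
        "min_date": 1950,
        "min_length": 5000,
    },
}


def build_filter_command(config, sequences, metadata, output):
    """Assemble an `augur filter` argument list from the config values."""
    f = config["filter"]
    return [
        "augur", "filter",
        "--sequences", sequences,
        "--metadata", metadata,
        "--metadata-id-columns", config["strain_id_field"],
        "--exclude", config["files"]["exclude"],
        # "country year" becomes two separate --group-by values.
        "--group-by", *f["group_by"].split(),
        "--sequences-per-group", str(f["sequences_per_group"]),
        "--min-date", str(f["min_date"]),
        "--min-length", str(f["min_length"]),
        "--output-sequences", output,
    ]


# Hypothetical input and output paths, for illustration only.
cmd = build_filter_command(
    config, "data/sequences.fasta", "data/metadata.tsv", "results/filtered.fasta"
)
print(shlex.join(cmd))
```

With these settings, subsampling would keep at most 20 sequences per country and year, restricted to sequences of at least 5000 bases collected from 1950 onward.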
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
We gratefully acknowledge the authors, originating and submitting laboratories of the genetic sequences and metadata for sharing their work. Please note that although data generators have generously shared data in an open fashion, that does not mean there should be free license to publish on this data. Data generators should be cited where possible and collaborations should be sought in some circumstances. Please try to avoid scooping someone else's work. Reach out if uncertain. | ||
|
||
|
||
#### Analysis | ||
Our bioinformatic processing workflow can be found at [github.com/nextstrain/rabies](https://github.com/nextstrain/rabies) and includes: | ||
- sequence alignment by [augur align](https://docs.nextstrain.org/projects/augur/en/stable/usage/cli/align.html) | ||
- phylogenetic reconstruction using [IQTREE-2](http://www.iqtree.org/) | ||
- ancestral state reconstruction and temporal inference using [TreeTime](https://github.com/neherlab/treetime) | ||
|
||
#### Underlying data | ||
We curate sequence data and metadata from NCBI as a starting point for our analyses. Curated sequences and metadata are available as flat files at: | ||
- [data.nextstrain.org/files/workflows/rabies/sequences.fasta.zst](https://data.nextstrain.org/files/workflows/rabies/sequences.fasta.zst) | ||
- [data.nextstrain.org/files/workflows/rabies/metadata.tsv.zst](https://data.nextstrain.org/files/workflows/rabies/metadata.tsv.zst) | ||
|
||
--- | ||
|
||
Screenshots may be used under a [CC-BY-4.0 license](https://creativecommons.org/licenses/by/4.0/) and attribution to nextstrain.org must be provided. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# Incorrect virus species assignment: | ||
MF197744 # Lyssavirus bokeloh | ||
MF197745 # Lyssavirus bokeloh | ||
# | ||
# Large number of mutations: | ||
MK920923 # Sample from kinkajou host, Rocha et al. 2020: https://www.tandfonline.com/doi/full/10.1080/22221751.2020.1759380 |
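
The exclusion list above uses one accession per line, with `#` starting a comment that explains why the sequence is dropped. A minimal Python sketch of reading such a file, assuming that convention; `parse_exclude_list` is a hypothetical helper, and the kinkajou comment is shortened here.

```python
def parse_exclude_list(text):
    """Return accessions from an exclude file, ignoring comments and blanks."""
    accessions = []
    for line in text.splitlines():
        # Everything after the first "#" is treated as a comment.
        entry = line.split("#", 1)[0].strip()
        if entry:
            accessions.append(entry)
    return accessions


example = """\
# Incorrect virus species assignment:
MF197744 # Lyssavirus bokeloh
MF197745 # Lyssavirus bokeloh
#
# Large number of mutations:
MK920923 # Sample from kinkajou host
"""
dropped = parse_exclude_list(example)
print(dropped)
```

Keeping the reason next to each accession makes it easier to revisit an exclusion later.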