Harmonize ingest with pathogen repo guide #35
Showing 13 changed files with 145 additions and 196 deletions.
@@ -0,0 +1,29 @@
# This configuration file should contain all required configuration parameters
# for the ingest workflow to run with additional Nextstrain automation rules.

# Custom rules to run as part of the Nextstrain automated workflow
# The paths should be relative to the ingest directory.
custom_rules:
  - build-configs/nextstrain-automation/upload.smk

# Nextstrain CloudFront domain to ensure that we invalidate CloudFront after the S3 uploads
# This is required as long as we are using the AWS CLI for uploads
cloudfront_domain: "data.nextstrain.org"

# Nextstrain AWS S3 Bucket with pathogen prefix
s3_dst: "s3://nextstrain-data/files/workflows/dengue"

# Mapping of files to upload
files_to_upload:
  genbank.ndjson.xz: data/genbank.ndjson
  all_sequences.ndjson.xz: data/sequences.ndjson
  metadata_all.tsv.zst: results/metadata_all.tsv
  sequences_all.fasta.zst: results/sequences_all.fasta
  metadata_denv1.tsv.zst: results/metadata_denv1.tsv
  sequences_denv1.fasta.zst: results/sequences_denv1.fasta
  metadata_denv2.tsv.zst: results/metadata_denv2.tsv
  sequences_denv2.fasta.zst: results/sequences_denv2.fasta
  metadata_denv3.tsv.zst: results/metadata_denv3.tsv
  sequences_denv3.fasta.zst: results/sequences_denv3.fasta
  metadata_denv4.tsv.zst: results/metadata_denv4.tsv
  sequences_denv4.fasta.zst: results/sequences_denv4.fasta
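To make the `files_to_upload` mapping concrete, here is a minimal Python sketch of how each entry pairs a local path with a remote destination. It joins `s3_dst` and the remote file name with a `/`, the same way the upload rule in this commit composes its destination argument. The `config` dict is a hand-copied subset of the YAML above, and `upload_plan` is a hypothetical helper that does not exist in the repository.

```python
# Illustrative only: a hand-copied subset of the config values shown above.
config = {
    "s3_dst": "s3://nextstrain-data/files/workflows/dengue",
    "files_to_upload": {
        "genbank.ndjson.xz": "data/genbank.ndjson",
        "metadata_all.tsv.zst": "results/metadata_all.tsv",
        "sequences_denv1.fasta.zst": "results/sequences_denv1.fasta",
    },
}


def upload_plan(cfg):
    """Hypothetical helper: return (local_path, remote_url) pairs.

    Keys of `files_to_upload` are remote file names; values are local
    paths relative to the ingest directory.
    """
    return [
        (local_path, f"{cfg['s3_dst']}/{remote_file}")
        for remote_file, local_path in cfg["files_to_upload"].items()
    ]


for local_path, remote_url in upload_plan(config):
    print(f"{local_path} -> {remote_url}")
```

The remote names carry `.xz` or `.zst` suffixes that the local paths lack, which suggests that compression happens during the upload step. That is an inference from the naming; this diff does not show it.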
@@ -0,0 +1,42 @@
"""
This part of the workflow handles uploading files to AWS S3.

Files to upload must be defined in the `files_to_upload` config param, where
the keys are the remote files and the values are the local filepaths
relative to the ingest directory.

Produces a single file for each uploaded file:
    "results/upload/{remote_file}.upload"

The rule `upload_all` can be used as a target to upload all files.
"""
import os


rule upload_to_s3:
    input:
        file_to_upload=lambda wildcards: config["files_to_upload"][wildcards.remote_file],
    output:
        "results/upload/{remote_file}.upload",
    params:
        quiet="" if send_notifications else "--quiet",
        s3_dst=config["s3_dst"],
        cloudfront_domain=config["cloudfront_domain"],
    shell:
        """
        ./vendored/upload-to-s3 \
            {params.quiet} \
            {input.file_to_upload:q} \
            {params.s3_dst:q}/{wildcards.remote_file:q} \
            {params.cloudfront_domain} 2>&1 | tee {output}
        """


rule upload_all:
    input:
        uploads=[
            f"results/upload/{remote_file}.upload"
            for remote_file in config["files_to_upload"].keys()
        ],
    output:
        touch("results/upload_all.done")
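The two rules above cooperate through the `{remote_file}` wildcard: `upload_all` requests one marker file per config key, and `upload_to_s3` maps each marker back to a local file through its input lambda. The sketch below imitates that round trip in plain Python, without Snakemake. The regular expression is only a hypothetical stand-in for Snakemake's wildcard matching, `resolve_input` is an illustrative name, and the `config` dict is a hand-copied subset of the mapping.

```python
import re

# Illustrative subset of the `files_to_upload` mapping from the config.
config = {
    "files_to_upload": {
        "metadata_all.tsv.zst": "results/metadata_all.tsv",
        "sequences_all.fasta.zst": "results/sequences_all.fasta",
    }
}

# Hypothetical stand-in for matching the output pattern
# "results/upload/{remote_file}.upload"; Snakemake's own matcher is more general.
OUTPUT_PATTERN = re.compile(r"results/upload/(?P<remote_file>.+)\.upload")


def resolve_input(target):
    """Map a requested marker file back to the local file it uploads."""
    match = OUTPUT_PATTERN.fullmatch(target)
    if match is None:
        raise ValueError(f"{target!r} does not match the upload_to_s3 output")
    # Same dictionary lookup as the rule's input lambda.
    return config["files_to_upload"][match.group("remote_file")]


# Same list comprehension as the `upload_all` rule's input.
upload_targets = [
    f"results/upload/{remote_file}.upload"
    for remote_file in config["files_to_upload"].keys()
]

for target in upload_targets:
    print(f"{target} <- {resolve_input(target)}")
```

In this sketch, requesting a marker for a remote file that is not a key of `files_to_upload` raises a `KeyError` from the dictionary lookup, so only files listed in the config can be uploaded.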
This file was deleted.
File renamed without changes.
File renamed without changes.
File renamed without changes.