
Merge pull request #78 from PacificBiosciences/develop
v1.0.3
williamrowell authored Oct 20, 2023
2 parents 9b1fd57 + 6daf31b commit b6a2cd2
Showing 5 changed files with 15 additions and 9 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -19,7 +19,7 @@ PacBio WGS Variant Pipeline performs read alignment, variant calling, and phasing

## Setup

- Some tasks and workflows are pulled in from other repositories. Ensure you have initialized submodules following cloning by running `git submodule update --init --recursive`.
+ We recommend cloning the repo rather than downloading the release package. Some tasks and workflows are pulled in from other repositories. Ensure you have initialized submodules following cloning by running `git submodule update --init --recursive`.
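The submodule step matters because a plain clone leaves submodule directories empty until they are initialized. A minimal sketch of that behavior, using throwaway local repos rather than the real pipeline repository (all names and paths below are illustrative):

```shell
set -e
# Demonstrate `git submodule update --init --recursive` after a plain clone,
# with scratch repos standing in for the pipeline and its task submodules.
tmp=$(mktemp -d) && cd "$tmp"
git init -q sub                                    # stand-in for an upstream tasks repo
git -C sub -c user.email=x@example.com -c user.name=x commit -q --allow-empty -m init
git init -q pipeline                               # stand-in for the pipeline repo
git -C pipeline -c protocol.file.allow=always submodule add -q "$tmp/sub" tasks
git -C pipeline -c user.email=x@example.com -c user.name=x commit -q -m "add tasks submodule"
git -c protocol.file.allow=always clone -q pipeline work   # plain clone: tasks/ is left empty
cd work
git -c protocol.file.allow=always submodule update --init --recursive
git submodule status                               # now lists the pinned commit for tasks/
```

With the real repository, `git clone --recursive <url>` initializes submodules in the same step as the clone.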

## Resource requirements

@@ -117,7 +117,7 @@ A cohort can include one or more samples. Samples need not be related, but if you
| :- | :- | :- | :- |
| String | cohort_id | A unique name for the cohort; used to name outputs | |
| Array[[Sample](#sample)] | samples | The set of samples for the cohort. At least one sample must be defined. | |
- | Array[String] | phenotypes | [Human Phenotype Ontology (HPO) phenotypes](https://hpo.jax.org/app/) associated with the cohort. If no particular phenotypes are desired, the root HPO term, `HP:0000001`, can be used. | |
+ | Array[String] | phenotypes | [Human Phenotype Ontology (HPO) phenotypes](https://hpo.jax.org/app/) associated with the cohort. If no particular phenotypes are desired, the root HPO term, `"HP:0000001"`, can be used. | |
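For reference, a cohort matching the fields above might appear in an inputs JSON like the following sketch (the top-level key and the sample fields are illustrative, not taken from the repo):

```json
{
  "humanwgs.cohort": {
    "cohort_id": "family1",
    "samples": [
      { "sample_id": "sample1" }
    ],
    "phenotypes": ["HP:0000001"]
  }
}
```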

### [Sample](workflows/humanwgs_structs.wdl)

@@ -140,7 +140,7 @@ These files are hosted publicly in each of the cloud backends; see `backends/${b

| Type | Name | Description | Notes |
| :- | :- | :- | :- |
- | String | name | Reference name; used to name outputs (e.g., "GRCh38") | |
+ | String | name | Reference name; used to name outputs (e.g., "GRCh38") | Note: The workflow currently only supports GRCh38 and provides GCA_000001405.15_GRCh38_no_alt_analysis_set. |
| [IndexData](https://github.com/PacificBiosciences/wdl-common/blob/main/wdl/structs.wdl) | fasta | Reference genome and index | |
| File | tandem_repeat_bed | Tandem repeat locations used by [pbsv](https://github.com/PacificBiosciences/pbsv) to normalize SV representation | |
| File | trgt_tandem_repeat_bed | Tandem repeat sites to be genotyped by [TRGT](https://github.com/PacificBiosciences/trgt) | |
@@ -176,7 +176,7 @@ These files are hosted publicly in each of the cloud backends; see `backends/${b
| [DeepVariantModel](https://github.com/PacificBiosciences/wdl-common/blob/main/wdl/structs.wdl)? | deepvariant_model | Optional alternate DeepVariant model file to use | |
| Int? | pbsv_call_mem_gb | Optionally set RAM (GB) for pbsv_call during cohort analysis | |
| Int? | glnexus_mem_gb | Optionally set RAM (GB) for GLnexus during cohort analysis | |
- | Boolean? | run_tertiary_analysis | Run the optional tertiary analysis steps \[false\] | |
+ | Boolean? | run_tertiary_analysis | Run the optional tertiary analysis steps \[false\] | \[true, false\] |
| String | backend | Backend where the workflow will be executed | \["Azure", "AWS", "GCP", "HPC"\] |
| String? | zones | Zones where compute will take place; required if backend is set to 'AWS' or 'GCP'. | <ul><li>[Determining available zones in AWS](backends/aws/README.md#determining-available-zones)</li><li>[Determining available zones in GCP](backends/gcp/README.md#determining-available-zones)</li></ul> |
| String? | aws_spot_queue_arn | Queue ARN for the spot batch queue; required if backend is set to 'AWS' and `preemptible` is set to `true` | [Determining the AWS queue ARN](backends/aws/README.md#determining-the-aws-batch-queue-arn) |
4 changes: 2 additions & 2 deletions wdl-ci.config.json
@@ -419,7 +419,7 @@
"tasks": {
"pbsv_call": {
"key": "pbsv_call",
- "digest": "77yon47d6t327ocrw6bed3dccyq5t3va",
+ "digest": "o5xv2etbm2j4s32d5xs626xj6sp2ykmj",
"tests": [
{
"inputs": {
@@ -457,7 +457,7 @@
"tasks": {
"concat_vcf": {
"key": "concat_vcf",
- "digest": "ntfiawmetxbdacle2l7mpu5tkz2jmtz2",
+ "digest": "xkyvutmrg3gz6zgabdmwcjvcbwrbwwp7",
"tests": [
{
"inputs": {
5 changes: 4 additions & 1 deletion workflows/cohort_analysis/cohort_analysis.wdl
@@ -32,7 +32,9 @@ workflow cohort_analysis {
File gvcf_index = gvcf_object.data_index
}

- scatter (region_set in pbsv_splits) {
+ scatter (shard_index in range(length(pbsv_splits))) {
+ Array[String] region_set = pbsv_splits[shard_index]
+
call PbsvCall.pbsv_call {
input:
sample_id = cohort_id + ".joint",
Expand All @@ -41,6 +43,7 @@ workflow cohort_analysis {
reference = reference.fasta.data,
reference_index = reference.fasta.data_index,
reference_name = reference.name,
+ shard_index = shard_index,
regions = region_set,
mem_gb = pbsv_call_mem_gb,
runtime_attributes = default_runtime_attributes
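The scatter rewrite in this hunk follows a general WDL idiom: iterate over indices rather than values, so each shard also knows its position (here used to pass `shard_index` into `pbsv_call`, giving each shard a stable identifier). A generic sketch with illustrative names:

```wdl
# range(length(xs)) yields indices 0..n-1; indexing back into the array
# recovers the payload while keeping a stable per-shard index.
scatter (i in range(length(shards))) {
  Array[String] regions = shards[i]
  call some_task { input: shard_index = i, regions = regions }
}
```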
5 changes: 4 additions & 1 deletion workflows/sample_analysis/sample_analysis.wdl
@@ -68,14 +68,17 @@ workflow sample_analysis {
runtime_attributes = default_runtime_attributes
}
- scatter (region_set in pbsv_splits) {
+ scatter (shard_index in range(length(pbsv_splits))) {
+ Array[String] region_set = pbsv_splits[shard_index]
+
call PbsvCall.pbsv_call {
input:
sample_id = sample.sample_id,
svsigs = pbsv_discover.svsig,
reference = reference.fasta.data,
reference_index = reference.fasta.data_index,
reference_name = reference.name,
+ shard_index = shard_index,
regions = region_set,
runtime_attributes = default_runtime_attributes
}
