TSPS-326 add wdl to mask or subset a vcf given a bed file #136

Merged 4 commits on Sep 25, 2024

Conversation

jsotobroad
Collaborator

Description

We want to use this WDL for scientific validation of the reference panel. We will mask WGS samples down to an array build's sites and then impute; the result should be close to the original WGS sample.
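To make the masking step concrete, here is a minimal Python sketch of subsetting a VCF to the sites covered by a BED file. It is not the WDL from this PR (which may well wrap an existing tool); the function names `load_bed`, `in_bed`, and `subset_vcf` are hypothetical, and it assumes uncompressed text input, BED coordinates that are 0-based half-open, and matching on `POS` only (no allele or multi-base `REF` handling).

```python
from bisect import bisect_right
from collections import defaultdict


def load_bed(bed_lines):
    """Parse BED lines into per-chromosome sorted, merged (start, end) intervals."""
    raw = defaultdict(list)
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue  # skip blanks, comments, and UCSC track/browser lines
        chrom, start, end = line.split()[:3]
        raw[chrom].append((int(start), int(end)))
    merged = {}
    for chrom, ivs in raw.items():
        ivs.sort()
        out = [ivs[0]]
        for s, e in ivs[1:]:
            if s <= out[-1][1]:  # overlapping or adjacent: extend the last interval
                out[-1] = (out[-1][0], max(out[-1][1], e))
            else:
                out.append((s, e))
        merged[chrom] = out
    return merged


def in_bed(intervals, chrom, pos):
    """True if the 1-based VCF position falls inside a BED interval on chrom."""
    ivs = intervals.get(chrom)
    if not ivs:
        return False
    zero_based = pos - 1  # VCF POS is 1-based; BED starts are 0-based
    i = bisect_right(ivs, (zero_based, float("inf"))) - 1
    return i >= 0 and ivs[i][0] <= zero_based < ivs[i][1]


def subset_vcf(vcf_lines, intervals):
    """Yield header lines unchanged, plus records whose POS is inside the BED."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        chrom, pos = line.split("\t", 2)[:2]
        if in_bed(intervals, chrom, int(pos)):
            yield line


# Made-up example data: two array sites, three WGS records.
bed = ["chr1\t99\t100", "chr1\t199\t200"]
vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\t.\t.",
    "chr1\t150\t.\tC\tT\t.\t.\t.",
    "chr1\t200\t.\tG\tA\t.\t.\t.",
]
kept = list(subset_vcf(vcf, load_bed(bed)))
```

With this toy input, `kept` holds the two header lines plus the records at positions 100 and 200; the record at 150 is masked out because no BED interval covers it.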

Example of workflow running - https://bvdp-saturn-dev.appspot.com/#workspaces/general-dev-billing-account/tsps_gcp_scratch_space_mma/job_history/dd9e56bd-63d6-43d4-bd9f-93caf91dcdae

Jira Ticket

https://broadworkbench.atlassian.net/browse/TSPS-326

Collaborator

@mmorgantaylor mmorgantaylor left a comment


Is the only thing we care about that the proper sites are filtered down? Do we care about headers, annotations, or anything else that is expected for an array but wouldn't be present in a WGS VCF?

Have you run the output of this through imputation?

@@ -13,3 +13,18 @@ This wdl is basically a wrapper around that tool/image
#### Outputs
* recombined_reference_panel - output vcf after mitigation
algorithm has been run


## ReshapeReferencePanel
Collaborator


Suggested change
- ## ReshapeReferencePanel
+ ## SubsetVcfByBedFile

@jsotobroad
Collaborator Author

> Is the only thing we care about that the proper sites are filtered down? Do we care about headers, annotations, or anything else that is expected for an array but wouldn't be present in a WGS VCF?
>
> Have you run the output of this through imputation?

Outside of the dictionary lines, I'm not sure there are any other significant headers that we should care about.
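As a rough way to check the one header concern raised here, the sketch below compares the sequence dictionary of the original and subset VCFs. It assumes "dictionary lines" refers to the `##contig=` meta-information lines in the VCF header; the helper names `contig_header_lines` and `same_dictionary` are hypothetical and are not part of this PR.

```python
def contig_header_lines(vcf_lines):
    """Collect the ##contig header lines, stopping at the first record line."""
    contigs = []
    for line in vcf_lines:
        if not line.startswith("#"):
            break  # header is over; no need to scan the records
        if line.startswith("##contig="):
            contigs.append(line.rstrip("\n"))
    return contigs


def same_dictionary(original_vcf_lines, subset_vcf_lines):
    """True if both VCFs declare the same contigs in the same order."""
    return contig_header_lines(original_vcf_lines) == contig_header_lines(subset_vcf_lines)
```

Comparing the lists in order is a deliberate choice in this sketch, on the assumption that downstream tools may be sensitive to contig order as well as content.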

Collaborator

@mmorgantaylor mmorgantaylor left a comment


Please add a couple of lines to the README noting that we don't care about headers/annotations because the imputation tools don't care about those either.


sonarcloud bot commented Sep 25, 2024

@jsotobroad jsotobroad merged commit 7a5039e into main Sep 25, 2024
12 checks passed
@jsotobroad jsotobroad deleted the js_TSPS-326 branch September 25, 2024 20:03