Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add variants_filter_lcr, variants_filter_segdup and variants_snv_only options to annotate_sex to filter variants prior to variant only ploidy imputation #479

Merged
merged 14 commits into from
Sep 21, 2022

Conversation

jkgoodrich
Copy link
Contributor

@jkgoodrich jkgoodrich commented Sep 14, 2022

NOTE: This PR is stacked on the add_gaussian_karyotype PR #478

If helpful for testing:

import hail as hl
from gnomad_qc.v4.resources.basics import (
    calling_intervals,
    get_gnomad_v4_vds,
)
from bokeh.plotting import output_notebook, show

hl.init(default_reference='GRCh38')
output_notebook()

vds = get_gnomad_v4_vds(
    remove_hard_filtered_samples=False,
    remove_hard_filtered_samples_no_sex=True,
    test=True,
)
mt = hl.vds.to_merged_sparse_mt(vds)
calling_intervals=calling_intervals(interval_name='intersection', calling_interval_padding=50).ht()
freq_ht = hl.read_table("gs://gnomad/v4.0/sample_qc/exomes/gnomad.exomes.v4.0.f_stat_sites.ht")
sex_ht = annotate_sex(vds, included_intervals=calling_intervals, sites_ht=freq_ht, f_stat_cutoff=-1, aaf_expr="AF", gt_expr="LGT", use_gaussian_mixture_model=True)
sex_ht2 = annotate_sex(mt, included_intervals=calling_intervals, sites_ht=freq_ht, f_stat_cutoff=-1, aaf_expr="AF", gt_expr="LGT", use_gaussian_mixture_model=True)
sex_ht3 = annotate_sex(vds, included_intervals=calling_intervals, sites_ht=freq_ht, f_stat_cutoff=-1, aaf_expr="AF", gt_expr="LGT", use_gaussian_mixture_model=True, variants_only_x_ploidy=True)
sex_ht4 = annotate_sex(mt, included_intervals=calling_intervals, sites_ht=freq_ht, f_stat_cutoff=-1, aaf_expr="AF", gt_expr="LGT", use_gaussian_mixture_model=True, variants_only_x_ploidy=True)

show(hl.plot.scatter(sex_ht.chrX_ploidy, sex_ht.chrY_ploidy, label=sex_ht.sex_karyotype))
show(hl.plot.scatter(sex_ht2.chrX_ploidy, sex_ht2.chrY_ploidy, label=sex_ht2.sex_karyotype))
show(hl.plot.scatter(sex_ht3.chrX_ploidy, sex_ht3.chrY_ploidy, label=sex_ht3.sex_karyotype))
show(hl.plot.scatter(sex_ht4.chrX_ploidy, sex_ht4.chrY_ploidy, label=sex_ht4.sex_karyotype))

Base automatically changed from jg/add_gaussian_karyotype to main September 16, 2022 14:56
gnomad/sample_qc/pipeline.py Show resolved Hide resolved
Comment on lines +463 to +466
"x_ploidy": "chrX_ploidy",
"y_ploidy": "chrY_ploidy",
"x_mean_dp": "chrX_mean_dp",
"y_mean_dp": "chrY_mean_dp",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this in general the names we want to use or it is based on the reference used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what we used for UKBB and v3, so I was keeping the output consistent since this function was used in both of those repos

gnomad/sample_qc/pipeline.py Outdated Show resolved Hide resolved
gnomad/sample_qc/pipeline.py Show resolved Hide resolved
@jkgoodrich jkgoodrich requested a review from klaricch September 21, 2022 15:59
gnomad/sample_qc/pipeline.py Outdated Show resolved Hide resolved
gnomad/sample_qc/pipeline.py Outdated Show resolved Hide resolved
Co-authored-by: klaricch <kristen@broadinstitute.org>
@jkgoodrich jkgoodrich requested a review from klaricch September 21, 2022 17:09
Copy link
Contributor

@klaricch klaricch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@jkgoodrich jkgoodrich merged commit d06f1f4 into main Sep 21, 2022
@jkgoodrich jkgoodrich deleted the jg/annotate_sex_variant_filter_options branch September 21, 2022 17:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants