Add apply_keep_to_only_items_in_filter option to filter_arrays_by_meta #624

Merged: 8 commits merged into main from jg/add_option_to_filter_arrays_by_meta on Oct 13, 2023

jkgoodrich

I have a case where I want to filter my `freq_meta` down to

[{'group': 'adj'}, {'group': 'raw'}, {'gen_anc': 'afr', 'group': 'adj'}, {'gen_anc': 'amr', 'group': 'adj'}, {'gen_anc': 'asj', 'group': 'adj'}, {'gen_anc': 'eas', 'group': 'adj'}, {'gen_anc': 'fin', 'group': 'adj'}, {'gen_anc': 'mid', 'group': 'adj'}, {'gen_anc': 'nfe', 'group': 'adj'}, {'gen_anc': 'remaining', 'group': 'adj'}, {'gen_anc': 'sas', 'group': 'adj'}, {'group': 'adj', 'sex': 'XX'}, {'group': 'adj', 'sex': 'XY'}, {'gen_anc': 'afr', 'group': 'adj', 'sex': 'XX'}, {'gen_anc': 'afr', 'group': 'adj', 'sex': 'XY'}, {'gen_anc': 'amr', 'group': 'adj', 'sex': 'XX'}, {'gen_anc': 'amr', 'group': 'adj', 'sex': 'XY'}, {'gen_anc': 'asj', 'group': 'adj', 'sex': 'XX'}, {'gen_anc': 'asj', 'group': 'adj', 'sex': 'XY'}, {'gen_anc': 'eas', 'group': 'adj', 'sex': 'XX'}, {'gen_anc': 'eas', 'group': 'adj', 'sex': 'XY'}, {'gen_anc': 'fin', 'group': 'adj', 'sex': 'XX'}, {'gen_anc': 'fin', 'group': 'adj', 'sex': 'XY'}, {'gen_anc': 'mid', 'group': 'adj', 'sex': 'XX'}, {'gen_anc': 'mid', 'group': 'adj', 'sex': 'XY'}, {'gen_anc': 'nfe', 'group': 'adj', 'sex': 'XX'}, {'gen_anc': 'nfe', 'group': 'adj', 'sex': 'XY'}, {'gen_anc': 'remaining', 'group': 'adj', 'sex': 'XX'}, {'gen_anc': 'remaining', 'group': 'adj', 'sex': 'XY'}, {'gen_anc': 'sas', 'group': 'adj', 'sex': 'XX'}, {'gen_anc': 'sas', 'group': 'adj', 'sex': 'XY'}]

removing all downsamplings and subsets by specifying the meta keys I want to keep rather than those I want to remove.

    freq_meta, array_exprs = filter_arrays_by_meta(
        ht.freq_meta,
        {
            "freq": ht.freq,
            "freq_meta_sample_count": ht.index_globals().freq_meta_sample_count,
        },
        ["group", "gen_anc", "sex"],
        combine_operator="or",
        apply_keep_to_only_items_in_filter=True,
    )

I have no clue what to name this though, so feel free to suggest other names.

The `apply_keep_to_only_items_in_filter` parameter can be used to apply the `keep`
parameter to only the items specified in the `items_to_filter` parameter. For
example, by default, if:
    - `keep` is True
    - `combine_operator` is "and"
    - `items_to_filter` is ["sex", "downsampling"]
then all items in `meta_expr` with both "sex" and "downsampling" as keys will be
kept. However, if `apply_keep_to_only_items_in_filter` is True, then the items
in `meta_expr` will only be kept if "sex" and "downsampling" are the only keys in
the meta dict.
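
To make the difference concrete, here is a minimal plain-Python sketch of the behavior described above. It is not the code in `gnomad/utils/filtering.py`: the helper name `filter_meta_dicts`, its `only_items_in_filter` argument, and the four-entry `meta` list are hypothetical stand-ins, and it works on ordinary Python dicts rather than Hail expressions.

```python
from typing import Dict, List

MetaDict = Dict[str, str]


def filter_meta_dicts(
    meta: List[MetaDict],
    items_to_filter: List[str],
    keep: bool = True,
    combine_operator: str = "and",
    only_items_in_filter: bool = False,
) -> List[MetaDict]:
    """Hypothetical plain-Python model of key-based meta filtering."""
    if combine_operator not in ("and", "or"):
        raise ValueError("combine_operator must be 'and' or 'or'")
    combine = all if combine_operator == "and" else any
    allowed = set(items_to_filter)

    result = []
    for m in meta:
        # Does this dict contain all ("and") or any ("or") of the requested keys?
        matches = combine(k in m for k in items_to_filter)
        if only_items_in_filter:
            # Additionally require that the dict has no keys outside the filter.
            matches = matches and set(m) <= allowed
        # keep=True retains matching dicts; keep=False retains the rest.
        if matches == keep:
            result.append(m)
    return result


# Made-up metadata, for illustration only.
meta = [
    {"group": "adj"},
    {"group": "adj", "sex": "XX"},
    {"group": "adj", "sex": "XX", "downsampling": "100"},
    {"sex": "XX", "downsampling": "100"},
]

# Default: any dict that has both keys is kept, extra keys allowed.
default_and = filter_meta_dicts(meta, ["sex", "downsampling"])

# With the option on: only dicts whose keys are exactly "sex" and "downsampling".
strict_and = filter_meta_dicts(
    meta, ["sex", "downsampling"], only_items_in_filter=True
)

# The "or" case from this PR: keep dicts made up only of the listed keys.
strict_or = filter_meta_dicts(
    meta,
    ["group", "gen_anc", "sex"],
    combine_operator="or",
    only_items_in_filter=True,
)
```

In this sketch, `default_and` keeps the last two entries, `strict_and` keeps only the last one, and `strict_or` keeps the first two, which is the "drop anything carrying a downsampling or subset key" result I am after.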


mike-w-wilson left a comment

This looks good -- a couple of ideas on the arg naming.

jkgoodrich and others added 2 commits October 13, 2023 10:33
Co-authored-by: Mike Wilson <mwilson@broadinstitute.org>
mike-w-wilson left a comment

LGTM!

jkgoodrich merged commit 107b59b into main on Oct 13, 2023
3 checks passed
jkgoodrich deleted the jg/add_option_to_filter_arrays_by_meta branch October 13, 2023 17:50