v0.7.0
This release contained a function that required Hail >= 0.2.126. Please use a newer release.
What's Changed
Breaking Changes
- Update some gnomAD resources from lists to version dictionaries by @mike-w-wilson in #522
- Modifications to annotate_freq to improve memory use by @jkgoodrich in #577
Bug fixes
- Add get_slope_int_relationship_expr to get relationship between a pair of samples given slope and intercepts of lines to use as cutoffs by @jkgoodrich in #511
- Fix access to version's SUBSETS and POPS within repo by @mike-w-wilson in #529
- Small changes to bokeh module imports in utils.plotting that were failing with Hail update by @jkgoodrich in #540
- Fix filter_x_nonpar and filter_y_nonpar to use reference genome by @jkgoodrich in #553
- Fix callstats order in merge_freq_arrays by @jkgoodrich in #574
- Avoid DeprecationWarnings from superseded hail function and import [minor] by @jmarshall in #576
- Fix merge_freq_arrays for cases with more than two arrays by @jkgoodrich in #587
- Fix negative values issue with 'diff' by @KoalaQin in #590
- Fix ValueError for count_arrays in merge_freq_arrays function by @KoalaQin in #591
- Modify apply_rf_model to use vector_to_array from pyspark.ml.functions instead of udf by @matren395 in #592
- Fix to drop 'AS_SB' after converting to 'AS_SB_TABLE' in get_as_info_expr by @jkgoodrich in #602
- Fix to GKS Seqloc new_temp_file by @matren395 in #612
- Move ga4gh imports to their functions by @mike-w-wilson in #626
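For readers unfamiliar with the slope-and-intercept cutoffs mentioned for get_slope_int_relationship_expr (#511), the general idea can be illustrated in plain Python. This is only a minimal sketch and not the library's implementation: the function names, the two-line setup, and the labels below are all hypothetical, and the real function operates on Hail expressions rather than floats.

```python
from typing import Tuple

# A cutoff line is described by (slope, intercept), i.e. y = slope * x + intercept.
Line = Tuple[float, float]


def is_above_line(x: float, y: float, line: Line) -> bool:
    """Return True if the point (x, y) lies strictly above the given line."""
    slope, intercept = line
    return y > slope * x + intercept


def classify_pair_sketch(
    x: float,
    y: float,
    lower: Line,
    upper: Line,
    labels: Tuple[str, str, str] = ("below", "between", "above"),
) -> str:
    """Assign a (hypothetical) label to a sample pair from two cutoff lines.

    x and y stand in for two relatedness metrics computed for the pair.
    The pair is labelled by whether it falls below the lower line, above
    the upper line, or in the band between the two.
    """
    if not is_above_line(x, y, lower):
        return labels[0]
    if is_above_line(x, y, upper):
        return labels[2]
    return labels[1]


# Example with made-up cutoffs: two parallel lines with slope -1.
lower_cutoff = (-1.0, 0.3)
upper_cutoff = (-1.0, 0.6)
print(classify_pair_sketch(0.1, 0.1, lower_cutoff, upper_cutoff))
print(classify_pair_sketch(0.2, 0.3, lower_cutoff, upper_cutoff))
print(classify_pair_sketch(0.4, 0.5, lower_cutoff, upper_cutoff))
```

Expressing each cutoff as a line rather than a fixed threshold lets the boundary for one metric shift with the value of the other, which is the motivation the release note hints at.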
New Features
- Add generic constraint function annotate_constraint_groupings() by @averywpx in #497
- Add an option for samples that must be kept to compute_related_samples_to_drop by @jkgoodrich in #506
- Add determine_nearest_neighbors to find nearest neighbors for each sample. Modify compute_stratified_metrics_filter to work with a comparison_sample_expr that specifies what samples to compare to for filtering; this works well with the output of determine_nearest_neighbors. by @jkgoodrich in #509
- Add utility function to repartition HTs prior to join by @ch-kr in #512
- Add VEP 105 init script and its docker image by @KoalaQin in #516
- Add VEP 105 GRCh38 context HT resource by @jkgoodrich in #524
- Add additional groupings to optional stratified allele frequencies by @KoalaQin in #523
- Add 'strata' and 'qc_metrics' as globals on the table returned by compute_stratified_metrics_filter by @jkgoodrich in #521
- Modify annotate_mutation_type to take optional context length as a parameter by @jkgoodrich in #530
- Add generic constraint functions: oe_aggregation_expr(), compute_pli(), oe_confidence_interval(), calculate_raw_z_score(), calculate_raw_z_score_sd() by @averywpx in #505
- Add dbSNP b156 to resources for v4 by @KoalaQin in #525
- Add pab_max_expr function and modify default_compute_info to add 'AS_pab_max' annotation by @jkgoodrich in #531
- Add generic constraint functions: get_downsamplings(), remove_coverage_outliers(), and filter_for_mu() by @averywpx in #507
- Add ac_filter_groups to default_compute_info allowing additional allele count groupings by @jkgoodrich in #534
- Add global annotations for 'vep_version', 'vep_help', and 'vep_config' to the returned Table in vep_or_lookup_vep by @jkgoodrich in #536
- Add annotate_allele_info function to utils.annotations by @jkgoodrich in #535
- Add validity check code of VEP annotations in protein-coding genes by @KoalaQin in #548
- Merge freq array function and new frequency dictionary builder by @mike-w-wilson in #551
- Add GRCh38 methylation sites resource by @jkgoodrich in #552
- Modify comparison_sample_expr parameter of compute_stratified_metrics_filter to also accept a BooleanExpression by @jkgoodrich in #557
- Add parameters apply_model_func and convert_model_func to assign_population_pcs so it has the ability to work with other model types by @jkgoodrich in #558
- Add sample_list_stratification option to create_fake_pedigree function by @jkgoodrich in #564
- Modify default_compute_info with the option to use the AS_ annotations in gvcf_info for allele specific aggregations by @jkgoodrich in #560
- Modify annotate_adj to support LGT and LAD by @jkgoodrich in #567
- Function to annotate downsamplings onto HT/MT by @mike-w-wilson in #570
- Add function to merge histograms with the same bin_edges by @mike-w-wilson in #572
- Add option to also merge an array of counts/ints in the freq array merge by @mike-w-wilson in #565
- Update annotate_freq and qual_hists, add split_vds and compute_freq_by_strata by @mike-w-wilson in #571
- Add function update_structured_annotations to update structured annotations on a Table by @KoalaQin in #580
- Make naive_coalesce optional in default_compute_info by @jkgoodrich in #584
- Add function to remove items from freq and freq_meta by @KoalaQin in #582
- Add a select_fields option to compute_freq_by_strata by @jkgoodrich in #595
- Modify split_info_annotation to allow for splitting an info expression that doesn't include AS_SB_TABLE by @jkgoodrich in #594
- Update to allow for grouping and filtering by MANE transcripts by @klaricch in #605
- Add gnomad_gks() and get_gks() for extracting gks information for a specified variant by @matren395 in #596
- Add aggregations to variant QC evaluation for additional plots by @jkgoodrich in #609
- Add function to get max FAF from faf_expr by @KoalaQin in #608
- Add optional stratification parameter to coverage by @jkgoodrich in #615
- Add methylation resource for chrX by @klaricch in #622
- Add pop_label option to pop_max_expr, faf_expr, and gen_anc_faf_max_expr by @jkgoodrich in #623
- Add apply_keep_to_only_items_in_filter option to filter_arrays_by_meta by @jkgoodrich in #624
- Add pprint globals and a global/row length comparison, updates monoallelic expr in validity checks by @mike-w-wilson in #630
- Add MANE Select filtering option to get_summary_counts by @jkgoodrich in #634
- Add optional parameters to set_female_y_metrics_to_na_expr to use other frequency fields by @jkgoodrich in #635
- Update resource paths by @klaricch in #642
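As an illustration of what merging histograms with the same bin_edges (#572) involves, the following plain-Python sketch sums the per-bin counts of several histograms after checking that their bin edges match. It is an assumption-laden sketch, not the function added in this release: the name merge_hists_sketch is hypothetical, and the dictionaries simply mirror the bin_edges, bin_freq, n_smaller, and n_larger fields of a Hail histogram struct.

```python
from typing import Dict, List, Sequence


def merge_hists_sketch(hists: Sequence[Dict]) -> Dict:
    """Merge histograms that share identical bin edges.

    Each histogram is a dict with keys 'bin_edges', 'bin_freq',
    'n_smaller', and 'n_larger'. Counts are summed bin by bin.
    """
    if not hists:
        raise ValueError("At least one histogram is required.")

    bin_edges: List[float] = list(hists[0]["bin_edges"])
    for hist in hists[1:]:
        # Summing counts is only meaningful when the bins line up exactly.
        if list(hist["bin_edges"]) != bin_edges:
            raise ValueError("All histograms must have the same bin_edges.")

    return {
        "bin_edges": bin_edges,
        # zip(*...) groups the counts for the same bin across histograms.
        "bin_freq": [sum(counts) for counts in zip(*(h["bin_freq"] for h in hists))],
        "n_smaller": sum(h["n_smaller"] for h in hists),
        "n_larger": sum(h["n_larger"] for h in hists),
    }


# Example with made-up counts for two histograms over the same two bins.
hist_a = {"bin_edges": [0.0, 0.5, 1.0], "bin_freq": [1, 2], "n_smaller": 0, "n_larger": 1}
hist_b = {"bin_edges": [0.0, 0.5, 1.0], "bin_freq": [3, 4], "n_smaller": 1, "n_larger": 2}
print(merge_hists_sketch([hist_a, hist_b]))
```

Rejecting mismatched bin edges up front, rather than re-binning, keeps the merge exact; re-binning would require the underlying values, which a histogram no longer carries.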
Other Changes
- Update doc requirements.doc.txt by @jkgoodrich in #520
- Bump requests from 2.28.2 to 2.31.0 in /docs by @dependabot in #543
- Add VEP 105 CSQ FIELDs by @KoalaQin in #546
- Update python 3.8 -> 3.11 by @jkgoodrich in #578
- Add ability to retrieve max for any threshold in faf by @mike-w-wilson in #616
- Remove inadvertent tuple in popMaxFAF95 field in add_gks_va function by @mattsolo1 in #621
- Update requirements files by @jkgoodrich in #632
- Update HGDP pops by @KoalaQin in #631
- Revert tuple type in build_models by @klaricch in #638
- Check for skip_coverage_model is False in build_models by @klaricch in #639
New Contributors
- @KoalaQin made their first contribution in #516
- @jmarshall made their first contribution in #576
- @mattsolo1 made their first contribution in #621
Full Changelog: v0.6.4...v0.7.0