Restore repeat subunit length re-computation for normalization (#352)
* Restore line that recomputes repeat subunit length

* Added additional HGVS test case

* Only recompute subunit length for ambiguous insertions or deletion/insertions

* Added test case with ambiguous deletion
ehclark authored Feb 23, 2024
1 parent 64fee4c commit 2e3d2be
Showing 7 changed files with 1,189 additions and 3 deletions.
1 change: 1 addition & 0 deletions src/ga4gh/vrs/normalize.py
@@ -194,6 +194,7 @@ def _normalize_allele(input_allele, data_proxy, rle_seq_limit=50):
 len_extended_ref = len(extended_ref_seq)

 if len_extended_alt > len_extended_ref:
+    repeat_subunit_length = math.gcd(len_extended_ref, len_extended_alt)
     repeat_sequence = itertools.cycle(extended_ref_seq[:repeat_subunit_length])
     ref_derived_alt = ''.join([next(repeat_sequence) for _ in range(len_extended_alt)])
     # TODO: The space and time efficiency can be improved by iterating over the new_allele[1]
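
To illustrate what the restored line does, here is a minimal standalone sketch of the gcd-based subunit computation. It is not the library's code: the function name `derive_alt_from_ref` and the sample sequences are hypothetical, and the sketch assumes a non-empty extended reference sequence. The subunit length is taken as the greatest common divisor of the two extended lengths, and the first subunit of the reference is cycled until the alternate length is reached.

```python
import itertools
import math


def derive_alt_from_ref(extended_ref_seq: str, len_extended_alt: int) -> str:
    """Hypothetical helper: cycle the reference's repeat subunit up to the alternate length."""
    if not extended_ref_seq:
        # Assumption of this sketch: an empty reference has no subunit to cycle.
        raise ValueError("extended_ref_seq must be non-empty")
    len_extended_ref = len(extended_ref_seq)
    # Recompute the subunit length from both lengths, mirroring the restored line.
    repeat_subunit_length = math.gcd(len_extended_ref, len_extended_alt)
    # Repeat the leading subunit of the reference indefinitely.
    repeat_sequence = itertools.cycle(extended_ref_seq[:repeat_subunit_length])
    return "".join(next(repeat_sequence) for _ in range(len_extended_alt))


# A 6-base reference extended to 9 bases: gcd(6, 9) = 3, so the subunit is "CAG".
print(derive_alt_from_ref("CAGCAG", 9))
# A 6-base reference extended to 8 bases: gcd(6, 8) = 2, so the subunit is "CA".
print(derive_alt_from_ref("CAGCAG", 8))
```

The second call shows why the recomputation matters: the subunit used to rebuild the sequence depends on both lengths, so a subunit length carried over from an earlier step could produce a different derived sequence.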