Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore repeat subunit length re-computation for normalization #352

Merged
merged 4 commits into from
Feb 23, 2024

Conversation

ehclark
Copy link
Contributor

@ehclark ehclark commented Feb 22, 2024

See #351 for more details.

Recompute repeat subunit length to the greatest common denominator of the ref and alt sequence lengths when determining whether the allele state can be represented as a ReferenceLengthExpression.

Added an additional HGVS expression for the translator tests that verifies the fix is operating as expected.

@ehclark ehclark requested review from a team as code owners February 22, 2024 18:38
@ehclark ehclark linked an issue Feb 22, 2024 that may be closed by this pull request
Copy link
Contributor

@larrybabb larrybabb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 good catch

@larrybabb larrybabb merged commit 2e3d2be into main Feb 23, 2024
8 checks passed
@larrybabb larrybabb deleted the issue-351 branch February 23, 2024 02:28
@ahwagner
Copy link
Member

Hey. Sorry for the delay in responding, have been in meetings all day.

@ehclark this is a good catch, thank you. I did add, and then remove, the gcd function out of concern for how that works in cases where the insertion happens in regions with non-integer repeats of the repeat subunit.

e.g.:

CAGCA      <- reference
CAGCAGCAGCA   <- with insertion

In this case, the extended ref length = 5 and the extended alt length = 11, and the gcd = 1; so this would end up (erroneously) as an LSE, when in fact this could be explained by insertion of two repeats of reference subunit of length 3.

Seeing that the reversion left open the door to issues like what you raised in #351, however, I have updated the code to capture all of these cases. I am going to restore this branch and reopen the PR with the additional commit, and an additional test that covers this type of event.

@ahwagner ahwagner restored the issue-351 branch February 23, 2024 05:58
@ahwagner ahwagner mentioned this pull request Feb 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Repeat subunit length not always correct
3 participants