More refactoring PDHCE and preparing for joint detection #8492

Merged
davidbenjamin merged 21 commits into master from db_pdhce_8_23
Sep 21, 2023

davidbenjamin
Contributor

@davidbenjamin davidbenjamin commented Aug 24, 2023

@jamesemery here's another one for you.

@davidbenjamin
Contributor Author

@jamesemery Looks like I got that "glorious commit" to work after all. Now the second-to-last commit will take some scrutiny; the rest are straightforward.

@jamesemery
Collaborator

@davidbenjamin See #8467, where I guess I reviewed this already? Is this substantially different?

@davidbenjamin
Contributor Author

@jamesemery Okay, we've got some really good tests for EventGroup clustering and branching now. It's ready for you to take another look.

Collaborator

@jamesemery jamesemery left a comment

One minor oversight in one of the added tests, plus a few trivial comments. Feel free to merge this when you are satisfied that you've responded to everything. I would recommend running this through the FE framework before merging, just to be really sure that nothing is going to break on the full dataset.

// TODO: this needs to generalize to a set of determined events or empty if ref is determined
this.determinedEvents = determinedEvents;

final int minStart = allEventsAtDeterminedLocus.stream().mapToInt(Event::getStart).min().orElse(determinedPosition);
Collaborator

This is mostly a warning for the next stage to be careful about this part of the code. The PDHMM has a few optimizations in it that make this change a little dangerous. First, it doesn't actually run for reads that do not overlap the determined event in the PDhaplotype, which means that the arrays might have unpopulated events in them that you REALLY do not want to include in the math anywhere. In theory you can simply extend this to run the genotyper for any event that overlaps any position in the determined extent, so this should probably not cause an issue, but be careful.
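
To make the warning above concrete, here is a minimal, self-contained sketch of the "determined extent" idea: take the smallest start and largest end over the determined events (falling back to the determined position when there are none, i.e. ref is determined) and only score reads that overlap that span. `SimpleEvent`, `maxEnd`, and `readOverlapsDeterminedSpan` are hypothetical names for illustration; this is not GATK's `Event` class or the PDHMM's actual read filter.

```java
import java.util.List;

public class DeterminedSpanSketch {
    // Hypothetical stand-in for a variant event with closed coordinates; not GATK's Event.
    static final class SimpleEvent {
        final int start;
        final int end;
        SimpleEvent(int start, int end) { this.start = start; this.end = end; }
        int getStart() { return start; }
        int getEnd() { return end; }
    }

    // Smallest start among the events; falls back to the determined position for an empty list.
    static int minStart(List<SimpleEvent> events, int determinedPosition) {
        return events.stream().mapToInt(SimpleEvent::getStart).min().orElse(determinedPosition);
    }

    // Largest end among the events; same fallback as minStart.
    static int maxEnd(List<SimpleEvent> events, int determinedPosition) {
        return events.stream().mapToInt(SimpleEvent::getEnd).max().orElse(determinedPosition);
    }

    // Closed-interval overlap test between a read and the determined span.
    static boolean readOverlapsDeterminedSpan(int readStart, int readEnd,
                                              List<SimpleEvent> events, int determinedPosition) {
        return readStart <= maxEnd(events, determinedPosition)
                && readEnd >= minStart(events, determinedPosition);
    }

    public static void main(String[] args) {
        List<SimpleEvent> events = List.of(new SimpleEvent(100, 100), new SimpleEvent(98, 103));
        System.out.println(readOverlapsDeterminedSpan(90, 99, events, 100));      // true
        System.out.println(readOverlapsDeterminedSpan(104, 150, events, 100));    // false
        System.out.println(readOverlapsDeterminedSpan(100, 100, List.of(), 100)); // true
    }
}
```

Reads that fail this test would be skipped entirely, which is why any per-read arrays for them should be treated as unpopulated rather than fed into downstream math.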

* @return
*/
public List<Set<Event>> setsForBranching(final List<Event> locusEvents, final Set<Event> determinedEvents, final boolean disallowSubsets) {
public List<Set<Event>> eventSetsForPDHaplotypes(final Set<Event> determinedEvents, final List<Event> locusEvents) {
final SmallBitSet locusOverlapSet = overlapSet(locusEvents);
Collaborator

Consider renaming overlapSet -> intersectVsEventGroup to make it clearer what is happening here.

Collaborator

or perhaps just a comment
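
For readers unfamiliar with the bitset idiom behind the naming discussion above, the following sketch shows one plausible reading of "intersect a list of events against an event group": each event in the group gets a bit index, and the result has a bit set for every group event that also appears in the query list. This uses a plain `int` and hypothetical names (`maskWithinGroup`, `intersects`); it is an assumption about the general technique, not the actual `SmallBitSet` API or the body of `overlapSet`.

```java
import java.util.List;

public class EventMaskSketch {
    // Bit i of the result is set when the i-th group event also appears in queryEvents.
    // Events are plain strings here for illustration; the group must have fewer than 32 events.
    static int maskWithinGroup(List<String> groupEvents, List<String> queryEvents) {
        int mask = 0;
        for (int i = 0; i < groupEvents.size(); i++) {
            if (queryEvents.contains(groupEvents.get(i))) {
                mask |= 1 << i;
            }
        }
        return mask;
    }

    // Two subsets of the group share at least one event when their masks share a set bit.
    static boolean intersects(int a, int b) {
        return (a & b) != 0;
    }

    public static void main(String[] args) {
        List<String> group = List.of("A", "B", "C", "D");
        int locus = maskWithinGroup(group, List.of("B", "D", "X")); // "X" is not in the group, so it is ignored
        int determined = maskWithinGroup(group, List.of("B"));
        System.out.println(Integer.toBinaryString(locus));  // 1010
        System.out.println(intersects(locus, determined));  // true
    }
}
```

A one-line comment at the call site saying "bits mark which events of this group are present in the input" would serve the same purpose as the rename.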

@gatk-bot

Github actions tests reported job failures from actions build 6243495413
Failures in the following jobs:

| Test Type | JDK | Job ID | Logs |
| --- | --- | --- | --- |
| variantcalling | 17.0.6+10 | 6243495413.2 | logs |

@davidbenjamin
Copy link
Contributor Author

Functional equivalence results are unchanged, as they should be. I will merge now.

@davidbenjamin davidbenjamin merged commit 70e047d into master Sep 21, 2023
20 checks passed
@davidbenjamin davidbenjamin deleted the db_pdhce_8_23 branch September 21, 2023 16:05
rickymagner pushed a commit that referenced this pull request Nov 28, 2023
* streamlining EventGroup code

* PartiallyDeterminedHaplotype can handle multiple determined events

* Branching in terms of included events, not excluded events

* unit tests for event group clustering and branching
3 participants