-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Evaluate triggerables for all repeat instances where needed #586
Conversation
This was previously used to compute repeat relevance. However, TreeElement fully captures relevance rules. See getodk#581
Was always false so we can remove the branch
For performance reasons, this needs to only be done if we know the changed reference is a repeat. It's not possible to quickly identify whether a TreeReference represents a repeat/repeat instance so we need some information from createRepeatInstance. We could pass a boolean representing whether the changed node is a repeat but this seems clearer to me.
This illustrates the first case in getTriggerablesAffectingAllInstances.
We've captured the count contextualization issue elsewhere and don't have many predicate examples.
addingNestedRepeatInstance_updatesExpressionTriggeredByGenericRef_forAllRepeatInstances has a similar count expression
Repeat instance deletion was treated as a special case so didn't apply here. The back reference case is a regression.
This comment has been minimized.
This comment has been minimized.
JR makes a distinction between predicates and multiplicity but really multiplicity should just be a position predicate. It doesn't make sense to have both a multiplicity and a predicate at a particular reference step. If that would happen, favor the predicate.
Impressive work both on the implementation side and the effort for having every corner case characterized in tests! 👍 👍 👍 👍 👍 We will have a chat in a couple of minutes, but I want to write down some thoughts/comments:
I think it does, but only if we want to cover that corner case. It doesn't look like something you would find very often, so it could be OK to document it and ignore it for now.
I suggest we add a benchmark around that to keep it under control.
We've already established that the original implementation can't be explained. We even guess that it's an accidental implementation. I think that replacing the implementation is something that we can't really avoid. Now the question would be: how hard is going to be to add a new scenario we don't want know about? My opinion is that now we would be much better suited to change the DAG's behavior than in v2.17. This is not only because we've learned about the DAG's internals, but also because the code has a better design and there's a huge amount of tests that will let us work safely and with confidence.
Let's document this and provide form designers with alternative implementations that follow best practices. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Awesome work!
Closes #584
Remaining questions:
initializeTriggerables
need to consider triggerables that affect all instances?addingRepeatInstance_withInnerCalculateDependentOnOuterSum_updatesInnerSumForAllInstances()
gets at something related.What has been done to verify that this works as intended?
Ensured that all tests that passed on v2.17.0 still pass. I analyzed the different types of triggerable cascades and wrote tests for them.
Why is this the best possible solution? Were any other approaches considered?
The alternative we have considered is bringing back the previous map (master...ggalmazor:firedAnchors_exploration). Neither @ggalmazor nor I can explain with confidence what cases that approach covered or why it was a good design. We know it didn't cover cases that we have tests for. Because the map both gets updated as triggerables are triggered and also feeds into which triggerables are triggered, it's hard to reason about. Additionally, it ends up bypassing the
expandReference
call inevaluateTriggerable
in certain cases where every repeat instance must be updated which is also very confusing.This PR proposes a different approach where the triggerables that need to be updated across repeat instances are identified on creation or removal of repeat. This makes it very clear that reference expansion is only relevant to repeats. It also makes it possible to have a single method (
getTriggerablesAffectingAllInstances
) that does all that work. We can then rely on a check on the set it produces when evaluating individual triggerables.I considered using the existing iteration over all the cascades triggered by the changed repeat in
getAllToTrigger
to identify triggerables that affect all repeat instance but I found that I couldn't efficiently look up whether the changedRef is a repeat instance so I would have had to pass that in. I am confident that the DAG traversal is not a major bottleneck so I went with the clearer approach.How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?
The only intentional changes are bug fixes. There are three big types of risk: someone was relying on one of those bugs, there's a regression in behavior we don't have test coverage for, performance has degraded.
The big change in behavior compared to v2.17.0 is that adding a repeat instance will trigger recomputation of any calculation inside that repeat which refers to the generic repeat such as
count(../../repeat)
orposition(..)
. I don't think anyone could have been relying on that bad behavior. I think there is a performance implication forposition(..)
but I haven't had a chance to investigate yet.For performance, here is the
ChildVaccination
benchmark:this code
Benchmark Mode Cnt Score Error Units
ChildVaccinationBenchmark.run_1_times thrpt 25 1.405 ± 0.107 ops/s
ChildVaccinationBenchmark.run_2_times thrpt 25 0.685 ± 0.036 ops/s
ChildVaccinationBenchmark.run_3_times thrpt 25 0.472 ± 0.027 ops/s
v2.17.0 (last released)
ChildVaccinationBenchmark.run_1_times thrpt 25 1.121 ± 0.015 ops/s
ChildVaccinationBenchmark.run_2_times thrpt 25 0.564 ± 0.012 ops/s
ChildVaccinationBenchmark.run_3_times thrpt 25 0.372 ± 0.006 ops/s
master
Benchmark Mode Cnt Score Error Units
ChildVaccinationBenchmark.run_1_times thrpt 25 1.432 ± 0.179 ops/s
ChildVaccinationBenchmark.run_2_times thrpt 25 0.709 ± 0.101 ops/s
ChildVaccinationBenchmark.run_3_times thrpt 25 0.540 ± 0.104 ops/s
Guillermo firedAnchors revert
Benchmark Mode Cnt Score Error Units
ChildVaccinationBenchmark.run_1_times thrpt 25 1.217 ± 0.060 ops/s
ChildVaccinationBenchmark.run_2_times thrpt 25 0.625 ± 0.015 ops/s
ChildVaccinationBenchmark.run_3_times thrpt 25 0.413 ± 0.006 ops/s
This unambiguously is faster than v2.17.0 for the cases in ChildVaccination. It compares favorably to bringing back
firedAnchors
.One case I know will be slower is when there's a
position(..)
expression and a new repeat instance is added. Previously this would only be evaluated for the newly-created repeat. Now it will be evaluated for every repeat instance. This has a non-negligible impact when we get to hundreds of repeats and it would be more significant if there's a long cascade based on position. The tradeoff is that we don't need a special approach for deletes (where position does change for nodes after the deleted one) and that count/sum from inside the repeat updates correctly (modulo issues contextualizing absolute refs illustrated by tests). I think that tradeoff is worth it because I haven't seen many forms with hundreds of repeat instances or many forms that use position. Naturally, there are some that do both and that's one thing to get user feedback on. If it is a big problem, we can remove the first case ingetTriggerablesAffectingAllInstances
and match thev2.17.0
behavior.I think that the worst case would be that
firedAnchors
covers some case we haven't identified. Even then I'd say the possible user impact isn't huge because there are more forms without repeats than with repeats. We now have fairly significant test coverage so it seems unlikely that we've missed a common case.Do we need any specific form for testing your changes? If so, please attach one.
No. Tests capture the interesting cases that I can think of. Ideally a reviewer would do another pass thinking of cases I might have missed.
Does this change require updates to documentation? If so, please file an issue here and include the link below.
Yes, I will follow up with updated DAG documentation.