Improve collection types #523
Conversation
Codecov Report
```diff
@@             Coverage Diff              @@
##             master     #523      +/-   ##
============================================
- Coverage     53.11%   52.87%    -0.25%
+ Complexity     3127     3104       -23
============================================
  Files           245      245
  Lines         13374    13340       -34
  Branches       2573     2566        -7
============================================
- Hits           7104     7053       -51
- Misses         5443     5467       +24
+ Partials        827      820        -7
```
Continue to review full report at Codecov.
Christmas is here and I know it's going to be hard to push this one forward, especially since it requires some degree of agreement about the correctness and completeness of the DAG test cases. Since I'd like to continue working on the codebase on top of this PR, it would be great to at least have a sanity check of the code changes themselves, focusing on how memory usage has changed compared to the current implementation, which reuses objects by passing them as input args to methods and mutating them (side effects) inside them. If we determine that JR would now use more memory than before, it would be great to find a way to keep the immutable, side-effect-free implementation in this PR and, only if that's not possible, change it back to a side-effect-based, object-reusing style. @dcbriccetti, I was hoping that you could help us with this ;)
I've rebased the branch of this PR on top of #526 to make it more concise. I've also taken the chance to divide some commits into smaller chunks to make them easier to review and give more context.
I've also checked that no object cloning is done during the DAG build process, which supports our current understanding that the immutable, side-effect-free code in this PR shouldn't noticeably increase memory usage. Looking forward to @dcbriccetti's take on this ;)
This removes the need for duplicate checking and makes explicit that this set is dependent on (insertion) order.
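The switch described above can be sketched with `java.util.LinkedHashSet`, which rejects duplicates in O(1) and iterates in insertion order. The reference strings below are made up for illustration; this isn't JavaRosa's actual code:

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class InsertionOrderSetDemo {
    public static void main(String[] args) {
        // A LinkedHashSet replaces a List plus a manual contains() check:
        // duplicates are rejected in O(1) and iteration follows insertion order.
        Set<String> targets = new LinkedHashSet<>();
        targets.add("/data/a");
        targets.add("/data/b");
        targets.add("/data/a"); // duplicate, silently ignored

        System.out.println(targets); // prints [/data/a, /data/b]
    }
}
```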
We need the code to be easier to understand when it comes to recursive algorithms. Mutation of input arguments makes it super hard, and this commit tries to solve that by making those methods return something instead.
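As a rough sketch of the style change this commit describes (the graph shape and method names below are hypothetical, not JavaRosa's actual API), compare an accumulator-mutating recursion with one that returns its result:

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ReturnOverMutation {
    // Hypothetical dependency graph: node -> direct children. Assumed acyclic.

    // Mutating style: the caller-supplied accumulator changes as a side effect,
    // which makes the recursion harder to reason about.
    static void collectInto(Map<String, List<String>> children, String node, Set<String> acc) {
        for (String child : children.getOrDefault(node, List.of())) {
            if (acc.add(child)) {
                collectInto(children, child, acc);
            }
        }
    }

    // Side-effect-free style: each call returns its own result set.
    static Set<String> collect(Map<String, List<String>> children, String node) {
        Set<String> result = new LinkedHashSet<>();
        for (String child : children.getOrDefault(node, List.of())) {
            result.add(child);
            result.addAll(collect(children, child));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = Map.of("a", List.of("b", "c"), "b", List.of("c"));
        System.out.println(collect(g, "a")); // prints [b, c]
    }
}
```

Both versions visit the same nodes; the second one trades a few short-lived sets for the clarity of an explicit return value, which is the tradeoff this PR discusses.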
It turns out that the newDestinationSet variable was redundant and it looks like the real output of the method is the deps set (to be removed in the next commit)
Remove the output arg. The block where elements get added can be further simplified using Map.getOrDefault or the alternative code that is compatible with our target Android API level (next commit)
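A hedged illustration of the two options mentioned (the map and key names are invented for the example): `Map.getOrDefault` requires Java 8 (Android API level 24, as I understand it), while the explicit null check works on older API levels:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class GetOrDefaultDemo {
    public static void main(String[] args) {
        Map<String, Set<String>> deps = new HashMap<>();

        // Java 8+ style (may be above the minimum Android API level):
        Set<String> a = deps.getOrDefault("x", new HashSet<>());
        a.add("/data/first");
        deps.put("x", a);

        // Alternative compatible with older API levels:
        Set<String> b = deps.get("y");
        if (b == null) {
            b = new HashSet<>();
            deps.put("y", b);
        }
        b.add("/data/second");

        System.out.println(deps.get("x") + " " + deps.get("y"));
    }
}
```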
Simplify adding elements into the result set
Now the algorithm is easier to understand with no overhead (same number of iterations in the inner loop)
We want to extract some methods to structure the process a bit more
There's no need for clearing the collection if we're reassigning it
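To illustrate the point with hypothetical names: clearing and refilling a collection does extra work and mutates an instance that other code may hold a reference to, while reassigning the field simply leaves the old collection for the garbage collector:

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class ReassignVsClear {
    Set<String> edges = new LinkedHashSet<>();

    void rebuildByClearing(Set<String> fresh) {
        edges.clear();      // extra work; anyone holding the old reference sees the wipe
        edges.addAll(fresh);
    }

    void rebuildByReassigning(Set<String> fresh) {
        edges = new LinkedHashSet<>(fresh); // old set becomes garbage, no clear() needed
    }

    public static void main(String[] args) {
        ReassignVsClear r = new ReassignVsClear();
        r.rebuildByReassigning(Set.of("a->b"));
        System.out.println(r.edges); // prints [a->b]
    }
}
```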
Impressive work.
Overall, I am convinced that this is safe and that the changes are easy to follow. After our conversation about the impact of creating additional collections at each recursive step, I am convinced that the memory impact will be minimal. The garbage collector will presumably be working harder but hopefully that wouldn't have much impact even on underpowered devices.
There are two commits I'd like to spend a little more time with when I have a fresh head: 73e6364 and 86e2745. I still wanted to get you some comments in the meantime.
As a result, some methods can be made static. Also review visibility and set it to private where possible
I reworded the commit messages to fit the subject under 72 chars and didn't realize that the links to the commits you were hoping to study further would change as a result. Sorry, @lognaturel! 73e6364 is still there, but the one about avoiding hitting the main instance is now in 2bd11f6.
The inlined method was confusing and misleading: the nested loop didn't make much sense, and the naming suggested that the method would return all descendants recursively, which was neither needed nor actually happening. By using language closer to DAG terminology, I hope it's now easier to understand what the buildDagEdges method does.
We need a test that explores deeply nested computations and repeat groups before 2bd11f6.
If source can't cascade, then the targets set will be empty.
Also improve doc blocks
Avoid confusion by contextualizing the word "target" in the comment, since it could refer to either the DAG edge target or the triggerable's target tree reference.
The DAG edge is a set because there can't be duplicates.
I think I've gone through your latest comments in the PR and now I'm trying to focus exclusively on the changes regarding the mutually exclusive recursive methods to get all descendant refs from 73e6364 and 2bd11f6.

In order to start from a safe place, I'm reverting the related changes from 73e6364 (restore

Then, I've evaluated the move of those methods from

This should make it safe to replace the descendant target computation from

With this setup, my plan is to dissect
I've been playing with this test to understand how the check in FormInstanceParser.getDescendantRefs works:

```java
@Test
public void name2() {
    TreeElement root = new TreeElement("data");
    TreeElement group = new TreeElement("group");
    TreeElement field = new TreeElement("field");
    root.addChild(group);
    group.addChild(field);
    FormInstance formInstance = new FormInstance(root);

    TreeReference ref = TreeReference.rootRef()
        .extendRef("data", 0)
        .extendRef("group", 0);
    TreeElement templatePath = formInstance.getTemplatePath(ref);

    Set<TreeReference> descendantRefs = FormInstanceParser.getDescendantRefs(
        formInstance, new EvaluationContext(formInstance), ref);
    System.out.println(descendantRefs);
}
```

The breakthrough comes when declaring the

Unfortunately, there's no way the parser would reproduce this scenario when parsing a form's XML document. We can't describe in XML a group at multiplicity

These findings confirm our theory that the
Note that this method was originally declared in TriggerableDag
The methods are recursive, and will return all descendants, not only children
I have a couple of new questions above. Other than that, I have completed my review. It looks like a lot of things can now be private in
It doesn't make sense to have protected members because now no one's extending this class.
Thanks, @ggalmazor! Have you had a chance to compare benchmarks from before and after this change? It seems like there could be some positive performance implications that come with some of the simplifications and O(1) collection changes. After more interactive review, I feel more confident that it's unlikely that this will have significant negative memory implications. So @dcbriccetti, I think we're ok without additional review on this one. Thanks!
Closes #458
This PR is built on top of #526 to bring a review of the collection types involved in the DAG. The first commit is "Migrate Triggerable.targets to Set" (5d20c1a).

The main goal is to use the most appropriate collection type for the task at hand, e.g. sets when we don't want duplicates, and linked-list-backed collections when the order of elements is important.
Important topics:
Makes part of the DAG building algorithm immutable and side-effect free to make it easier to understand.
It's important that we understand what the memory use implications are and decide whether we want to undo some of these side-effect-free changes as a tradeoff for consuming less memory.
We could also study ways to keep everything easy to understand (immutable & side-effect free) while programming in a way that the garbage collector helps keep memory use in line.
Refactors the DAG building process to take advantage of specific collection types
The key point here is that we're able to keep the insertion order and avoid having to sort the DAG, which would bring performance improvements.
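A minimal sketch of that idea (the triggerable names are invented): if elements are inserted in dependency order while the DAG is built, a `LinkedHashSet` both deduplicates and preserves that order, so no separate sort pass is needed afterwards:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class NoSortNeeded {
    public static void main(String[] args) {
        // Hypothetical triggerables, inserted in dependency order during the build.
        Set<String> evaluationOrder = new LinkedHashSet<>();
        evaluationOrder.add("calculate-a");
        evaluationOrder.add("relevance-b"); // depends on a, so inserted after it
        evaluationOrder.add("calculate-a"); // re-discovered via another path: no-op

        // Iterating (or copying to a List) yields the order as inserted,
        // with duplicates already dropped, so there's nothing left to sort.
        List<String> asList = new ArrayList<>(evaluationOrder);
        System.out.println(asList); // prints [calculate-a, relevance-b]
    }
}
```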
What has been done to verify that this works as intended?
Added automated tests
Why is this the best possible solution? Were any other approaches considered?
I think this question doesn't apply. Maybe reviewers can ask specific questions.
How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?
This PR should have no behavior changes, although due to the changes to the DAG building code, some change propagation events might appear in a different order. This shouldn't be a problem because the triggerable chain is always guaranteed to run in the right order according to its dependencies.
Do we need any specific form for testing your changes? If so, please attach one.
No.
Does this change require updates to documentation? If so, please file an issue here and include the link below.
No.