Improve collection types #523

ggalmazor · 2019-12-18T12:02:01Z

Closes #458

This PR builds on #526 and reviews the collection types involved in the DAG. The first commit is "Migrate Triggerable.targets to Set" (5d20c1a).

The main goal is to use the most appropriate collection type for the task at hand, e.g. sets when we don't want duplicates, and linked-list-backed collections when the order of elements matters.
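To make the trade-off concrete, here is a minimal sketch (not code from this PR) of what a linked, set-based collection gives us: duplicates are dropped and the first-seen order is kept. The class name and the sample values are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class CollectionChoiceSketch {
    /** Returns the distinct names in first-seen order, relying on LinkedHashSet semantics. */
    public static List<String> distinctInInsertionOrder(List<String> names) {
        // A LinkedHashSet ignores repeated elements and iterates in insertion order.
        Set<String> ordered = new LinkedHashSet<>(names);
        return new ArrayList<>(ordered);
    }

    public static void main(String[] args) {
        // Hypothetical triggerable names; "calc-b" appears twice on purpose.
        List<String> triggered = Arrays.asList("calc-b", "calc-a", "calc-b", "calc-c");
        System.out.println(distinctInInsertionOrder(triggered)); // prints [calc-b, calc-a, calc-c]
    }
}
```

With a plain list we would need an explicit contains check before every add to get the same result.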

Important topics:

Makes part of the DAG building algorithm immutable and side-effect free to make it easier to understand.

It's important that we understand the memory-use implications and decide whether we want to undo some of these changes as a tradeoff for consuming less memory.

We could also study ways to keep everything easy to understand (immutable & side-effect free) while programming in a way that the garbage collector helps keep memory use in line.

Refactors the DAG building process to take advantage of specific collection types.

The key point here is that we're able to keep the insertion order and avoid having to sort the DAG, which would bring performance improvements.
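As an illustration of why insertion order can stand in for sorting, the sketch below picks a subset of nodes out of an already ordered collection. It assumes the DAG was filled with dependencies before dependents; the class, method, and node names are hypothetical and this is not the actual TriggerableDag code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

class OrderedSubsetSketch {
    /**
     * Picks the affected nodes out of an already ordered DAG.
     * Iterating the LinkedHashSet visits nodes in insertion order,
     * so the result needs no comparator-based sort afterwards.
     */
    public static Set<String> affectedInDagOrder(Set<String> orderedDag, Set<String> affected) {
        Set<String> result = new LinkedHashSet<>();
        for (String node : orderedDag) {
            if (affected.contains(node)) {
                result.add(node);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical DAG, inserted with dependencies before dependents.
        Set<String> dag = new LinkedHashSet<>(Arrays.asList("a", "b", "c", "d"));
        Set<String> affected = new HashSet<>(Arrays.asList("d", "b"));
        System.out.println(affectedInDagOrder(dag, affected)); // prints [b, d]
    }
}
```

The lookup side uses a hash-based set, so each membership check is constant time on average and the whole pass is linear in the size of the DAG.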

What has been done to verify that this works as intended?

Added automated tests

Why is this the best possible solution? Were any other approaches considered?

I think this question doesn't apply. Maybe reviewers can ask specific questions.

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

This PR should not change behavior, although due to the changes to the DAG building code, some change propagation events might appear in a different order. This shouldn't be a problem because each triggerable chain is always guaranteed to be evaluated in the right order, according to whether its triggerables have dependencies or not.

Do we need any specific form for testing your changes? If so, please attach one.

No.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

No.

codecov-io · 2019-12-20T08:08:06Z

Codecov Report

Merging #523 into master will decrease coverage by 0.24%.
The diff coverage is 87.87%.

@@             Coverage Diff              @@
##             master     #523      +/-   ##
============================================
- Coverage     53.11%   52.87%   -0.25%     
+ Complexity     3127     3104      -23     
============================================
  Files           245      245              
  Lines         13374    13340      -34     
  Branches       2573     2566       -7     
============================================
- Hits           7104     7053      -51     
- Misses         5443     5467      +24     
+ Partials        827      820       -7

Impacted Files	Coverage Δ	Complexity Δ
...org/javarosa/core/model/condition/Recalculate.java	`70% <ø> (-10%)`	`9 <0> (-2)`
...a/org/javarosa/core/model/condition/Condition.java	`32.43% <ø> (-5.41%)`	`9 <0> (-2)`
...a/xform/parse/StandardBindAttributesProcessor.java	`80.21% <100%> (-1.6%)`	`25 <0> (ø)`
src/main/java/org/javarosa/core/model/FormDef.java	`69.65% <100%> (ø)`	`155 <1> (ø)`	⬇️
...java/org/javarosa/core/model/QuickTriggerable.java	`70.37% <100%> (-18.52%)`	`14 <1> (-6)`
...org/javarosa/core/model/condition/Triggerable.java	`81.25% <100%> (-1.47%)`	`22 <4> (-2)`
...a/org/javarosa/xform/parse/FormInstanceParser.java	`70.94% <82.14%> (-1.1%)`	`114 <4> (+2)`
...n/java/org/javarosa/core/model/TriggerableDag.java	`71.81% <90.62%> (+0.57%)`	`74 <26> (-6)`	⬇️
...avarosa/core/model/QuickTriggerableComparator.java	`0% <0%> (-73.34%)`	`0% <0%> (-5%)`
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 02fb9e0...ac646eb. Read the comment docs.

ggalmazor · 2019-12-21T08:10:22Z

Christmas is here and I know it's going to be hard to push this one forward, especially since it requires some degree of agreement about the correctness and completeness of the DAG test cases.

Since I'd like to continue working on the codebase on top of this PR, it would be great to at least have a sanity check of the code changes themselves, focusing on how memory usage has changed compared to the current implementation, which reuses objects by passing them as input args to methods and mutating them (side-effects) inside those methods.

If we determine that JR would now use more memory than before, it would be great to find a way to maintain the immutable & side-effect free implementation in this PR and, only if this is not possible, change it back to a side-effect based, object-reusing style.
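To frame the memory question, here is a minimal sketch of the two styles being compared, using a made-up tree of node names rather than the real JavaRosa types. The first method mutates a set passed in by the caller; the second returns a fresh set from every call, which is easier to follow but allocates one short-lived collection per recursive step.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SideEffectFreeSketch {
    // Hypothetical adjacency map: node name -> direct children.
    static final Map<String, List<String>> CHILDREN = new HashMap<>();
    static {
        CHILDREN.put("root", Arrays.asList("group", "note"));
        CHILDREN.put("group", Arrays.asList("field"));
    }

    static List<String> childrenOf(String node) {
        List<String> children = CHILDREN.get(node);
        return children == null ? Collections.<String>emptyList() : children;
    }

    /** Output-argument style: the caller's set is mutated at every recursive step. */
    public static void collectDescendants(String node, Set<String> out) {
        for (String child : childrenOf(node)) {
            out.add(child);
            collectDescendants(child, out);
        }
    }

    /** Side-effect free style: each call builds and returns its own result. */
    public static Set<String> descendantsOf(String node) {
        Set<String> result = new LinkedHashSet<>();
        for (String child : childrenOf(node)) {
            result.add(child);
            result.addAll(descendantsOf(child));
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> mutated = new LinkedHashSet<>();
        collectDescendants("root", mutated);
        // Both styles produce the same descendants in the same order.
        System.out.println(mutated.equals(descendantsOf("root"))); // prints true
    }
}
```

The intermediate sets in the second style become unreachable as soon as they are merged into the caller's result, so they are candidates for garbage collection right away.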

@dcbriccetti, I was hoping that you could help us with this ;)

ggalmazor · 2020-01-08T12:13:59Z

I've rebased the branch of this PR on top of #526 to make it more concise. I've also taken the chance to divide some commits into smaller chunks to make them easier to review and give more context.

ggalmazor · 2020-01-08T12:19:27Z

I've also checked that no object cloning is done during the DAG build process, which supports our current understanding that the immutable, side-effect free code in this PR shouldn't noticeably increase memory usage.

Looking forward to @dcbriccetti's take on this ;)

lognaturel

Impressive work.

Overall, I am convinced that this is safe and that the changes are easy to follow. After our conversation about the impact of creating additional collections at each recursive step, I am convinced that the memory impact will be minimal. The garbage collector will presumably be working harder but hopefully that wouldn't have much impact even on underpowered devices.

There are two commits I'd like to spend a little more time with when I have a fresh head: 73e6364 and 86e2745. I still wanted to get you some comments in the meantime.

ggalmazor · 2020-01-09T10:10:02Z

I reworded the commit messages to fit the subject under 72 chars and I didn't realize that the links to commits you were hoping to study better would change as a result. Sorry, @lognaturel!

73e6364 is still there, but the one about avoiding hitting the main instance is now 2bd11f6.

The inlined method was confusing and misleading because the nested loop didn't make much sense, and because the naming suggested that the method would return all descendants recursively, which was neither needed nor actually happening. The naming wasn't helping to understand what was going on either. By using language closer to DAG terminology, I hope it's now easier to understand what the buildDagEdges method does.

ggalmazor · 2020-01-10T17:10:32Z

We need a test that explores deeply nested computations and repeat groups before 2bd11f6

ggalmazor · 2020-01-13T09:38:14Z

I think I've gone through your latest comments in the PR and now I'm trying to focus exclusively on the changes regarding the mutually exclusive recursive methods to get all descendant refs from 73e6364 and 2bd11f6

In order to start from a safe place, I'm reverting the related changes from 73e6364 (restore getChildrenOfReference, and getChildrenRefsOfElement from 73e6364)

Then, I've evaluated the move of those methods from TriggerableDag to FormInstanceParser, and I've assessed that it's safe to keep it because we can establish that they will only get called when the triggerable can cascade which, in turn, is only true for groups with a relevance condition. The if (triggerable.isCascadingToChildren()) check made sure of that in the original code at TriggerableDag.getDependantTriggerables.

This should make it safe to move the descendant target computation from TriggerableDag.getDependantTriggerables to FormInstanceParser.applyInstanceProperties. If you're not confident about this, we should make it a target for new tests.

With this setup, my plan is to dissect FormInstance.getTemplatePath and write tests that exercise both forks of the if block at FormInstanceParser.getDescendantRefs.

ggalmazor · 2020-01-14T10:31:04Z

I've been playing with this test to understand how the check in FormInstanceParser.getDescendantRefs works:

@Test
public void name2() {
    TreeElement root = new TreeElement("data");
    TreeElement group = new TreeElement("group");
    TreeElement field = new TreeElement("field");
    root.addChild(group);
    group.addChild(field);
    FormInstance formInstance = new FormInstance(root);

    TreeReference ref = TreeReference.rootRef()
        .extendRef("data", 0)
        .extendRef("group", 0);
    TreeElement templatePath = formInstance.getTemplatePath(ref);

    Set<TreeReference> descendantRefs = FormInstanceParser.getDescendantRefs(formInstance, new EvaluationContext(formInstance), ref);
    System.out.println(descendantRefs);
}

The breakthrough comes when declaring the group element to have a multiplicity of 1, which forces the getDescendantRefs method to follow the else path in the check, which uses the EvaluationContext to expand the tree reference and get the corresponding elements.

Unfortunately, there's no way the parser would reproduce this scenario when parsing a form's XML document. We can't describe in XML a group at multiplicity 1 without declaring another group at multiplicity 0. This is key because the check at getDescendantRefs will fall back to the zeroth group when resolving mainInstance.getTemplatePath(original) if no group with jr:template="" is declared.

These findings confirm our theory that the else branch is not really needed because, even though we can artificially exercise it from our tests by manipulating the contents of the instance, no form can actually recreate that scenario.

lognaturel · 2020-01-16T17:55:10Z

I have a couple of new questions above. Other than that, I have completed my review. It looks like a lot of things can now be private in TriggerableDag. Worth doing here?

lognaturel · 2020-01-16T19:52:59Z

Thanks, @ggalmazor! Have you had a chance to compare benchmarks from before and after this change? It seems like there could be some positive performance implications that come with some of the simplifications and O(1) collection changes.

After more interactive review, I feel more confident that it's unlikely that this will have significant negative memory implications. So @dcbriccetti, I think we're ok without additional review on this one. Thanks!

ggalmazor requested a review from lognaturel December 18, 2019 12:04

ggalmazor marked this pull request as ready for review December 20, 2019 07:58

This was referenced Dec 20, 2019

Issue 524 remove redundant dag cycle check #526

Merged

Issue 522 triggerables triggered twice on new repeat #525

Merged

ggalmazor requested a review from dcbriccetti December 21, 2019 08:03

ggalmazor mentioned this pull request Jan 8, 2020

Improve getting dependant triggerables #530

Closed

ggalmazor added 16 commits January 8, 2020 18:59

Migrate Triggerable.targets to Set

04bf1bd

Migrate TriggerableDag.triggerablesDAG to a LinkedHashSet

ed1bae4

This removes the need for duplicate checking and makes explicit that this set is dependant on (insertion) order.

Migrate TriggerableDag.triggerIndex to a map of sets

beb274b

Migrate TriggerableDag.unorderedTriggers to a Set

de6efcf

Migrate to immutable and side-effect free code

883ec3b

We need the code to be easier to understand when it comes to recursive algorithms. Mutation of input arguments makes it super hard, and this commit tries to solve that by making those methods return something instead.

Migrate to immutable and side-effect free code - Phase 2

9b0f5a5

It turns out that the newDestinationSet variable was redundant and it looks like the real output of the method is the deps set (to be removed in the next commit)

Migrate to immutable and side-effect free code - Phase 3

4a75184

Remove the output arg. The block where elements get added can be further simplified using Map.getOrDefault or the alternative code that is compatible with our target Android API level (next commit)
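For context, a minimal sketch of the two ways of adding an element to a map of sets that this commit message alludes to. The first relies on Map.getOrDefault, a Java 8 default method that older Android API levels do not provide; the second uses only Map.get and Map.put. The class, method, key, and value names are hypothetical and not taken from the PR.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class MapOfSetsSketch {
    /** Variant built on Map.getOrDefault (a Java 8 default method). */
    public static void addWithGetOrDefault(Map<String, Set<String>> index, String key, String value) {
        Set<String> values = index.getOrDefault(key, new LinkedHashSet<String>());
        values.add(value);
        index.put(key, values);
    }

    /** Equivalent variant that only uses Map.get and Map.put. */
    public static void addWithNullCheck(Map<String, Set<String>> index, String key, String value) {
        Set<String> values = index.get(key);
        if (values == null) {
            values = new LinkedHashSet<>();
            index.put(key, values);
        }
        values.add(value);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> a = new HashMap<>();
        Map<String, Set<String>> b = new HashMap<>();
        // Hypothetical trigger reference and triggerable names.
        addWithGetOrDefault(a, "/data/age", "relevance-1");
        addWithGetOrDefault(a, "/data/age", "calculate-2");
        addWithNullCheck(b, "/data/age", "relevance-1");
        addWithNullCheck(b, "/data/age", "calculate-2");
        System.out.println(a.equals(b)); // prints true
    }
}
```

The null-check variant also avoids allocating a throwaway default set when the key is already present.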

Migrate to immutable and side-effect free code - Phase 4

e671744

Simplify adding elements into the result set

Simple rearrange

acd8d54

Add TODO note

49a596b

Improve naming and extract methods to reduce duplication of concepts

73e6364

Extract inner loop in cascade outside of the main loop

eaa0f81

Now the algorithm is easier to understand with no overhead (same number of iterations in the inner loop)

Inline for more concise code

1f7d147

Separate computing the DAG from assigning it

aaf6114

We want to extract some methods to structure the process a bit more

Extract method that builds the DAG

0325420

Simplify assigning a new DAG

d996178

There's no need for clearing the collection if we're reassigning it

lognaturel reviewed Jan 9, 2020

src/main/java/org/javarosa/core/model/TriggerableDag.java (resolved)

lognaturel reviewed Jan 9, 2020

src/main/java/org/javarosa/core/model/TriggerableDag.java (outdated, resolved)

Pull up computing the edges to avoid passing through dependencies

ccc0087

As a result, some methods can be made static. Also review visibility and set it to private where possible

lognaturel reviewed Jan 9, 2020

src/main/java/org/javarosa/core/model/TriggerableDag.java (resolved)

ggalmazor added 2 commits January 10, 2020 11:42

Add doc blocks for TriggerableDag members. Wrap at 80 chars

150a469

lognaturel reviewed Jan 10, 2020

src/main/java/org/javarosa/core/model/TriggerableDag.java (outdated, resolved)

lognaturel reviewed Jan 10, 2020

src/main/java/org/javarosa/core/model/TriggerableDag.java (outdated, resolved)

lognaturel reviewed Jan 10, 2020

src/main/java/org/javarosa/core/model/TriggerableDag.java (outdated, resolved)

lognaturel reviewed Jan 10, 2020

src/main/java/org/javarosa/core/model/TriggerableDag.java (outdated, resolved)

ggalmazor added 6 commits January 13, 2020 09:49

Remove redundant check

11537bf

If source can't cascade, then the targets set will be empty.

Improve naming for less confusion

6e9aba9

Make the getDagEdges method non-static to avoid confusion by arg names

1f31f7d

Also improve doc blocks

Improve comment by using unambiguous language

a942f47

Avoid confusion by contextualizing the word "target" in the comment, which can be related to the DAG edge target and the triggerable target tree reference.

Change the collection type to set for conformity

3cfecc1

The DAG edge is a set because there can't be duplicates.

Add end-to-end test to verify that the DAG is sorted

f764f82

ggalmazor added 2 commits January 14, 2020 18:08

Revert getChildrenOfReference method to the original implementation

0a25712

Note that this method was originally declared in TriggerableDag

Rename to avoid misguiding use of "children"

ac646eb

The methods are recursive, and will return all descendants, not only children

lognaturel reviewed Jan 15, 2020

src/main/java/org/javarosa/core/model/TriggerableDag.java (resolved)

lognaturel reviewed Jan 16, 2020

src/main/java/org/javarosa/xform/parse/FormInstanceParser.java (outdated, resolved)

lognaturel reviewed Jan 16, 2020

src/main/java/org/javarosa/xform/parse/FormInstanceParser.java (resolved)

ggalmazor added 3 commits January 16, 2020 20:38

Remove unnecessary comment

757ea20

Add TO DO notes to remember code exploration insights and questions

0fa1549

Make private all the protected members

a3b7a18

It doesn't make sense to have protected members because now no one's extending this class.

lognaturel approved these changes Jan 16, 2020

lognaturel merged commit bf184a7 into getodk:master Jan 16, 2020

ggalmazor deleted the improve_collection_types branch February 5, 2020 08:47
