Hatchet analysis crossvariant #298
Conversation
I’m struggling to understand the implementation of ExtractCommonSubtree. At least it had me reading the Hatchet documentation.
generic_exc_metrics = gf.exc_metrics
generic_inc_metrics = gf.inc_metrics
Maybe we can pass those to the final init directly?
Those metrics are new class variables copied over from the graphframe argument, and eventually used as arguments to initialize the superclass via super().__init__(). You can see that this initialization is all copies, sometimes explicit as in generic_dataframe=gf.dataframe.copy(). This allows me to create a new graphframe with a call like
gf1 = GenericFrame(ht.GraphFrame.from_caliperreader(f1))
What GenericFrame does is rename the root node to "Variant" and set all the attributes that make it a "Node". This allows Hatchet to compare completely different trees (since the root node itself is different), while assuming that the tree structure underneath the two trees is completely identical. If they are not identical, then we ExtractCommonSubtree so we can compare apples to apples.
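A minimal sketch of the root-renaming idea in plain Python (not the actual Hatchet API; trees here are nested dicts and the node names are made up for illustration):

```python
# Hypothetical sketch: two profiles whose roots differ only by variant name
# become directly comparable once each root is renamed to a common label.

def rename_root(tree, new_name="Variant"):
    """Return a copy of the tree with its single root node renamed."""
    (old_root,) = tree.keys()          # a profile tree has exactly one root
    return {new_name: tree[old_root]}

base_seq = {"Base_Seq": {"Basic_DAXPY": 1.0, "Basic_COPY": 2.0}}
raja_seq = {"RAJA_Seq": {"Basic_DAXPY": 1.2, "Basic_COPY": 2.1}}

# After renaming, the two trees share an identical root and structure,
# so a node-by-node comparison lines up.
assert rename_root(base_seq) == {"Variant": {"Basic_DAXPY": 1.0, "Basic_COPY": 2.0}}
assert rename_root(base_seq).keys() == rename_root(raja_seq).keys()
```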
generic_graph = gf.graph.copy()
generic_exc_metrics = gf.exc_metrics
generic_inc_metrics = gf.inc_metrics
generic_default_metric = gf.default_metric  # in newer Hatchet
Unused here.
No, they are used in the super call:
super().__init__(generic_graph, generic_dataframe, generic_exc_metrics, generic_inc_metrics)
Are you talking about the default metric? We don't use that in ExtractCommonSubtree anymore, since the metric can take one of two forms:
#metric = "sum#inclusive#sum#time.duration"
metric = "Min time/rank"
The first is a legacy metric name which, BTW, still exists within Caliper and is used in all of the timing calculations, while the second form is a human-readable alias processed as a first-class citizen within Hatchet. But we keep the default metric as a class variable to maintain a self-similar structure with the inherited GraphFrame.
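To illustrate the two metric forms, here is a hedged sketch of selecting whichever column a given recording exposes. The two metric strings come from the discussion above; the helper function and the column lists are hypothetical:

```python
# Hypothetical sketch: prefer the human-readable alias when present,
# otherwise fall back to the legacy Caliper metric name.

LEGACY = "sum#inclusive#sum#time.duration"   # legacy Caliper metric name
ALIAS = "Min time/rank"                      # human-readable alias in Hatchet

def pick_metric(columns):
    """Return the first known time-metric column found, alias first."""
    for candidate in (ALIAS, LEGACY):
        if candidate in columns:
            return candidate
    raise KeyError("no known time metric among dataframe columns")

assert pick_metric(["name", "Min time/rank"]) == ALIAS
assert pick_metric(["name", LEGACY]) == LEGACY
```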
ii = generic_dataframe.index[0]
# fr = ht.frame.Frame({'name': 'Variant', 'type': 'region'})
fr = ht.graphframe.Frame({'name': 'Variant', 'type': 'region'})
nn = ht.graphframe.Node(fr)
setattr(nn, '_hatchet_nid', ii._hatchet_nid)
setattr(nn, '_depth', ii._depth)
setattr(nn, 'children', ii.children)
Could you explain what this function is doing exactly? I get that it’s sort of re-initializing the dataframe, but how and why?
Explained above, but I should illustrate the rationale with actual code and output. Say I don't use GenericFrame and instead call Hatchet directly on Base_Seq.cali and RAJA_Seq.cali, like so:
#!/usr/bin/env python3
import hatchet as ht
gf1 = ht.GraphFrame.from_caliperreader("RAJA_Seq.cali")
gf2 = ht.GraphFrame.from_caliperreader("Base_Seq.cali")
gf3 = gf2/gf1
print(gf3.tree())
and the output (partial screenshots):
the left frame, i.e. gf2 (with red arrows indicating nodes that occur only in the left frame),
and the right frame gf1 (with green arrows indicating nodes only in the right frame, which also show up as NaNs).
Now add in GenericFrame like so
metric = "Min time/rank"
gf4 = GenericFrame(gf1)
gf5 = GenericFrame(gf2)
gf6 = gf5/gf4
print(gf6.tree(metric_column=metric))
with its screenshot showing the two variants are slightly different, with RAJA incurring more overhead as expected.
Success!
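The NaNs in the raw gf2/gf1 division can be reproduced conceptually in plain Python (not the Hatchet implementation; node names and times are made up): nodes present in only one frame have no counterpart to divide by.

```python
import math

def divide(left, right):
    """Element-wise ratio over the union of node names; NaN where a node is missing."""
    return {
        name: left[name] / right[name] if name in left and name in right else math.nan
        for name in left.keys() | right.keys()
    }

base = {"Base_Seq": 10.0, "Basic_DAXPY": 2.0}   # "Base_Seq" root only in the left frame
raja = {"RAJA_Seq": 11.0, "Basic_DAXPY": 2.5}   # "RAJA_Seq" root only in the right frame

ratios = divide(base, raja)
assert ratios["Basic_DAXPY"] == 0.8              # shared node: meaningful ratio
assert math.isnan(ratios["Base_Seq"])            # left-only node -> NaN
assert math.isnan(ratios["RAJA_Seq"])            # right-only node -> NaN
```

Renaming both roots to a common "Variant" label, as GenericFrame does, removes the root mismatch so the shared subtree divides cleanly.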
if nn._depth == 3:
    if common_subtree.dataframe.loc[nn, metric] < m3:
        m3 = common_subtree.dataframe.loc[nn, metric]
I am not able to extract the purpose of this.
"Propagates the minimum tuning of a set of tunings when extracting common subtrees"
It looks like nodes with depth > 3 are ignored.
But then, a post-order traversal will go through all the level-3 nodes of a given level-2 node. During this phase, m3 will be set to the minimal value encountered. What does "metric" (or tuning) represent, and why extract the minimum?
The rest appears to be accumulating values from higher to lower levels.
Looking at the screenshot above, the tunings for Algorithm_MEMCPY are default and library. When we compare different subtrees, the leaf nodes will have different names for the tunings of the respective variant, so the best value of the tuning set, which is the minimum time, is propagated up the tree. A direct comparison doesn't make sense when comparing RAJA_Seq to RAJA_CUDA, where the CUDA algorithms could be run with a bunch of different block sizes (so lots of tunings) and Seq only has default. Which one to compare then? We prefer to propagate the minimum in both trees. Otherwise the trees would be too dissimilar, Hatchet would again give red/green arrows indicating the trees are too different at the subtree, and nonsense would be propagated. In Caliper, an algorithm.tuning is the actual algorithm getting timed. Each algorithm.tuning is really an independent algorithm with a new name, which we designate as algorithm.tuning.
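A minimal sketch of this min-of-tunings collapse in plain Python (not the script's actual traversal; algorithm names, tuning names, and times are made up):

```python
# Hypothetical sketch: collapse each algorithm's tunings to the best
# (minimum-time) tuning so variants with different tuning sets stay comparable.

def best_tunings(variant):
    """Map each algorithm to the minimum time over its tunings."""
    return {algo: min(tunings.values()) for algo, tunings in variant.items()}

raja_cuda = {"Algorithm_MEMCPY": {"default": 3.0, "block_128": 2.0, "block_256": 1.5}}
base_seq  = {"Algorithm_MEMCPY": {"default": 4.0}}

# Both variants now expose one number per algorithm, regardless of how many
# tunings each one ran, so the trees line up node for node.
assert best_tunings(raja_cuda) == {"Algorithm_MEMCPY": 1.5}
assert best_tunings(base_seq) == {"Algorithm_MEMCPY": 4.0}
```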
s2 = m3
s1 += s2
common_subtree.dataframe.loc[nn, metric] = s2
m3 = sys.float_info.max
If there is a level 2 node with no child, it looks like this node will get s2 = sys.float_info.max.
I must be missing something.
No, we're just resetting m3 for the next pass. When you do a reduction (i.e. for a minimum), you initialize the value to float max, so the next value seen must be less than that, and subsequent values are compared against the new minimum. When we arrive at nn._depth == 2, we're done with level 3 for that algorithm, so m3 gets reset. Since we rejected some tunings that were not the minimum, we now have to redo all of the timing up the tree until we get to the root. Everything that is not at level 3 is summed as it would be originally.
What you're seeing at s2 = m3 is the result of the tree traversal at the prior level: the traversal visits all of the level-3 nodes before the depth reaches level 2, so s2 will be set to the minimum value from level 3, not float max. The Caliper tree structure is shown in the screenshot above.
Every algorithm has at least one tuning, usually default, so there is no childless node at level 2.
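The reset pattern can be sketched in isolation as a plain-Python post-order walk (not the script's actual code; the depth/time pairs are made up). Because post-order visits all level-3 tuning nodes of an algorithm before the level-2 algorithm node itself, m3 holds the tunings' minimum when the algorithm node is reached, and is then reset for the next algorithm:

```python
import sys

# (depth, time) pairs in post-order: tunings first, then their algorithm node.
postorder = [(3, 5.0), (3, 2.0), (2, None), (3, 7.0), (2, None)]

m3 = sys.float_info.max          # reduction identity for a minimum
algorithm_times = []
for depth, time in postorder:
    if depth == 3:
        if time < m3:
            m3 = time            # track the minimum tuning at level 3
    elif depth == 2:
        algorithm_times.append(m3)   # all tuning children already visited
        m3 = sys.float_info.max      # reset for the next algorithm's tunings

assert algorithm_times == [2.0, 7.0]
```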
Also, in general there are no node levels > 3, since the tree structure stops at level 3. Technically, however, there could be a case where MPI barrier timing is injected via an additional service setup, and we could possibly see column metrics at level 4, which we currently don't care about. They'll be summed into the tuning at level 3 by Caliper anyway, and are usually minuscule in the grand scheme of things.
Thank you @jonesholger for the detailed explanation.
In fact, I think your comments would make a great piece of documentation for this script. The script will be read by developers wondering why the associated CI job failed, but it requires some knowledge of both the Caliper three-level output and Hatchet tree structure management, which makes it not so easy to decipher.
Note that I haven’t been able to access LC GitLab lately to investigate the failures.
Thank you @adrienbernede for your comments regarding providing documentation for the script. I should take this to heart, but also consider adding some of this to the RAJAPerf tutorial that I'm currently working on. I know you're super busy, but you are becoming quite adept at RAJAPerf, and I was hoping to get you to review my tutorial at some point when I get most everything fleshed out. Do you have or use Docker in a "local" context? If not, I'm considering putting it all on BinderHub, which will launch JupyterLab automatically with the entire environment ready to go. In addition, I'm lobbying to get Hatchet improved so we don't have to use these support modules in order to compare across variants, but that is a much longer conversation. There is also the counter-argument that I should change RAJAPerf-Caliper to fit modern Hatchet, but that would increase the internal complexity and decrease the robustness of the Caliper implementation, which lives in the Base class, so I'm resisting. Besides, I think this sort of scripting is best left to Python vs. C++.
Allows cross-variant comparisons such as RAJA_OpenMP against Base_OpenMP.