-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimise time taken to get raw graph #2312
Optimise time taken to get raw graph #2312
Conversation
2e8b527
to
08c0c3a
Compare
The main change is the restructure of the variable `SuiteConfig().edges`. It is now a dict. Each key is a unique sequence, and each value is a set of `(left, right, suicide, conditional)` tuples. This change allows `get_graph_raw` to look at each unique sequence once instead of multiple times. There is also caching at various points to reduce the number of expensive calls to regular expression matching and date-time functionality. Other minor changes include: * Remove unused variables. * Use compiled regular expressions where possible. * Reduce usage of complex regular expressions. * Remove edge and graphnode objects, which were not cheap to initialise. * New GraphNodeParser object to manage and cache graph node parsing. * The `recursive_replace` internal function is replaced with a loop at a level up.
08c0c3a
to
74d946e
Compare
add7f5b
to
7162266
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, no problems found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Some comments.
@@ -955,7 +957,7 @@ def compute_inheritance(self, use_simple_method=True): | |||
OUT.info("Parsing the runtime namespace hierarchy") | |||
|
|||
results = OrderedDictWithDefaults() | |||
n_reps = 0 | |||
# n_reps = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These lines can be removed now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I think these were being kept to support a commented-out comparison between alternative methods, that we should look at some time...)
lib/cylc/config.py
Outdated
|
||
expr_list = recursive_replace(expr_list, left, task_trigger) | ||
stack = [expr_list] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth adding a comment here to explain what this block is doing.
@@ -721,6 +722,7 @@ def __init__(self, suite, fpath, template_vars=None, | |||
# "foo:fail => bar => !foo" looks like "foo => bar => foo"). | |||
graph = self.get_graph(ungroup_all=True, ignore_suicide=True, | |||
is_validate=True) | |||
GraphNodeParser.get_inst().clear() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice.
lib/cylc/config.py
Outdated
if lnode.offset_is_irregular: | ||
offset_tuple = (lnode.offset_string, seq) | ||
(str(first_point - last_point), seq)) | ||
elif offset_is_from_icp: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if offset_is_from_icp:
...
elif offset_is_from_icp:
...
Has the is_absolute
logic been preserved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the old logic, is_absolute
is always set to the same value as offset_is_from_ict
. In the new logic, we have unified the variables as offset_is_from_icp
.
@@ -43,7 +43,7 @@ cmp_ok "${SUITE_RUN_DIR}/ctb-get-graph-raw.out" <<'__OUT__' | |||
], | |||
[ | |||
"t1.1", | |||
null, | |||
"T.1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the consequence of this change in order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The old logic was returning a list by using .values()
method of a dict. The new logic sorts the list to fix the ordering.
lib/cylc/graphnode.py
Outdated
self._offsets = {} | ||
self._nodes = {} | ||
self._offsets.clear() | ||
self._nodes.clear() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two fewer W0612
messages.
Reduce validation time. This is part of the work to look at why it would take 30 minutes to validate a user suite (
cylc validate --strict
). This change reduces the validation time of the suite to10 minutes6 minutes4 minutes. Still not good enough, but moving towards the right direction.The main change is the restructure of the variable
self.edges
. It is now adict
. Each key is a unique sequence, and each value is aset
of(left, right, suicide, conditional)
tuples. This change allowsget_graph_raw
to look at each unique sequence once instead of multiple times. There is also caching at various points to reduce the number of expensive calls to regular expression matching and date-time functionality.Other minor changes include:
edge
objects, which were not cheap to initialise.graphnode
objects with aGraphNodeParser
singleton, which is able to cache of graph node parsing results.recursive_replace
function inSuiteConfig.generate_triggers
with awhile stack
loop at a level above the original loop.