Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise cycling objects #2322

Merged
merged 7 commits into from
Jun 13, 2017
Merged

Conversation

oliver-sanders
Copy link
Member

Efficiency changes which reduce the memory and CPU impact of cycling objects:

  • Avoid creating multiple parser instances.
  • Move regex compilation out of the __init__ method.
  • Slot classes.
  • Minor fiddling with exclusions objects (@dvalters could you check these changes).

In combination with slotting of the isodatetime library (metomi/isodatetime#74) this reduces the resource requirements of suites with multiple sequences defined in the graphing.

The following results were obtained running cylc validate over the following suite varying the value of sequences:

[scheduling]
    initial cycle point = 2000
    spawn to max active cycle points = true
    max active cycle points = 5
    [[dependencies]]
{% for seq_no in range(sequences | default(10) | int) %}
    {% set minutes = seq_no % 60 %}
    {% set hours = ((seq_no / 60) | int) % 60 %}
        [[[R{{cycles | default(1)}}/+PT{{hours}}H{{minutes}}M]]]
            graph = task
{% endfor %}

memory
cpu-time

I'll include this suite in the standard profile-battery tests in a future pull.
The impact on the runtime of the same suite:

memory
cpu-time

The complex suite shows a small reduction in memory.

@oliver-sanders oliver-sanders added the efficiency For notable efficiency improvements label Jun 9, 2017
@oliver-sanders oliver-sanders added this to the next release milestone Jun 9, 2017
@oliver-sanders oliver-sanders self-assigned this Jun 9, 2017
Copy link
Contributor

@matthewrmshin matthewrmshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@hjoliver
Copy link
Member

Nice result. Definitely include your new test suite in the profile-battery.

Copy link
Member

@hjoliver hjoliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just one question (out of curiosity).

def __init__(self, excl_points, start_point, end_point=None):
super(ISO8601Exclusions, self).__init__(start_point, end_point)
self.p_iso_exclusions = set()
self.p_iso_exclusions = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale behind this change? I would have thought set was better, with the need for membership testing below...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(this comment was meant to apply to self.exclusion_points)

Copy link
Member

@hjoliver hjoliver Jun 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, from a bit of googling, I guess set is implemented rather like dict, hence much higher memory consumption than list (makes sense)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Answered my own question, will merge ...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I changed this is because an empty set is much larger than an empty list:

Size in bytes for collections of None elements:

elements list set
0 72 232
1 96 248
2 104 248
3 112 248
4 120 248
5 128 248
6 136 248
7 144 248
8 152 248
9 160 248

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suite doesn't test effect on performance when exclusions are used, unless I'm missing something? (Although it tests 'empty'/no exclusions, which I suppose would be majority of the use cases?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no exclusions, which I suppose would be majority of the use cases

The vast majority of sequence objects will not have exclusions. In these cases we have sequence.exclusions = [] so the memory impact is minimal and p_iso_exclusions doesn't even come into it.

The reason for changing to a set was to make exclusion objects a little slimmer - it's pretty marginal I'll admit but many of these changes are, a few bytes saved here, a few there.

@hjoliver hjoliver merged commit e0f4cdf into cylc:master Jun 13, 2017
@oliver-sanders oliver-sanders deleted the optimise-iso8601-objects branch December 14, 2017 12:06
@oliver-sanders oliver-sanders mentioned this pull request Aug 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
efficiency For notable efficiency improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants