Generate flame chart for performance analysis #895

sandcha · 2019-07-02T14:58:02Z

Connected to #877

New features

Introduce openfisca test myTest.yaml --performance
- Generates a flame chart in a web page to view the time taken by every calculation in a simulation

To test it:

Run a yaml test:
openfisca test tests/formulas/irpp.yaml --performance -c openfisca_france
Run a Python web server with:
python -m http.server 5000
See its result in your browser at http://localhost:5000

When your yaml file contains multiple tests, only the last one is displayed in the flame chart.
You can use the --name_filter option of openfisca test to select a specific test case.

alexsegura · 2019-07-03T16:11:38Z

Looks cool 🙂

What is the format of performance.json? Is it proprietary?
I mean, can it be consumed by another program, or can only the HTML file with D3 consume it?

Another implementation for the HTML rendering would be to communicate through WebSockets, so that the performance report is updated live! But it's more complicated.

Morendil · 2019-07-04T08:16:27Z

@alexsegura The format is quite straightforward, it's a JSON object representing a tree, with dictionary keys name, value and children (the latter recursively containing child nodes), as specified by the library we're using.
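To illustrate that another program can read this format, here is a minimal sketch of such a tree and a small walker. The variable names and timings are made up for illustration, and `flatten` is a hypothetical helper, not part of OpenFisca.

```python
import json

# Hypothetical tree: names and values are invented for illustration only.
performance_tree = {
    "name": "simulation",
    "value": 0.5,
    "children": [
        {
            "name": "income_tax<2019>",
            "value": 0.4,
            "children": [
                {"name": "salary<2019>", "value": 0.1, "children": []},
            ],
        },
    ],
}


def flatten(node, depth=0):
    """Yield (depth, name, value) for a node and all of its descendants."""
    yield depth, node["name"], node["value"]
    for child in node.get("children", []):
        yield from flatten(child, depth + 1)


# Any JSON-aware program can round-trip the tree and traverse it.
serialized = json.dumps(performance_tree)
rows = list(flatten(json.loads(serialized)))

for depth, name, value in rows:
    print("  " * depth + f"{name}: {value}")
```

Since the structure is plain JSON with three keys, nothing ties it to the HTML page apart from the key names the charting library expects.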

benjello · 2019-07-04T08:19:43Z

Is it possible to format the output for RunSnakeRun?

Morendil · 2019-07-04T09:31:54Z

@benjello No, RunSnakeRun takes cProfile files as input, and we don't know how to output that format.

bonjourmauko · 2019-07-04T10:31:40Z

😍

benjello · 2019-07-04T12:14:50Z

@Morendil there is also this other visualisation tool https://jiffyclub.github.io/snakeviz/

bonjourmauko · 2019-07-23T13:24:16Z

@Morendil I'd need some help for the review and rebase, great job though!

fpagnoux · 2019-07-24T20:55:03Z

Rebased

fpagnoux

That's great, I'm going to use this right now 🎉 🎉 🙂

fpagnoux · 2019-07-24T20:59:46Z

openfisca_core/simulations.py

@@ -106,13 +106,15 @@ def calculate(self, variable_name, period):
            period = periods.period(period)

        self.tracer.enter_calculation(variable_name, period)
+        self.tracer.record_start(time.time_ns() / (10**9))


Shouldn't the tracer call time.time_ns() rather than the simulation?

I'd say that's its responsibility, as it is in charge of measuring performance. That would also avoid needlessly calling time when we don't have a full tracer activated.

If record_start and enter_calculation are called at the exact same time, and have really similar names, is there really a point in making them 2 different methods?

I'm in favour of moving time.time_ns() to the tracer but:

we need to test how sub-calculations measure their durations (aka test_tracers.py::test_calculation_time_with_depth)

time precision / os clocks differ (so, a simulation on bulk data might need some other time unit) 🤔

record_start and enter_calculation have specific and different missions so I would keep them separated.

Besides, this is a first flame chart test based on timestamps, but the chart values might carry other performance information, such as the number of calls to the same variable, which would lead to a call to enter_calculation but no record_start (even if, I know, it's bad to bring a possible future into a PR 😬).

we need to test how sub-calculations measure their durations (aka test_tracers.py::test_calculation_time_with_depth)

In that case, can we make the timestamp an optional argument, so that:

the function by default uses the current timestamp

it can be overridden to use a specific timestamp

?
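The optional-timestamp idea could look roughly like the sketch below. `TimestampedTracer` is a hypothetical stand-in, not the actual OpenFisca tracer, and it uses `time.time()` only as an assumption for the default clock.

```python
import time
from typing import List, Optional


class TimestampedTracer:
    """Hypothetical stand-in used to illustrate an optional timestamp argument."""

    def __init__(self) -> None:
        self.starts: List[float] = []

    def record_start(self, timestamp: Optional[float] = None) -> None:
        # By default the tracer reads the clock itself...
        if timestamp is None:
            timestamp = time.time()
        # ...but a caller (typically a test) can inject a known value.
        self.starts.append(timestamp)


tracer = TimestampedTracer()
tracer.record_start(1500.0)  # deterministic value, convenient for tests
tracer.record_start()        # current time, as in normal use
```

With this shape, a test on nested calculation durations can pass fixed timestamps, while the simulation never has to touch the clock.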

record_start and enter_calculation have specific and different missions so I would keep them separated.

Keeping them separated as 2 different functions in the tracers module makes total sense 👍 .

But do they really have different missions from the simulation's point of view? The fact that the simulation needs to tell the tracer to first enter a calculation, then to record the time, feels to me like the simulation is telling the tracer how to do its job. And this makes me feel like it's an SRP violation.

I also find that the names read very similar at first sight:

tracer.record_start : Record that something has started. Probably a calculation?

tracer.enter_calculation: Make the tracer enter a calculation. So tell the tracer a calculation has started, how is this different?

I don't really understand how capturing more information would lead to calling enter_calculation but not record_start, since we decided to call record_start in all cases, even if we are just reading from the cache.

This is not a hill I'm willing to die on, but I think we could have a single exposed method explicitly called record_calculation_start that would then call private methods to both record_start_time and enter_calculation.
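As a rough sketch of that suggestion, a single public method can delegate to two private helpers. `SingleEntryTracer`, its attributes and the use of `time.perf_counter()` are assumptions made for illustration, not the code in this PR.

```python
import time
from typing import Dict, List, Tuple

Key = Tuple[str, str]


class SingleEntryTracer:
    """Hypothetical tracer exposing one entry point to the simulation."""

    def __init__(self) -> None:
        self.stack: List[Key] = []
        self.start_times: Dict[Key, float] = {}

    def record_calculation_start(self, variable_name: str, period: str) -> None:
        # The simulation only says "a calculation starts"; the tracer
        # decides what bookkeeping that implies.
        self._enter_calculation(variable_name, period)
        self._record_start_time()

    def _enter_calculation(self, variable_name: str, period: str) -> None:
        self.stack.append((variable_name, period))

    def _record_start_time(self) -> None:
        self.start_times[self.stack[-1]] = time.perf_counter()


single_entry_tracer = SingleEntryTracer()
single_entry_tracer.record_calculation_start("income_tax", "2019")
```

The two helpers stay separate inside the tracer module, so each keeps its own mission, while the simulation sees one call.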

I added a commit containing the 1st suggested change on this branch, and another commit addressing my 2nd remark on a separate branch, trace-performance-suggestion.

Feel free to adopt or drop the second one (I think it simplifies the tracer interface and keeps responsibilities more separated, but it's up to you).

fpagnoux · 2019-07-24T21:00:44Z

openfisca_core/simulations.py


        try:
            result = self._calculate(variable_name, period)
            self.tracer.record_calculation_result(result)
            return result

        finally:
+            self.tracer.record_end(time.time_ns() / (10**9))


Same 2 comments as for record_start


guillett · 2019-08-30T11:57:31Z

openfisca_core/tracers.py

 from collections import ChainMap
+import importlib.resources as pkg_resources


❗ This removes the 3.6 backward compatibility

New in version 3.7.
From https://docs.python.org/3/library/importlib.html#module-importlib.resources

Thanks @Morendil for suggesting pip install importlib_resources

However, importlib_resources still requires this line to be changed into

import importlib_resources as pkg_resources
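One common way to support both interpreters is a guarded import, sketched below. It assumes the importlib_resources backport is installed on Python 3.6; whether the project adopted this pattern is not stated in this thread.

```python
try:
    # Python 3.7+: the module ships with the standard library.
    import importlib.resources as pkg_resources
except ImportError:
    # Python 3.6: fall back to the backport (pip install importlib_resources).
    import importlib_resources as pkg_resources

# The rest of the module can use pkg_resources without caring which one loaded.
print(pkg_resources.__name__)
```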

guillett · 2019-09-03T09:54:16Z

openfisca_core/tracers.py

+        return PerformanceLog(self)
+
+    def _get_time_in_sec(self) -> float:
+        return time.time_ns() / (10**9)


❗ This is not 3.6 compatible

New in version 3.7.
From https://docs.python.org/3/library/time.html#time.time_ns
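A possible 3.6-compatible variant is sketched below. The function name mirrors the private helper in the diff, but the fallback itself is an assumption, not what was merged.

```python
import time


def get_time_in_sec() -> float:
    """Return a timestamp in seconds, using time.time_ns() when it exists."""
    if hasattr(time, "time_ns"):
        # Python 3.7+: integer nanoseconds converted to float seconds.
        return time.time_ns() / 10**9
    # Python 3.6: fall back to the float clock.
    return time.time()


timestamp = get_time_in_sec()
```

Since only durations matter for the flame chart, `time.perf_counter()`, available since Python 3.3, would be another option worth considering.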

sandcha requested review from fpagnoux, bonjourmauko, magemax, benjello and alexsegura July 2, 2019 15:37

sandcha mentioned this pull request Jul 3, 2019

Create a toolbox to check performance #877

Open

5 tasks

fpagnoux force-pushed the unify-tracing branch 2 times, most recently from 1723b70 to 791c448 Compare July 10, 2019 09:11

fpagnoux changed the base branch from unify-tracing to master July 10, 2019 14:59

bonjourmauko force-pushed the trace-performance branch from 8e928db to 3240f64 Compare July 23, 2019 13:21

bonjourmauko added the kind:feat A feature request, a feature deprecation label Jul 23, 2019

fpagnoux force-pushed the trace-performance branch 2 times, most recently from dc3901d to d325456 Compare July 24, 2019 20:54

fpagnoux force-pushed the trace-performance branch from d325456 to 11c2372 Compare July 24, 2019 21:39

fpagnoux requested changes Jul 24, 2019


Morendil reviewed Aug 20, 2019

tests/core/tools/test_runner/test_yaml_runner.py (outdated)

Morendil reviewed Aug 20, 2019

tests/core/tools/test_runner/test_yaml_runner.py (outdated)

sandcha and others added 23 commits August 22, 2019 19:14

Handle recursive call for performance log children

95bb233

Use high precision timer for profiling

3ba8547

Add performance option to openfisca test command

fc370b3

Improve performance test case

658fee8

Add tests for invocation of performance log

ca702b8

Move from performance print to json file

af8e34f

Add html file for performance flame graph

81546ba

Test index.html generation for performance analysis

059d778

Generate index.html alongside performance.json for easy viewing

30d5673

Set internal method to private

16798f0

Delete performance files after test

b1f66f8

Remove code examples from chart html page

4f7c890

Enable full tracer to generate performance graph

e4b060d

Move html page to assets directory

41b7424

Improve performance option description for openfisca test command

90333fb

Use variables for performance paths

00d17a7

Explicit method names

31dcc23

Move time measurement logic to tracer

24e1bbe

Improve graph labels and title

961b901

Simplify tracer interface

d4e1c31

Set record_calculation_start/end helpers to private

c18d61b

Use pythonic 'with' to open files

aeb1a47

Bump version number

193b7ef

sandcha force-pushed the trace-performance branch from f378921 to 193b7ef Compare August 22, 2019 17:16

sandcha merged commit 0cb6ef7 into master Aug 22, 2019

sandcha deleted the trace-performance branch August 22, 2019 17:18

guillett reviewed Aug 30, 2019


guillett reviewed Sep 3, 2019


bonjourmauko mentioned this pull request Aug 1, 2021

OpenFisca test --performance option doc missing openfisca/openfisca-doc#247

Closed
