Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memray memory profiler support for mrun command line tool #794

Merged
merged 18 commits into from
Jun 13, 2023

Conversation

tsmathis
Copy link
Collaborator

@tsmathis tsmathis commented Jun 7, 2023

This PR adds an additional optional flag to mrun: --memray, or alternatively -m, to enable memory profiling of a builder using the Memray Python memory profiling tool (Memray repo).

The profiler supports profiling of both single and forked processes. For example, spawning multiple processes in mrun with -n will signal the profiler to track any forked child process spawned from the parent process.

A basic invocation from a terminal would look something like mrun --memray on builder_dump_file.json. --memray is a Click boolean parameter (Click) and thus accepts the values 1, yes, y, on, and true, which convert to True, and 0, no, n, off, and false, which convert to False.

The output .bin file produced by Memray is dumped by default into the user's temp directory, which is platform/OS dependent. For Linux/MacOS this will be /tmp/ and for Windows the target directory will be C:\TEMP\.

The output file will have a generic naming pattern as follows: BUILDER_NAME_PASSED_TO_MRUN + BUILDER_START_DATETIME_ISO.bin, e.g., builder_dump.json_2023-06-06T16.14.19.272030.bin.

Following completion, the .bin output file can be converted into a flamegraph using the Memray CLI, i.e, memray flamegraph builder_dump.bin.

Further visualization and data transform examples can be found in Memray's documentation (Memray reporters).

@tsmathis tsmathis marked this pull request as ready for review June 7, 2023 02:36
@tsmathis tsmathis requested a review from munrojm June 7, 2023 16:04
@codecov
Copy link

codecov bot commented Jun 7, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.08 🎉

Comparison is base (ba7d1fd) 88.12% compared to head (50cef25) 88.20%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #794      +/-   ##
==========================================
+ Coverage   88.12%   88.20%   +0.08%     
==========================================
  Files          44       44              
  Lines        3562     3587      +25     
==========================================
+ Hits         3139     3164      +25     
  Misses        423      423              
Impacted Files Coverage Δ
src/maggma/cli/__init__.py 86.58% <100.00%> (+4.04%) ⬆️
src/maggma/cli/settings.py 100.00% <100.00%> (ø)

... and 3 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@rkingsbury
Copy link
Collaborator

This looks useful, thanks @tsmathis! I just have a few comments:

  1. To keep maggma as light as possible, I think memray should be included in requirements-optional.txt rather than core requirements.txt, and excluded from setup.py. @munrojm let us know if you have other thoughts
  2. Please add a simple test. While I don't think we need to add a test of the memray output per se, we should have a test that verifies mrun works as expected when the memray flag is invoked, and also that specifying a non-default directory for the memory profile works
  3. Please add a section to the Running Builders docs explaining how this option works. The contents of your first post on this PR would be just about perfect

@tsmathis tsmathis marked this pull request as draft June 7, 2023 18:11
@tsmathis
Copy link
Collaborator Author

tsmathis commented Jun 7, 2023

Thanks for the comments, @rkingsbury! I will address your points before reopening!

@munrojm
Copy link
Member

munrojm commented Jun 8, 2023

I second what @rkingsbury has suggested. Other than that I’m good with merging.

@tsmathis
Copy link
Collaborator Author

  1. An additional flag ,-md, has been added to allow specification of a non-default output directory.
  2. Two tests added to test_init.py verifying that (1) mrun behaves as expected when memray is enabled, and (2) specifying a non-default directory works.
  3. A section ("Profiling Memory Usage of Builders") describing usage of the memray flag with mrun and its options has been added to the running builders docs

@tsmathis tsmathis marked this pull request as ready for review June 12, 2023 22:16

## Profiling Memory Usage of Builders

`mrun` can optionally profile the memory usage of a running builder by using the Memray Python memory profiling tool ([Memray](https://github.com/bloomberg/memray)). To get started, `maggma` will first need to be installed from source ([Maggma installation](https://materialsproject.github.io/maggma/#installation-from-source)) followed by `pip` installing Memray using `pip install memray`, or by installing the optional `maggma` requirements by using `pip install requirements-optional.txt` in the `maggma` base directory.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it necessary to install maggma from source in order for this to work? Is it not possible to pip install maggma and then install the optional memray?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah you are right, I will correct that

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

@rkingsbury
Copy link
Collaborator

Thanks for addressing comments @tsmathis , this looks great! I just have one small question about the installation requirements; otherwise I think this is ready to merge.

… doc in favor of just pip installing maggma plus memray
@rkingsbury rkingsbury merged commit c3c6a1f into main Jun 13, 2023
@rkingsbury
Copy link
Collaborator

Thanks @tsmathis !

@rkingsbury rkingsbury deleted the memray_profiler branch June 13, 2023 20:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants