-
-
Notifications
You must be signed in to change notification settings - Fork 272
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: this refactors execution to properly use the process pool #352
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change does seem reasonable, although I would prefer to make it configurable as our test coverage is not complete as of now, and I would err on the side of not modifying the default runtime behavior, as this improvement has additional execution scope attached to it (it reads and writes temporary files)
Prior to this change the process pool was created and destroyed for every configuration, this would cause the cpu/memory to thrash and improperly allocate task to avaialble cpu, resulting in sometimes 25% utilization of available cpu resources The change corrects this, as well as tackles memory problems, by writing temporary results to disk and then reading them back at the end of the simulation. This is non configurable in this commit, and can also result in loading too much memory, as it does not include the ability to progressively or lazy load data into the final dataframe to complete the simulation. This commit is a WIP fixes cadCAD-org#351
This feature refactors the parallel processing to use lazy evaluation When simulation data grows, keeping the data in memory causes memory bloat, which causes too Ram to be consumed. The refactor uses lazy_flattens to keep the memory Python consumes low, the change was able to reduce consumption by a factor of 10 in many cases. The change also writes temporary files to disk before lazily rearranging the simulation results to fit the cadCAD expected format for a regular pandas dataframe.
@danlessa PTAL I included the memory profiler and a test headless script to allow you to play with different sizing, the initial results are quite good for lazy evaluation, so good that I don't think it'll be necessary to write results to disk. Or rather that can be another enhancement down the line. In the issue i'll redetail it for people. The main thing I believe is the datastructure of the results being returned from the simulation in memory. The nested arrays, seem to just eat it. I created a simple ~200kb simulation and addid a memory profile on execute, in what exists now in parallel mode it consumes 1gb of memory to to return back a result, with lazy evaluation its down to 117mb from a starting postion of 115mb. I stripped down the simulation to be able to get an assessment of the impact. I will rm the profiler and resubmit if things generally look good, I just wanted people to be able to check out the commit and play with it. The second thing here was that there are alot of checks for maybe "odd" or deprecated configs, my thinking was that this feature doesn't need to support everything that cadcad ever supported, so I simply rm'd those checks. If those have significant meaning or use case I can add those back, my thinking was that it was probably support for older configurations that were in use. Let me know what you think about this direction, and I will rm the memory part and keep on iterating. Additional results in the issue |
Wanted to add a clarifying comment here, when back filled with the df it expands to 846mb from 860 pre eval, but the lazy evaluation gives people a potential option to write to disk if they feel like it , it might be worth while to have easy run return a lazy object as well, to allow people to save the data without writing it all to disk or to use another system to store it for analysis like https://github.com/vaexio/vaex without hitting the memory limitations of applying transformations all at once |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM, still, we need the remove the profiler code
Prior to this change the process pool was created and destroyed for every configuration, this would cause the cpu/memory to thrash and improperly allocate task to avaialble cpu, resulting in sometimes 25% utilization of available cpu resources
The change corrects this, as well as tackles memory problems, by writing temporary results to disk and then reading them back at the end of the simulation.
This is non configurable in this commit, and can also result in loading too much memory, as it does not include the ability to progressively or lazy load data into the final dataframe to complete the simulation.
This commit is a WIP fixes #351