Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

job patterns for partitioning lists and mapping onto them #2297

Merged
merged 12 commits into from
Jun 24, 2024

Conversation

zulissimeta
Copy link
Contributor

@zulissimeta zulissimeta commented Jun 21, 2024

Summary of Changes

This draft PR adds job patterns that are common in high-throughput workflows. When running many jobs on the same flow (say result = map(my_job, range(1000))) , many workflow managers will get sad (network/db/load issues) with too many jobs. This PR is inspired by the dask bag partitions and map operator.

For example:

@job
def testjob(**kwargs):
    print(kwargs)

@flow
def testflow():
    num_partitions = 2
    result = map_partitioned_lists(
        testjob,
        num_partitions=num_partitions,
        test_arg_1=partition([1,2,3,4,5], num_partitions),
        test_arg_2=partition(["a", "b", "c","d","e"], num_partitions),
    )

testflow()

should yield:

{'test_arg_1': 1, 'test_arg_2': 'a'}
{'test_arg_1': 2, 'test_arg_2': 'b'}
{'test_arg_1': 3, 'test_arg_2': 'c'}
{'test_arg_1': 4, 'test_arg_2': 'd'}
{'test_arg_1': 5, 'test_arg_2': 'e'}

but run in only two jobs (instead of 5).

In addition, by using a specified number of partitions, logic flows can quickly move from one step to the next without waiting for any intermediate results.

many_results = generate_many_objects()

# Cannot continue until many_results is generated as the number of jobs to be generated is unknown
final_results = [analysis_job(result) for result in results]

while

many_results = generate_many_objects()
partitions_results = partition(many_results, num_partitions=10)
final_results = mapped_partitioned_lists(analysis_job, result=partitioned_results, num_partitions=10)

is fine since it is clear there is a first job, a second partition job that yields 10 tasks, and a final mapping job that has 10 jobs (one per partition).

@buildbot-princeton
Copy link
Collaborator

Can one of the admins verify this patch?

Copy link

codecov bot commented Jun 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.07%. Comparing base (6a5d2c8) to head (31e2a26).

Current head 31e2a26 differs from pull request most recent head 4cbae69

Please upload reports for the commit 4cbae69 to get more accurate results.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2297   +/-   ##
=======================================
  Coverage   99.07%   99.07%           
=======================================
  Files          82       83    +1     
  Lines        3445     3464   +19     
=======================================
+ Hits         3413     3432   +19     
  Misses         32       32           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@Andrew-S-Rosen Andrew-S-Rosen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @zulissimeta! This is an interesting PR.

If I understand correctly, it seems like this is logic that would need to be put within the @flows themselves, right? So, the ideal use case here is someone is importing a pre-made @job (e.g. quacc.recipes.mlp.core import relax_job) and making a custom @flow for themselves from that, right? It would perhaps be a bit difficult to justify when/where to add this logic within pre-made @flows in quacc itself (i.e. in quacc.recipes).

Some (non-substantial) comments below while I await your reply.

src/quacc/wflow_tools/job_patterns.py Outdated Show resolved Hide resolved
src/quacc/wflow_tools/job_patterns.py Outdated Show resolved Hide resolved
src/quacc/wflow_tools/job_patterns.py Outdated Show resolved Hide resolved
tests/core/wflow/test_job_patterns.py Outdated Show resolved Hide resolved
tests/core/wflow/test_job_patterns.py Outdated Show resolved Hide resolved
src/quacc/wflow_tools/job_patterns.py Outdated Show resolved Hide resolved
src/quacc/wflow_tools/job_patterns.py Outdated Show resolved Hide resolved
src/quacc/wflow_tools/job_patterns.py Outdated Show resolved Hide resolved
src/quacc/wflow_tools/job_patterns.py Outdated Show resolved Hide resolved
tests/core/wflow/test_job_patterns.py Show resolved Hide resolved
@Andrew-S-Rosen Andrew-S-Rosen self-assigned this Jun 22, 2024
@zulissimeta
Copy link
Contributor Author

Thanks, @zulissimeta! This is an interesting PR.

If I understand correctly, it seems like this is logic that would need to be put within the @flows themselves, right? So, the ideal use case here is someone is importing a pre-made @job (e.g. quacc.recipes.mlp.core import relax_job) and making a custom @flow for themselves from that, right? It would perhaps be a bit difficult to justify when/where to add this logic within pre-made @flows in quacc itself (i.e. in quacc.recipes).

Some (non-substantial) comments below while I await your reply.

Yes, exactly! If you want a flow that does many jobs in parallel (for example, if you made a bulk_to_adsorbates_flow that generated hundreds of possible adsorbate configuration and had ML potentials to make relaxations fast), this would be helpful.

Partitioning and batching would also be helpful if doing lots of inference; then you could partition a list, and apply a function to take batches and do ML inference quickly, rather than running one MLP relaxation as a separate job.

@Andrew-S-Rosen
Copy link
Member

Got it, thanks! Since it is pretty independent from existing recipe logic, I don't have much reservation about this. We will just want to add a brief section to the documentation somewhere highlighting how it can be used since it's somewhat of an "advanced" (but useful!) feature. I am happy to take care of the docs though.

@Andrew-S-Rosen
Copy link
Member

Thanks! Setting this to auto-merge now.

@Andrew-S-Rosen Andrew-S-Rosen enabled auto-merge (squash) June 24, 2024 17:28
@Andrew-S-Rosen Andrew-S-Rosen disabled auto-merge June 24, 2024 17:31
@Andrew-S-Rosen Andrew-S-Rosen merged commit c0ab453 into Quantum-Accelerators:main Jun 24, 2024
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants