Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX-#5045: Fix ray virtual_partition.wait with duplicate object refs #5058

Merged
merged 4 commits into from
Oct 4, 2022

Conversation

noloerino
Copy link
Collaborator

@noloerino noloerino commented Sep 28, 2022

Signed-off-by: Jonathan Shi jhshi@ponder.io

Introduces a new wrapper method for ray.wait that automatically gets num_returns to wait for every item in the list, and also removes duplicate object refs before passing it to ray.

What do these changes do?

  • commit message follows format outlined here
  • passes flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
  • passes black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
  • signed commit with git commit -s
  • Resolves BUG: virtual_partition.wait failed if there are duplicate refs #5045
  • tests added and passing
  • module layout described at docs/development/architecture.rst is up-to-date
  • added (Issue Number: PR title (PR Number)) and github username to release notes for next major release

… object refs

Signed-off-by: Jonathan Shi <jhshi@ponder.io>
@noloerino noloerino requested a review from a team as a code owner September 28, 2022 23:30
Signed-off-by: Jonathan Shi <jhshi@ponder.io>
@codecov
Copy link

codecov bot commented Sep 28, 2022

Codecov Report

Merging #5058 (a31649a) into master (3ee4fa0) will increase coverage by 4.65%.
The diff coverage is 88.88%.

@@            Coverage Diff             @@
##           master    #5058      +/-   ##
==========================================
+ Coverage   85.00%   89.65%   +4.65%     
==========================================
  Files         253      254       +1     
  Lines       19130    19416     +286     
==========================================
+ Hits        16261    17408    +1147     
+ Misses       2869     2008     -861     
Impacted Files Coverage Δ
...mentations/pandas_on_ray/partitioning/partition.py 90.75% <50.00%> (ø)
modin/core/execution/ray/common/utils.py 91.37% <100.00%> (+0.47%) ⬆️
...ns/pandas_on_ray/partitioning/partition_manager.py 71.73% <100.00%> (+4.34%) ⬆️
...ns/pandas_on_ray/partitioning/virtual_partition.py 94.11% <100.00%> (+2.94%) ⬆️
modin/logging/config.py 94.59% <0.00%> (-1.30%) ⬇️
modin/experimental/batch/test/test_pipeline.py 90.21% <0.00%> (ø)
modin/pandas/series.py 94.53% <0.00%> (+0.23%) ⬆️
modin/pandas/groupby.py 93.77% <0.00%> (+0.23%) ⬆️
modin/pandas/series_utils.py 98.89% <0.00%> (+0.55%) ⬆️
...odin/core/storage_formats/pandas/query_compiler.py 96.40% <0.00%> (+0.55%) ⬆️
... and 39 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Signed-off-by: Jonathan Shi <jhshi@ponder.io>

Returns
-------
Tuple[List[ObjectIDType], List[ObjectIDType]]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we return only one object here, i.e. just a list?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe if we specify the num_returns parameter, then we wait until all of the passed ObjectRefs are available in the object store.

cc @modin-project/modin-ray

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think in this case since we're waiting on every single object, all of them would be completed and this would return ([list of objects], ()). I kept the function like this so it would match with the behavior of ray.wait, though we don't actually use the return value of the function anywhere.

@YarShev YarShev merged commit 238a428 into modin-project:master Oct 4, 2022
@noloerino noloerino deleted the ray-wait-dup branch November 7, 2022 23:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: virtual_partition.wait failed if there are duplicate refs
3 participants