Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Please restore the loop param to staggered.staggered_race #124639

Open
bdraco opened this issue Sep 26, 2024 · 25 comments
Open

Please restore the loop param to staggered.staggered_race #124639

bdraco opened this issue Sep 26, 2024 · 25 comments
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes topic-asyncio type-bug An unexpected behavior, bug, or error

Comments

@bdraco
Copy link
Contributor

bdraco commented Sep 26, 2024

Bug report

Bug description:

The following remove the loop param from staggered.staggered_race :

#124390
#124574
#124573

its wasn't scheduled for removal until 3.16 this never merged

CPython versions tested on:

3.12, 3.13, CPython main branch

Operating systems tested on:

No response

Linked PRs

@Eclips4
Copy link
Member

Eclips4 commented Sep 26, 2024

Hello!
Thanks for the report.

asyncio.staggered_race should not be used directly. It is a private API that is not documented anywhere in the documentation. If you want to use staggered_race, please install the 3rd party package async-stagger: https://github.com/twisteroidambassador/async_stagger.

@bdraco bdraco closed this as completed Sep 26, 2024
@Eclips4 Eclips4 closed this as not planned Won't fix, can't repro, duplicate, stale Sep 26, 2024
@bdraco
Copy link
Contributor Author

bdraco commented Sep 26, 2024

Its a bit unexpected to have deprecation warnings in it given its not intended for public use. Its also not possible to use https://github.com/twisteroidambassador/async_stagger since it only supports python 3.11+ https://github.com/twisteroidambassador/async_stagger/blob/e09eeefd5d5c123d79abb8ac7f1b5583e8ff3203/setup.py#L30

@ZeroIntensity
Copy link
Member

ZeroIntensity commented Sep 26, 2024

FWIW, there was no deprecation emitted. That got removed right before the PR got merged. @Eclips4 should we reopen this? I guess this is too much of a breaking change for a patch release.

@ZeroIntensity ZeroIntensity reopened this Sep 26, 2024
@ZeroIntensity ZeroIntensity added 3.12 bugs and security fixes 3.14 new features, bugs and security fixes labels Sep 26, 2024
@ZeroIntensity
Copy link
Member

@bdraco, we're going to add back the loop parameter for 3.12. (I'm not sure what the deprecation plan is, I think it's supposed to be 3.14?)

Anyways, this will allow use of eager_task_factory on prior aiohttp versions. However, we're officially marking staggered_race as a private API (and will rename it to _staggered_race). Please copy the current implementation into third parties for future use, as we're not guaranteeing any future support past 3.12 (or maybe 3.13, I think that's undecided).

@bdraco
Copy link
Contributor Author

bdraco commented Sep 26, 2024

Thank you!

@ZeroIntensity
Copy link
Member

@Eclips4 are you working on this? Or should I write a patch (sorry, should have clarified yesterday).

@kumaraditya303
Copy link
Contributor

we're going to add back the loop parameter for 3.12
we're officially marking staggered_race as a private API (and will rename it to _staggered_race).

Where was this decided? was this discussed at the ongoing sprints?

@ZeroIntensity
Copy link
Member

we're going to add back the loop parameter for 3.12
we're officially marking staggered_race as a private API (and will rename it to _staggered_race).

Where was this decided? was this discussed at the ongoing sprints?

On Discord. @ambv said it, I think?

@kumaraditya303
Copy link
Contributor

On Discord. @ambv said it, I think?

I don't agree with adding back loop parameter and ignoring it, it is wrong to do.

@Yhg1s
Copy link
Member

Yhg1s commented Sep 27, 2024

I'm sorry, but our backward compatibility and change policy (https://peps.python.org/pep-0387/) is clear in this. The original PR (#124390) makes changes that are not suitable for backport, and that includes both 3.12 and 3.13 (at this point). It's also not suitable to do in 3.14 without a deprecation period or an exemption to the policy (see the last item in https://peps.python.org/pep-0387/#basic-policy-for-backwards-compatibility), but that's @hugovk's call.

Please roll back the changes in 3.12 and 3.13, since releases are coming this tuesday.

@Eclips4
Copy link
Member

Eclips4 commented Sep 27, 2024

I'm sorry, but our backward compatibility and change policy (https://peps.python.org/pep-0387/) is clear in this. The original PR (#124390) makes changes that are not suitable for backport, and that includes both 3.12 and 3.13 (at this point). It's also not suitable to do in 3.14 without a deprecation period or an exemption to the policy (see the last item in https://peps.python.org/pep-0387/#basic-policy-for-backwards-compatibility), but that's @hugovk's call.

Please roll back the changes in 3.12 and 3.13, since releases are coming this tuesday.

Thank you, Thomas, for your feedback on this!
I understand that you would like to revert the changes (including the bug fix), is that correct?

@Yhg1s
Copy link
Member

Yhg1s commented Sep 27, 2024

I have no opinion on the bugfix, but the removal of a keyword argument from a non-private API can't happen in the backport (and can't happen in main without following PEP 387).

@bdraco
Copy link
Contributor Author

bdraco commented Sep 27, 2024

It would be nice to keep the bugfix, although the new code seems to cause aiohttp CI to hang on python 3.13 only (aio-libs/aiohttp#9311). I have not have been able to find the root cause yet though and it may be something we are doing wrong.

@ZeroIntensity
Copy link
Member

ZeroIntensity commented Sep 27, 2024

@Yhg1s is it needed to revert the whole fix? The nicer option would be to add loop back to all versions with a 3.16 removal plan and just have it do nothing, or alternatively, we could use the private API of TaskGroup to make sure that the loop argument actually persists.

@kumaraditya303
Copy link
Contributor

I am adding back the loop param but the PR doesn't ignores it completely, it actually uses it. This is the best fix for the time being.

@kumaraditya303 kumaraditya303 added the 3.13 bugs and security fixes label Sep 28, 2024
@ZeroIntensity
Copy link
Member

FWIW, this doesn't affect 3.13 yet, because gh-124573 hasn't been merged.

@ambv
Copy link
Contributor

ambv commented Sep 28, 2024

I am adding back the loop param but the PR doesn't ignores it completely, it actually uses it.

That's what I suggested on Discord.

@gvanrossum
Copy link
Member

But IIUC there's a 3.12 release planned for Tuesday too, and that is affected.

I am still unclear on the use case for passing loop=. The test (which existed before this changed and which Kumar is proposing to restore as part of the fix here) just passes a loop that becomes the current loop by the time anything runs.

How is aiohttp using the loop= argument? I would love to see a test that creates a multi-loop scenario similar to what aiohttp uses, and that passes in older 3.12 releases. We can then check if the same test passes in updated 3.12+ branches.

This feels urgent because of the expected 3.12 release. If we can't get confidence that Kumar's hack (patching tg._loop) will actually work with the older aiohttp versions, we should roll back the 3.12 backport of #124390 (the 3.13 backport wasn't merged, so it's safe).

Maybe someone should at least try Kumar's #124700 with an earlier (unpatched) aiohttp version?

@kumaraditya303 @bdraco

@bdraco
Copy link
Contributor Author

bdraco commented Sep 28, 2024

How is aiohttp using the loop= argument? I would love to see a test that creates a multi-loop scenario similar to what aiohttp uses, and that passes in older 3.12 releases. We can then check if the same test passes in updated 3.12+ branches.

aiohttp uses aiohappyeyeballs which is essentially a vendored subset of loop.create_connection that skips the dns resolution since aiohttp has its it own resolver implementation. This was needed because loop.create_connection has no way to pass local_addr_infos and addr_infos

In aiohappyeyeballs 2.4.0 and older loop was passed https://github.com/aio-libs/aiohappyeyeballs/blob/v2.4.0/src/aiohappyeyeballs/impl.py#L96 as the loop passing was copied from an earlier version of loop.create_connection

happy_eyeballs_delay, loop=self)

The primary concern was that users who upgrade to a cpython with the new implementation that drops the loop arg would experience breakage.

It mitigate that risk, we removed passing loop in aiohappyeyeballs 2.4.1+ as we assume there are no use cases where loop needs to be passed (hopefully we are not wrong on that). The net result is this hopefully limits the potential breakage to users who have aiohappyeyeballs < 2.4.1 and upgrade their cpython.

I will test #124700 with aiohappyeyeballs 2.4.0 which was the last version to pass the loop arg.

@bdraco
Copy link
Contributor Author

bdraco commented Sep 29, 2024

I patched a production system using aiohttpeyeballs==2.4.0 and with cpython 3.12.4 by replacing asyncio/staggered.py with the staggered.py I checked out from #124700

  • I verified loop was passed down and set as tg._loop
  • I verified connections were established as expected.

@bdraco
Copy link
Contributor Author

bdraco commented Sep 29, 2024

Thank you for looking at this. I hope that I have explained it well enough.

@gvanrossum
Copy link
Member

Thank you for looking at this. I hope that I have explained it well enough.

Yes, thanks! So it really is the case that the loop passed in must be the currently running loop. Maybe instead of hacking tg._loop we should just check that? Put e.g.

assert loop is None or loop is events.get_running_loop()

(or better, some kind of check that raises a ValueError).

I tried modifying the test to use loop=events.get_running_loop() and it just hung; hitting ^C printed some weird errors.

miss-islington pushed a commit to miss-islington/cpython that referenced this issue Sep 29, 2024
(cherry picked from commit e0a41a5)

Co-authored-by: Kumar Aditya <kumaraditya@python.org>
kumaraditya303 added a commit that referenced this issue Sep 29, 2024
…124744)

GH-124639: add back loop param to staggered_race (GH-124700)
(cherry picked from commit e0a41a5)

Co-authored-by: Kumar Aditya <kumaraditya@python.org>
@graingert
Copy link
Contributor

graingert commented Sep 29, 2024

What's the purpose of the loop kwarg here? Is it to avoid an expensive call to get_running_loop? If so adding the assertion will just remove that perf fix

@bdraco
Copy link
Contributor Author

bdraco commented Sep 29, 2024

What's the purpose of the loop kwarg here? Is it to avoid an expensive call to get_running_loop? If so adding the assertion will just remove that perf fix

At this point the concern is historical compatibility only as get_running_loop is so much faster in newer cpython.

@bdraco
Copy link
Contributor Author

bdraco commented Sep 30, 2024

The new implementation doesn't seem to handle multiple winners in the same iteration of the event loop as we received aio-libs/aiohappyeyeballs#100 today

Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniforge/base/envs/work/lib/python3.12/site-packages/aiohappyeyeballs/_staggered.py", line 85, in run_one_coro
    assert winner_index is None  # noqa: S101
           ^^^^^^^^^^^^^^^^^^^^
AssertionError

I was able to write a test to simulate the problem https://github.com/aio-libs/aiohappyeyeballs/blob/e3519bbebf2069eee0aff0dfde50689c742ba97f/tests/test_staggered.py#L33

    loop = asyncio.get_running_loop()
    winners = []
    finish = loop.create_future()

    async def coro(idx):
        await finish
        winners.append(idx)
        return idx

    coros = [partial(coro, idx) for idx in range(4)]

    task = loop.create_task(staggered_race(coros, delay=0.00001))
    await asyncio.sleep(0.1)
    loop.call_soon(finish.set_result, None)
    winner, index, excs = await task

I ended up writing a new implementation for aiohappyeyeballs that should work on 3.8-3.13+ https://github.com/aio-libs/aiohappyeyeballs/blob/main/src/aiohappyeyeballs/_staggered.py

ZeroIntensity added a commit to ZeroIntensity/cpython that referenced this issue Sep 30, 2024
Yhg1s pushed a commit that referenced this issue Oct 1, 2024
…am (#124810)

* Revert "GH-124639: add back loop param to staggered_race (#124700)"

This reverts commit e0a41a5.

* Revert "gh-124309: Modernize the `staggered_race` implementation to support eager task factories (#124390)"

This reverts commit de929f3.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Oct 1, 2024
…wnstream (pythonGH-124810)

* Revert "pythonGH-124639: add back loop param to staggered_race (pythonGH-124700)"

This reverts commit e0a41a5.

* Revert "pythongh-124309: Modernize the `staggered_race` implementation to support eager task factories (pythonGH-124390)"

This reverts commit de929f3.
(cherry picked from commit 133e929)

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Yhg1s pushed a commit that referenced this issue Oct 1, 2024
…ownstream (GH-124810) (#124817)

gh-124309: Revert eager task factory fix to prevent breaking downstream (GH-124810)

* Revert "GH-124639: add back loop param to staggered_race (GH-124700)"

This reverts commit e0a41a5.

* Revert "gh-124309: Modernize the `staggered_race` implementation to support eager task factories (GH-124390)"

This reverts commit de929f3.
(cherry picked from commit 133e929)

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes topic-asyncio type-bug An unexpected behavior, bug, or error
Projects
Status: In Progress
Development

No branches or pull requests

8 participants