Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AsyncGenerator → AsyncIterator #2381

Merged
merged 1 commit into from
Nov 8, 2024

Conversation

DimitriPapadopoulos
Copy link
Contributor

@DimitriPapadopoulos DimitriPapadopoulos commented Oct 15, 2024

Fixes #2377.

See microsoft/pyright#4741.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@DimitriPapadopoulos DimitriPapadopoulos marked this pull request as ready for review October 16, 2024 06:29
@paraseba
Copy link
Contributor

I think a better type for these functions would be:

async def list(self) -> AsyncIterator[str]:

Most implementations probably won't need these functions to be async, but it's a more general type for some implementation that may need it.

src/zarr/abc/store.py Outdated Show resolved Hide resolved
@DimitriPapadopoulos
Copy link
Contributor Author

DimitriPapadopoulos commented Oct 16, 2024

I think a better type for these functions would be:

async def list(self) -> AsyncIterator[str]:

The roadmap defines the Store API as follows and should be updated:

    async def list(self) -> List[str]:
        ...  # required for listable stores

    async def list_prefix(self, prefix: str) -> List[str]:
        ...  # required for listable stores

    async def list_dir(self, prefix: str) -> List[str]:
        ...  # required for listable stores

All current child classes use the prefix argument. I don't think it should be removed.

As for the return value, not sure why it changed from List[str] to AsyncGenerator[str, None] in #1844. Switch back to AsyncGenerator[str]?

@paraseba
Copy link
Contributor

All current child classes use the prefix argument. I don't think it should be removed.

Sorry, I just missed the argument in the example, definitely we shouldn't drop it. My point was about returning AsyncIterator instead of AsyncGenerator, the later is more of an implementation detail. We definitely don't want list because that would force the whole thing to live in memory at once.

@DimitriPapadopoulos
Copy link
Contributor Author

DimitriPapadopoulos commented Oct 16, 2024

Meanwhile, I tried to revert the return type from AsyncGenerator[str, None] to AsyncGenerator[str], to see what happens.

As for the AsyncGeneratorAsyncIterator change, how about addressing it in an issue and a separate PR?

@paraseba
Copy link
Contributor

As for the AsyncGenerator → AsyncIterator change, how about addressing it in an issue and a separate PR?

It's definitely not a big deal, AsyncGenerator[str] also works. But, doing it in separate PR would require changing the abc and every implementation twice.

@DimitriPapadopoulos
Copy link
Contributor Author

OK, let me try then. It's just that's I'll have to modify all child classes.

src/zarr/abc/store.py Outdated Show resolved Hide resolved
if p.is_file():
yield str(p).replace(to_strip, "")
allfiles = [str(p).replace(to_strip, "") for p in self.root.rglob("*") if p.is_file()]
return (onefile for onefile in allfiles)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't want to materialize the full allfiles list in memory. Can you use async for instead? Here is an example of how you could do this:

from collections.abc import AsyncIterator
from abc import ABC, abstractmethod

class Foo(ABC):
    @abstractmethod
    async def foo(self) -> AsyncIterator[str]:
        ...


class Bar(Foo):

    async def helper(self) -> AsyncIterator[str]:
        raise ValueError("")

    async def foo(self) -> AsyncIterator[str]:
        return (x async for x in await self.helper())

This code should work and mypy successfully, and doesn't materialize the results of calling helper.

Copy link
Contributor Author

@DimitriPapadopoulos DimitriPapadopoulos Oct 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little bit lost here.

  • What we have here are functions that yield. That's a generator. Why create an iterator?
  • I'm afraid I don't have enough experience with async and no time to learn it in the short term. For example, I'm not even sure how to identify iterables and awaitables.

I suggest We keep AsyncGenerator for now. What would be the benefit of an AsyncIterator in this context?

@jhamman
Copy link
Member

jhamman commented Nov 5, 2024

@DimitriPapadopoulos - are you planning to return to this?

@DimitriPapadopoulos
Copy link
Contributor Author

I tried briefly, but I cannot get typing to work after rebasing. I need to take a thorough look.

@DimitriPapadopoulos
Copy link
Contributor Author

@DimitriPapadopoulos DimitriPapadopoulos changed the title Make methods of abstract base class async AsyncGenerator → AsyncIterator Nov 7, 2024
@DimitriPapadopoulos DimitriPapadopoulos marked this pull request as ready for review November 7, 2024 17:31
Copy link
Member

@jhamman jhamman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy here (except for the requested docstring changes).

Comment on lines 339 to 343
Notes
-----
This method should be async,
`How to correctly specify type hints with AsyncGenerator and AsyncContextManager <https://stackoverflow.com/questions/68905848/how-to-correctly-specify-type-hints-with-asyncgenerator-and-asynccontextmanager>`_
explains why it is not.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to move this to a comment rather than the public docstring. Most implementations will inherit this and end users don't need to know about this implementation detail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I wasn't sure about it. I put it in the documentation because it documents an inconsistency in function signatures.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, I would document this using something similar to disabling Pylint warning W0236:

# pylint: disable=invalid-overridden-method

Unfortunately there's no equivalent rule in ruff, as far as I can see.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@jhamman jhamman merged commit 693e11c into zarr-developers:main Nov 8, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inconsistent overridden method
3 participants