Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redis Sans-IO for Dual Async and Sync Support. #2039

Closed
seandstewart opened this issue Mar 8, 2022 · 11 comments
Closed

Redis Sans-IO for Dual Async and Sync Support. #2039

seandstewart opened this issue Mar 8, 2022 · 11 comments
Labels

Comments

@seandstewart
Copy link

Hey folks - I've been working hard on the performance issue the currently asyncio implementation faces when compared to https://github.com/aio-libs/aioredis-py/tree/v1.3.1

We're tracking this as part of the meta-issue aio-libs-abandoned/aioredis-py#1225 and is considered a top priority to encourage folks to adopt the async client. (Here's a direct link to the issue: aio-libs-abandoned/aioredis-py#1208)

As part of this work, I've also been cognizant of the overall maintenance burden of maintaining two client implementations in a single code-base. Using the principles laid out in https://sans-io.readthedocs.io/, I have taken a stab at implementing the core client logic in a Sans-IO, event-driven model. You can see the work here:

https://github.com/seandstewart/redis-py-sansio

I think this could be a very worthwhile endeavor to take on - it allows the codebase to diverge only in minimal ways for the specific IO implementations, as illustrated in the high-level client & connection modules.

Looking forward to more collaboration!

@seandstewart
Copy link
Author

cc: @chayim, @Andrew-Chen-Wang

I'd love to discuss this more, let me know your thoughts 👍

@Andrew-Chen-Wang
Copy link
Contributor

This looks great! Thanks so much for this! @seandstewart


Code-wise:

On an initial look through, it looks like the problem of ordering has been resolved really well. I love the variable docs. It was interesting reading through io module with the Events. A concern I have would be during a connection timeout / kill that was fixed in redis-py.

I'm not entirely sure what this line is waiting for; explanation would be great!

The PythonPaser has these readprimitive methods. Are these the only ones we need?

Nit picks:

Unless I'm not understanding the code properly, returning exceptions is a little... interesting? I'd expect explicit RedisError-inherited exceptions. Such as here: https://github.com/seandstewart/redis-py-sansio/blob/7400a38ae9fa92db7e1164a5731036675f7e3551/redis/sansio/_parser.py#L149 it would be nice to have

class RedisError(Exception):
    def __init__(self, exc=None):
        self.exc = exc
...
except Exception as e: raise RedisError(e)

Coming from a Django background, I'd prefer if things were explicit/verbose. For instance, many people already know what kw in kw_only means, but I'd still prefer the long kwargs, plural.


Organizationally, mostly for @chayim:

I think organizationally, the asyncio module can use the new code base as many people are getting used to the new code anyways + the interface hasn't changed much whereas the sync client would still default to the current implementation in redis-py with an additional option being the Sans IO implementation.


I will take a better look at this sometime in the last week of this month. Thanks again @seandstewart !!!

@chayim
Copy link
Contributor

chayim commented Mar 10, 2022

I really like this approach. Equally I echo explicit versus implicit choices. While personally I prefer implicit, for library and communal use I think explicit, and even verbose is more valuable.

I'd generally like to see the clients themselves become simpler, and rely on fixed features that are then included, and perhaps overridden re-implemented as necessary (clearly timeout is an example). I've been thinking of introducing a ConnectionFactory, as part of this - which in turn would make it easier for developers to discover connection types, though I don't yet understand how that would impact the async world.

@seandstewart First off, loving this. Second - WDYT?

@seandstewart
Copy link
Author

@Andrew-Chen-Wang @chayim -

Thanks for taking a look at this! I'm so glad to find us all in this place and I appreciate the positive response to these efforts. I'll try and respond to each question in turn 👍

  1. Missing timeout/ bug-fixes.
    • Yes, the IO implementations are not complete, they’re meant to be a starting place, there are missing features/patches.
    • An important feature of each is they are meant to be “safe” for concurrency primitives (task-safe for asyncio and thread-safe for syncio) without the use of locks. This was a learning from analyzing the aioredis@1.3 implementation, which is, honestly, brilliant.
  2. "What is the Parser implementation 'waiting' for?"
    • The pure-Python parser is implemented using a generator (here be 🐉). The wait...() methods force the generator to yield if there is no data in the parse buffer. This allows the main process to context-switch, which in turn opens other workloads within the process to fill the buffer.
  3. The PythonPaser has these read<primitive>() methods. Are these the only ones we need?
    • Yes, these are the primitives which are implemented by the RESP3 protocol.
  4. Returning Exceptions vs. Raising Exceptions.
    • The PythonReader is implemented to mirror the behavior of hiredis.Reader. This allows our protocol to not care which one is used, which simplifies the implementation.
    • There are three classes of error modes which may occur:
      1. ResponseError: always returned, it’s the parsed value of an ERR response.
      2. InvalidResponse: always raised, the server returned an un-parseable response (likely we’re not talking to a valid Redis server).
      3. Any other exception: always raised. The only reason we catch a bare Exception is to ensure we clear the buffer. We don’t want to wrap the error at this low level, that happens in the Protocol.
      4. Wrapping these errors would introduce a behavioral change between the two readers, which we want to avoid.
  5. “many people already know what kw in kw_only means, but I'd still prefer the long kwargs, plural.”
    • I think this is in reference to the kw_only parameter for attr.define. I don’t have any control over that. FWIW, Python 3.10 adopted this same parameter for its dataclass implementation. I’m not adamant that we use attrs for these data objects, but it makes the definition of slot-based classes much simpler. Once 3.10 is the LCD, I’d advocate we drop the dependency.
  6. Implicit vs. Explicit
    • I prefer explicit, which is what I think (hope) this approach provides - an explicit protocol for encoding and decoding Redis Server & Client events.
  7. ConnectionFairy for discovery of other types of connections.
    • This is interesting - I’d love to learn more.
    • One of the things I learned while working on this was that the class-level differentiation between SSL, UDF, and other connection implementations is they’re unnecessary. Their behavior is implemented without a high level of abstraction or duplication in this approach, and as a benefit, they reduce the burden of consumers of this library. More than anything, these are configuration values to pass onto the socket constructor.
    • When it comes to IO-specific implementations, I think I see a use case, but again, I’d love to learn more. If we do this right then it could be possible to provide not only a sync and async implementation, but a tornado or trio implementation with minimal cost.

Super stoked to be collaborating on this and can’t wait to see where this discussion leads! Hope these responses are helpful.

@synodriver
Copy link

Looks like we've done the same thing. I have just write a resp protocol parser in python. The difference is sioresp only contains a parser, without any io-related implementation.

@github-actions
Copy link
Contributor

This issue is marked stale. It will be closed in 30 days if it is not updated.

@Andrew-Chen-Wang
Copy link
Contributor

This should be re-opened

@kylebebak
Copy link

Agreed that this should be reopened, especially if there's still a gap in performance between https://github.com/aio-libs/aioredis-py/tree/v1.3.1 and redis-py

@chayim chayim reopened this Apr 30, 2023
@chayim
Copy link
Contributor

chayim commented Apr 30, 2023

Sorry folks, auto-closed, reopened.

@github-actions github-actions bot removed the Stale label May 1, 2023
@seandstewart
Copy link
Author

I haven't forgotten about this proposal! Things have evolved over the last year, but I would still love to find a path forward with this 🚀

Copy link
Contributor

This issue is marked stale. It will be closed in 30 days if it is not updated.

@github-actions github-actions bot added the Stale label May 17, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Sep 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants