Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RESP3 Support #2353

Closed
uglide opened this issue Jan 26, 2023 · 39 comments
Closed

RESP3 Support #2353

uglide opened this issue Jan 26, 2023 · 39 comments

Comments

@uglide
Copy link

uglide commented Jan 26, 2023

Since RESP3 is slowly moving to production, I want to discuss adding RESP3 support in StackExchange.Redis.

We at @redis want to move this forward and are happy to contribute PR. Considering the massive impact and code changes required, RESP3 should be released as part of v3 release.

@mgravell @NickCraver Do you have any plans to work on it soon? Let's plan a design/brainstorming session to align on the implementation.

Feel free to contact me via email igor.malinovskiy [ at ] redis.com

@uglide uglide moved this to Todo in RESP 3 Backlog Jan 26, 2023
@NickCraver
Copy link
Collaborator

This is on our radar for @mgravell to look at in a spike when time allows, but we have no absolute dates.

@uglide
Copy link
Author

uglide commented Jan 27, 2023

@NickCraver Thanks for the reply. Please let me know if you can't work on it in the next couple of months. We can work on it with @shacharPash and submit a PR.

@mgravell
Copy link
Collaborator

mgravell commented Jan 27, 2023

The way I see it there's a few main parts here:

  • wire details: already implemented multiple times, just needs relocating
  • handshake changes: we need to perform the HELLO and either a: switch mode for the subsequent pipeline, or b: just await the HELLO completely before continuing (pragmatic, adds latency to connect, but... meh); in particular the latter applies when AUTH is used as a failure mode for HELLO
  • reader/writer switchover (mostly reader, IIRC); presumably polymorphic via some interface/base-class - not huge (but: per node) - i.e. a RespParser base class with Resp2Parser and Resp3Parser concrete implementations; or we could try to make one parser handle both protocols
  • changes to pub/sub to exploit single connection mode (perhaps also adding the id-fetch in fallback mode to support RESP2 notifications, although we could defer this)
  • API changes, in particular to ScriptResult to advertise the additional sub-kind metadata (I propose a second property, and shimming the first one so sub-kinds report a suitable old-style kind, so existing switch blocks don't suddenly work differently); I don't think (please correct me!) that we need any radical API changes on requests, just responses?
  • I do not propose to store/process result metadata at the moment - just skip that in the protocol layer until we have a use-case
  • validation, standalone and cloud

I'm happy to start putting a spike together, but: invitation for additional thoughts in the above?

@NickCraver
Copy link
Collaborator

One fun issue here is that we parallel instantiate the subscription connection under normal circumstances to connect ASAP, which we'd implicitly want to happen post-RESP decision here to determine if it's needed and if we did that globally would regress some scenarios (e.g. fast connections from Functions/Lambda where connect time is critical). I think we need to figure out a connection pattern where it's "no regressions, but you can opt-into assuming RESP3 from the connection options/string" in some way - thoughts on that front? Alternatively, we proactively connect the subscription and drop it but that creates other race/miss/dupe issues with the move itself on early connections.

For a particular endpoint, we may want to cache any RESP3-awareness as well (for reconnects) unlike we do today in all cases (prop on server?) but upon any error: discard that and fall back to slower.

I'll try to add questions as I think of them, but that's the big one eating at me.

@mgravell
Copy link
Collaborator

mgravell commented Jan 27, 2023 via email

@NickCraver
Copy link
Collaborator

I think we already support Redis server version via version=6.0 in the connection string, so if we assume Redis >= 6 means " default to RESP3" when explicitly configured to, that's the path? We don't actually assume v6 anywhere today, only after connecting. With $HELLO= being a way to still be v6 and disable RESP3 for any reason?

@uglide To set expectations: this would not be a v3 release on our side unless we had to make breaking changes. A major upgrade for strongly named libraries is painful for a myriad of reasons in .NET and we try to avoid that as much as possible.

@chayim
Copy link

chayim commented Feb 7, 2023

IMHO I don't believe we can rely on that assumtion. Redis >=6 doesn't mean RESP3, by default today. Today, the default within the Redis ecosystem: redis dockers, redis-stack dockers, and the default redis-server binary behaviour is RESP 3 (as of Redis 7.0.8).

But, yes - issuing hello calls such asHELLO 3 will put a supporting Redis server into RESP 3, and HELLO 2 will return the state to RESP 2 for the connection.

@NickCraver
Copy link
Collaborator

@chayim Can you please clarify the "Redis >=6 doesn't mean RESP3, by default today" means? Can you compile without it or something? We're trying to figure out what scenario this isn't true to determine the options path to take.

@chayim
Copy link

chayim commented Feb 7, 2023

Versions of the RESP protocol are specified by the hello command. Configuring redis to require a specific RESP version is a configuration file issue. Hence the clients having the ability to request a specific protocol version.

No version of redis, defaults to RESP3 today. However, clients can request it, due to the features that RESP3 helps enable.

@shacharPash can help here, especially since I know he's interested in being involved in this effort.

@mgravell
Copy link
Collaborator

mgravell commented Feb 7, 2023

We would still be responding / reacting to the result of HELLO; if it reports -ERR or -AUTH, then we would go old-school; if it reports with success but specified protocol 2 as negotiation, that's fine too. So it sounds like there isn't a problem : we can try HELLO based on our connection-string, but: we weren't suggesting switching until after the response from HELLO (in whatever form)

@NickCraver
Copy link
Collaborator

@chayim To clarify: we're talking about the client here, not the server. So this discussion is "when should we try HELLO by default?", e.g. when the server has been specified as version 6+ in our connection string we could make that assumption for attempting RESP3. Doing it in all cases would cause a failure and roundtrip for existing users which we can't do.

@chayim
Copy link

chayim commented Feb 7, 2023

Correct, we're having the same conversation. For an OSS Redis instance, you can start asking for RESP3 as of 6.0.

@mgravell
Copy link
Collaborator

mgravell commented Feb 7, 2023

"For an OSS Redis instance"

So to be explicit: is it the case that there are other v6+ builds where HELLO itself will fail? (if it works but reports protocol 2, that's fine)

@chayim
Copy link

chayim commented Feb 9, 2023

This is the proposed reserved field name, for the URI argument: redis/redis-specifications#12

@mgravell
Copy link
Collaborator

mgravell commented Feb 9, 2023

Handy, @chayim; we don't currently support the uri format (we should add it, though; the main stumbling point has been "single Vs multiple endpoints"), but the uri specifying it separately is a strong vote IMO for having it as a separate dedicated property in our own config API (perhaps with "infer defaults from the server version if that is specified" as a fallback if the protocol is not specified)

@chayim
Copy link

chayim commented Feb 9, 2023

@mgravell To make sure I grok - are you saying that some version of the connection URI allowing you to support multiple endpoints is valuable? Can you help further explain -is this to avoid probing something relying on the OSS Cluster API - and pre-build the map, or for something else? I'm fascinated!

@NickCraver
Copy link
Collaborator

@chayim We support a wider variety of connection cardinality than the URI format does today, for example connecting to multiple sentinels or cluster endpoints or just 2 random servers which may or may not be related and replicated at connection time or later, including our options which are for the multiplexer and not a single endpoint. Also: what if our named options collide now or later?

That set of conflicts were the big sticking points for integrating URL support in the baseline parser, the many:1 creates many complications and ambiguity questions in parser code. If it was handled separately and everything else was off the table (e.g. only the simplest of cases, none of our configuration options would be parsed), then it may be more approachable. I agree it's worth supporting if we can reasonably, but need to cleanly solve those things still. But, I'd imagine the #1 use case is ConnectionMultiplexer.Connect("redis://....") so the ambiguity and confusion is in play. It creates a bit of a cliff I think, because the moment you need to use one of our connection options, you need to ditch the format and switch over...or maintain 2 entirely different custom formats. If our conclusion is the latter, I'm not sure it's a net win because those options are very often used (especially abortConnect).

@mgravell
Copy link
Collaborator

mgravell commented Feb 9, 2023

if we forget the multiple endpoints problem, the features part can be solved in part, at least:

var options = ConfigurationOptions.Parse("redis://...");
options.SomethingSpecial = 27;
//...

but that then means the SomethingSpecial can't be specified via configuration; but as an alternative "we will at least parse those strings" aspect, I'm OK with adding it

@NickCraver
Copy link
Collaborator

@mgravell what happens when someone passes ConnectionMultiplexer.Connect("redis://localhost:6379?abortConnect=false")? That's my hesitation, I easily see this getting confused with our configuration options, the same set of problems from the initial round of looking at this - is that net better and less confusing for all users? We don't know all the value keys for Redis (or do we maintain that independently?) so I'm not sure we could even throw a useful error message it's in the wrong place...or am I thinking of that wrong?

@slorello89
Copy link
Collaborator

@NickCraver - I'd think you could mostly do a 1:1 mapping between the options in your configuration string/ config options and a Redis URI, the only real issues would be when there's a conflict etc. . .

rediss://foo:bar@localhost:6379?ssl=false&user=bob&password=barker

those conflicts would be relatively easy to isolate, and to continue to support > 1 cardinality of endpoints via query parameters, e.g.

redis://:password@redis-1:6379?endpoints=redis-2:6379&endpoints=redis-3:6379

would allow you to map multiple endpoints as is done now

@NickCraver
Copy link
Collaborator

@slorello89 I hear you, but I don't think that's a net win. What are we doing besides making it look kind of like a Redis URI? A few problems:

  1. That's not a URI, a URI is fundamentally a single resource locator, we're using it for multiple. There's also the implication that the first one is somehow more important than the others, which isn't true - they're all equal to us.
  2. How would we resolve a conflict? This can be a problem 2 years from now when users are on our library and a version of the server which supports some new argument they want to use - that's a non-trivial retroactive issue.
  3. When users pass an invalid argument we're not respecting today, we immediately tell them it's unrecognized and error so they can fix it instead of scratch their heads for why it's not working. We couldn't do that here, because it's an unknown and indefinite superset we do not control, so we'd have to just turn off any erroring.

Maybe we can solve these, but I'm really not convinced this is a net win. Just making it look like the URL but not actually behave like it does anywhere else is a solution, but I'm not sure to which problem. The entire point of the URI, as I understand it, is consistent behavior - if it behaves differently here, what are we doing? How are other libraries handling this?

@slorello89
Copy link
Collaborator

As you mentioned @NickCraver the dream of the Redis URI is to provide a more or less standard way of connecting to Redis. The "less" part is of course where all the pain comes from, and you bring up some really good points (particularly 1). @chayim could expand more, but most of the clients (even the ones we maintain) have an equivalent of the ConfigurationOptions struct - or some non-URI mechanism for connecting the client (often a couple parameters in the client's constructor).

Anyways, discussion of the Redis URI should probably be moved out of this thread (sorry I just wanted to add my 2¢ here)

@shacharPash
Copy link
Contributor

shacharPash commented Feb 26, 2023

Hey @mgravell, I would love to start working on it, where do you want me to start?
I can start with the handshake changes, just give me the green light.

@mgravell
Copy link
Collaborator

The problem is, I suspect this is an area where I'd have to do so much analysis to answer that and give pointers, that I'd basically have to do it to answer that. I'm not sure this is an area that lends itself well to new eyes :/

@shacharPash
Copy link
Contributor

Thanks for the feedback. I wanted to learn, and experiment, so I started coding, and took the following approach - which I think makes sense. I intend to continue and learn. Below is a link to my changes (not a PR) - but I'd love feedback, as I learn more about the project. This is only the beginning of course.

  • ConnectionMultiplexer.Connect takes string, add protocol=3 as a supported string option of configuration.

  • Add ‘string Protocol’ to ConfigurationOptions.

  • Change ConfigurationOptions::DoParse to support the protocol.

  • Add protocol version to the ConfigurationOptions::ToString().

  • Add protocol to DefaultOptionsProvider.

  • Add support for HELLO command.

  • In HandshakeAsync, if the protocol equal to 3, we will send HELLO 3 after the authentication. I'm trying to be generic (enough) here so that in the future we could think about a RESP 4, etc.

  • Create a test that validate that the response received is RESP 3: Create connections like usual, call hset+hrandfield and make sure we’re RESP 2. Then delete the connection, creates a connection using RESP, and validates that we receive a RESP3 response.

This is the beginning:

  • At first I would work with RawResult to make sure that the response is genuinely RESP 3.

  • Then, change the parser to parse results in RESP 3 protocol.

  • I intend to continue exploring but I’d love to hear your thoughts on this direction.

@NickCraver
Copy link
Collaborator

NickCraver commented Mar 5, 2023

@shacharPash If you think about a client, they don't want to care about protocols. They're using a client so that they don't have to know about protocols, or put more plainly: 99.9% of users have no idea what RESP2 is. That's the entire discussion above which this design isn't taking any advantage of. We want to do the right thing (and fall back) based on the server version, which is already a concept in connection strings, downstream assumptions, DefaultOptions providers, etc. - we don't need a new concept there because we already have what we need to derive whether we should try connecting on RESP3. In this case: Redis v6+ == RESP3 with RESP2 fallback. Now we could expose the automatic calculation (based on version, if any, probably nullable) and make it overridable in code, e.g. RESP2 on Redis Server v9 for example...but I'd only explore that if there was a practical need.

In general: simplicity is a huge feature, only get more complicated if we have to. When it comes to the surface a user sees this is even more true. I hope that context helps better frame the discussion above and why we're talking about defaults and when to do what.

One other point for learning: The HELLO command supports auth as an argument, it need not be 2 commands in the RESP3 path :)

Edit: one more item for learning here: on the API front we already have protocol settings for SSL/TLS on the client. For these types of situations if we were to expose a protocol setting for RESP, we'd need to be much more specific with the naming. One thing I've learned is descriptive and even overly-descriptive APIs are so, so worth it. Every time I think of "ConnectionTimeout" on clients (including some in this one) that are an integer without a time unit...ugh :)

@mgravell
Copy link
Collaborator

mgravell commented Mar 5, 2023 via email

@shacharPash
Copy link
Contributor

First off thanks for the pointers @mgravell @NickCraver!

@NickCraver You're right - users of the library don't care about the protocol, nor should they. Looking at this, I can embrace the fall forward / fall back idea based on Redis version. My only concern is that RESP 3 isn't available in Azure, or Redis Enterprise as of 7.0 - even though it is in OSS.

@mgravell Thanks, I hadn't yet looked into pub/sub. I appreciate the reminder - and will keep this in mind when I get there.

@NickCraver
Copy link
Collaborator

My understanding at the moment is Azure does support this for 6+, are you testing and seeing otherwise? Super curious what that comment is based on because I'm seeing conflicting information here (haven't tested directly myself yet).

@chayim
Copy link

chayim commented Mar 6, 2023

@NickCraver Redis Enterprise currently doesn't have RESP3 support. From what I read Azure Cache for Redis cannot currently support RESP3 as a result.

@NickCraver
Copy link
Collaborator

@chayim There are multiple tiers of Azure Cache for Redis, only Enterprise would be affected by that dependency - we've tested against the non-Enterprise and are good to go.

Is there any context on why the Redis-hosted Enterprise flavor doesn't support this RESP3 on v6+? It's odd to me everywhere but the official host would support this and we're having to special case around that...normally you'd expect the exact opposite for adoption :)

@mgravell
Copy link
Collaborator

Work for RESP3 is starting in #2396; being realistic, for all the points raised above, I strongly feel this needs attention mostly by the primary maintainers, as there are a lot of gotchas that need considering

@NickCraver
Copy link
Collaborator

@chayim Is there any chance Redis Enterprise can support the HELLO command? Even if it only supports RESP2? In implementing this and talking through out options, Redis Enterprise being the only >= v6.0.0 endpoint we can't use HELLO on really complicates things. If it could respond that only RESP2 is supported, even that makes life a lot easier for clients. Having to special case it is going to be problematic and likely more connection roundtrips in all client cases. Thoughts?

@mgravell
Copy link
Collaborator

mgravell commented Mar 14, 2023

@chayim FYI for now we've gone for a pragmatic and lazy pipelined version of (in the "needs auth" scenario):

HELLO 3 AUTH {user} {password}
AUTH {user} {password}
CLIENT SETNAME {client}

which appears like it can be reliably pipelined and result in a happy outcome regardless of whether HELLO existed, or was unhappy - without requiring us to massively rework the handshake flow. Currently we deal with handshake replies entirely asynchronously and separately, so it was very awkward to have to stop and evaluate the results of the marginally more efficient version:

HELLO 3 AUTH {user} {password} SETNAME {client}

(I did check whether we could issue HELLO 3 without AUTH in an authenticated server - to follow up with AUTH immediately afterwards, but that is explicitly disallowed)

Likewise, without auth:

HELLO 3
CLIENT SETNAME {client}

vs

HELLO 3 SETNAME {client}

These aren't much different in terms of the underlying bytes, and: works consistently even when HELLO fails.

@chayim
Copy link

chayim commented Mar 15, 2023

@chayim Is there any chance Redis Enterprise can support the HELLO command? Even if it only supports RESP2? In implementing this and talking through out options, Redis Enterprise being the only >= v6.0.0 endpoint we can't use HELLO on really complicates things. If it could respond that only RESP2 is supported, even that makes life a lot easier for clients. Having to special case it is going to be problematic and likely more connection roundtrips in all client cases. Thoughts?

For Redis (Enterprise) 7.2 it will have to. But I haven't seen HELLO support on the roadmap for other supported versions. @uglide am I wrong?

@chayim
Copy link

chayim commented Apr 16, 2023

@mgravell I don't know if it's useful, but the work @shacharPash did while poking around is something he can share if you'd like. Is there anything else I might be able to share?

@mgravell
Copy link
Collaborator

@chayim we have a branch with the core protocol bits working; needs some attention on pub/sub re multi-connection topics, but: I think we know where we're going with it

@chayim
Copy link

chayim commented May 1, 2023

Solid. If you'd like, I have some dockers built (or a compose) that provide multiple environment types, including redis-servers running only in RESP3 mode. redis, requires a HELLO 3 to enter that mode, but for various testing scenarios, I found it useful. Here's the repo in case it's useful.

Also if there are other environments that would be useful, happy to make them available in the same fashion.

@NickCraver
Copy link
Collaborator

Closing this out, since we're merged in with #2396 :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants