-
Notifications
You must be signed in to change notification settings - Fork 796
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stateless prepared statements wrap DoPutPreparedStatementResult
with Any
which is differs from Go implementation
#5731
Comments
I think we should follow the Go implementation as they are likely to have a better understanding of what is correct in protobuf Any clients can wrap the ticket in Any if they want. |
I've replied to the arrow repo issue here apache/arrow-go#34 . It's not clear to me yet whether this needs to be fixed in the Rust client or the Go client, as it seems other implementations are also using the |
What other implementations are using the Any wrapper? Is it java? |
C++ Also, as I mentioned in my comment, it seems to be an established convention to |
In my mind the only thing that matters is conformance to the spec, regardless of language convention. If the spec is ambiguous on this point then its not surprising we ended up here (I have not had a chance to review the spec so im not sure). If it's not ambiguous then my preferred approach would be to have all implementations update to conform with the spec, and then the spec can be revisited / updated if needed after that. If it is ambiguous on this point then I think the community will need to align on the appropriate next steps (formalizing conventions / updating spec / etc.) |
Following up on this - I have reviewed the Flight and FlightSQL spec and from what I can tell it is prescribed when
Of course let me know if I am mistaken here but based on this my preference would be to update the Rust implementation by removing the |
I can put together a PR if there is agreement on this. |
hi @alamb @erratic-pattern do you have thoughts on the above? |
I believe @erratic-pattern has some thoughts he should be sharing shortly |
I think this issue arises via a combination of factors:
My only preference is toward documenting what goes in The problem with the current consensus of removing the Even if we go and explicitly document each and every message type with its unambiguous This is a "path of least resistance" approach to making the minimal possible spec change, but ultimately makes the specification messy and thus harder to implement and maintain in the long term. So my recommendation would be to:
|
An additional remark on the 1st point above (Have a clear rationale for or against My preference would be no Depending on what clients currently do, this could be a much larger and more breaking change than just changing one particular message type. But, for example, if only one client is the odd one out in terms of |
@erratic-pattern thank you for the feedback. i understand and agree with your point - documenting explicitly what goes in That being said, I think we need to distinguish what should be done now to get the different implementations in a state where they are compatible with each other (or explicitly deciding we are okay with a transitory state of being incompatible) and what the next steps are to update the spec to have better defined behavior with regards to While we could hold off on making changes until there is agreement on the spec changes I think that leaves us in a quite bad spot as the current implementations are not compatible. My preference is to interpret the spec literally (i.e. The alternate route of course being that we leave things as they are in the current state, which is incompatible, and then work with the community to determine what changes are needed before making any updates. It's not clear to me that this could be done in a short period of time which is why my preference is to get conformance first and then update spec. But, we probably need opinions from other implementations as well - maybe its doable. CC @lidavidm and @zeroshade for their views |
My 2 cents is that type-erased payloads should always be Now I agree it is unfortunate that FlightSQL was designed in this way, gRPC provides a type-safe DSL and the layering of FlightSQL on top of Flight sacrificed many of these benefits, but the least we can do is ensure that users get a sensible runtime error.... This comment from @zeroshade on the linked PR makes me more confident of this
We should double-check what the Java driver does, however, given I know some people are using JDBC against the arrow-rs, I suspect it is also expecting the payload to be Any wrapped. |
Thank you @tustvold for the additional technical context on why The more important part in my mind is that behavior is consistent and not arbitrary per payload type. I don't think this should ever be the case in the specification, temporarily or long term.
I am not sure whether or not we need a stopgap measure such as this. If the change is as simple as updating the Go implementation and clarifying the documentation in the spec, then what you're suggesting would actually be more work than moving forward with the more ideal change. We would need to break the conventions used by all other implementations, add a lot of temporary documentation to the spec, and then remove it all in the "final" iteration. I will spend some time looking across language implementations to see what the predominant convention currently is, and that will give us a better idea the scope of change that needs to be made here.
I think the problem here is that there is ambiguity and absence on what to do here in the spec, which is why we are in a situation where we have divergent implementations. I think enforcing a literal interpretation of the spec, even temporarily, could potentially do more harm than good, especially if there is a predominant implied convention being used across implementations. In an ideal world the spec would be the definitive source of truth, but that is rarely the case. I think it is better to update the spec to document such implied conventions as they are discovered. The purpose of a system is what it does. |
I'm not trying to minimize the amount of work (I don't consider these large scale changes within the arrow libs although i acknowledge it could potentially have larger impacts on users of the libs) - I just want to 1. get us into a workable state where implementations are compatible and 2. get it right (update spec and all implementations to be aligned). If we can quickly get alignment across the ecosystem then great but im concerned were going to end up in a holding pattern where were stuck with incompatible implementations. But I guess we'll see what your analysis and the feedback from others comes out to be. |
I think in the current situation I think that unless we can get alignment across the ecosystem fairly quickly, it might make the most sense for us first to get the Go implementation into alignment with the others using What does everyone else think? |
@zeroshade Im definitely supportive of that - appreciate your flexibility. |
|
@lidavidm I plan on scheduling my team to try to tackle exposing the gRPC innards to the c++ implementation next quarter. |
@lidavidm @zeroshade @erratic-pattern thank you all for the feedback and helping get this to a short term productive outcome despite having to diverge from some prior language specific conventions. likely by tomorrow i should have time to create issues for both updating the go implementation and starting conversation on updating the spec. im not as familiar with the go arrow implementation but if needed i can put together a PR for that. |
Actually there is already issue in the main arrow repo so i dont think we need a new one there. |
@matthewmturner @erratic-pattern @tustvold @alamb @lidavidm Looks like I was slightly wrong. I just dug into the C++ code a bit more to confirm precisely where I needed to make the changes on the Go side to make sure we match and it looks like Rust is the odd-implementation out. Looking at https://github.com/apache/arrow/blob/main/cpp/src/arrow/flight/sql/client.cc#L223, https://github.com/apache/arrow/blob/main/cpp/src/arrow/flight/sql/client.cc#L250, and https://github.com/apache/arrow/blob/main/cpp/src/arrow/flight/sql/client.cc#L358 it appears that the C++ implementation matches the Go implemenation: Expecting that the So it looks like the quick solution would be for Rust to conform to what C++ and Go do, and then we can work on improving the spec and integration testing. |
I would suggest we confirm what the other implementations do, especially Java given the importance of JDBC, along with what the desired end state is, before we make breaking changes to any implementations. We do not want to be in a position where we make a breaking change only to revert it down the line. |
Java hasn't yet implemented it but the PR also did not wrap in Any. |
@lidavidm is correct. Currently the draft Java implementation does not wrap the result in Any. |
Am I correct in summarizing the consensus:
Did I get that right? |
I think that is an accurate description of the implementation state, although I would observe the somewhat arbitrary way some places wrap in Any and some don't is perhaps worth fixing, and if so it would be unfortunate to change the Rust implementation only to then change it back |
My personal opinion is that I value ecosystem compatibility over the short term inconvenience for any given implementation. But I know there isn't a clear right answer here for what to do, there's trade offs with approaches given the current state (migrate and potentially migrate back, or stay in incompatible state for unknown period of time). I just think it sort of defeats the point of having the ecosystem if the pieces aren't compatible. This is coming from someone with a go client and rust server who needs this compatibility and there is certain to be others with single language implementations who could be inconvenienced. Either way I have full trust in the arrow committers regardless of which decision is made so I don't have much more to add at this point. |
I think lets just change the Rust implementation to match the others, whilst I can dream of a FlightSqlV2 that is less of an incoherent mess, and perhaps uses gRPC directly to define the protocol instead of layering opaque payloads on top of flight, I personally don't have the time or inclination to drive such an initiative. As such, let the bodgery commence! 😅 |
I would like the same thing, and also just don't have the time to drive it, but glad we're on the same page :) |
Maybe we should start an issue to list the problems so we remember what we'd like to fix? |
I think the major missed opportunity is the use of opaque bytes payloads instead of defining the RPC signatures using concrete message types. Doing this would:
With the current design we get all the downsides of gRPC and few of the benefits. Creating a ticket to track/acknowledge this I think would be valuable |
Filed apache/arrow#41840 |
I have a PR for the arrow-rs change #5817 . @matthewmturner do you have an example I could use to test the Go client to Rust server compatibility? |
@erratic-pattern thanks for working on that. Our implementations are all internal. I'll need to think / look if theres a way we can easily test without pointing to these branches. |
|
|
Something similar appears to have happened again - #6545 I think we really need to either get integration tests for FlightSQL, or find a way to use gRPC to define the protocol instead of layering on top of opaque bytes payloads... - apache/arrow#41840 |
We do have an integration scenario for ingestion, though. It seems Rust was just never enabled for any of the Flight SQL tests: https://github.com/apache/arrow/blob/45b36976a31a4f363c0c471c2da1abd5bee97fd2/dev/archery/archery/integration/runner.py#L689-L703 |
Filed a barebones ticket to get those enabled - #6546 |
Additionally while I'm not sure it'll get completed anymore...there was an effort to (1) make the integration tests use raw gRPC instead of relying on a Flight client (to make it clear what the Flight implementation was doing under the hood/avoid any custom serializers that might cause issues) (2) simplify the tests down to the bare bones for easier implementation apache/arrow#44115 |
Describe the bug
Originally posted in the arrow repo - we are seeing differences in the Go and Rust implementation of stateless prepared statements. Specifically, the rust implementation wraps the
DoPutPreparedStatementResult
inAny
where the go implementation does not - which makes the two incompatible. From the conversation there it looks like they think the Rust implementation should be updated to no longer useAny
.To Reproduce
Make a stateless prepared statement request with parameter binding from Go FlightSQL Client to Rust Flight SQL Server.
Expected behavior
Compatibility between Rust and Go FlightSQL implementations.
Additional context
The text was updated successfully, but these errors were encountered: