Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Event Hubs] Review the use of default DataTransformer #1742

Closed
ramya-rao-a opened this issue Mar 22, 2019 · 7 comments
Closed

[Event Hubs] Review the use of default DataTransformer #1742

ramya-rao-a opened this issue Mar 22, 2019 · 7 comments
Assignees
Labels
Client This issue points to a problem in the data-plane of the library. Event Hubs

Comments

@ramya-rao-a
Copy link
Contributor

ramya-rao-a commented Mar 22, 2019

From the notes in #1481

We need to review how the default DataTransformer behaves and whether received message body is in expected format, when for the various cases where the message body is

  • a primitive type: string, number, boolean, null, undefined (this works)
  • a Buffer or AMQP typed objects (sent from other language libraries)
    • we receive the buffer, toString() it and then JSON.parse() it, so what was sent is definitely not what is received

For more notes, see #1481

@ramya-rao-a ramya-rao-a added Client This issue points to a problem in the data-plane of the library. Service Bus labels Mar 22, 2019
@mikeharder
Copy link
Member

An interesting case I found is Buffer vs Uint8Array. Buffer is sent as a byte array, while Uint8Array is sent as a (much larger) JSON string. I think it would be best if the API handled these the same way.

// Sent as byte[]
await sender.send({ bodyBuffer.alloc(1024) });

// Sent as JSON string
await sender.send({ bodynew Uint8Array(1024) });

@ramya-rao-a ramya-rao-a changed the title [Service Bus] Review the use of default DataTransformer [Event Hubs] Review the use of default DataTransformer May 3, 2019
@ramya-rao-a
Copy link
Contributor Author

Including this work to be done as part of the Event Hubs work we are planning for this month.
The changes would be done in amqp-common which is shared between Service Bus and Event Hubs

@ramya-rao-a
Copy link
Contributor Author

Review conclusions:

  • All primitive types (string, number, boolean, null, undefined), json objects, arrays and buffers get encoded and decoded properly. This is thanks to JSON.stringify() and JSON.parse()
  • Any AMQP typed objects sent by sdks in other languages are decoded properly only if the stringified buffer can be JSON parsed. TODO: Test all AMQP types
  • For message bodies that fail the JSON.stringify() function during encoding, the recommendation to users is to use Buffers. TODO: Update the error message with this recommendation
  • For message bodies that fail the JSON.parse() function during decoding, the body is returned as is. This is our catch all.

Regarding @mikeharder's point on Uint8Array:
@bterlson Are there any other buffer representations we should cater to?

@ramya-rao-a
Copy link
Contributor Author

From #7253, we have the problem of message body getting wrapped in quotes when the body is of type string which needs to be fixed as well

@richardpark-msft
Copy link
Member

We also need to consider for Service Bus:

  • We're doing track 2 at the moment, so we have the option of picking a different default behavior (looks like we'd have to change the default behavior since we decided to do string -> JSON here)
    • Another option is to create a new encoder/decoder. For EH it won't be the default but we could consider doing that for SB.
  • @ramya-rao-a - do you know what the end result was for @mikeharder 's question above? Can users bypass the encoder by just passing some form of array? There is always that workaround but from customer issues it appears that's not intuitive (at least for strings).
  • Need to make sure our cross-language story makes sense. If we go with "strings -> bytes" it's probably fine. If we go with anything more complicated interop is going to be annoying.

@chradek
Copy link
Contributor

chradek commented Oct 5, 2021

We received a request to support disabling the automatic JSON parsing on event bodies we do when receiving events. While we won't attempt to JSON parse an event that was sent as a byte array, it's not always possible for the same team to control how events are sent to Event Hubs. In high throughput cases, this can cause an unnecessary increase in CPU usage.

At the very least, we should allow disabling the JSON parsing - potentially with a flag on EventHubConsumerClientOptions or SubscriptionOptions. Would love to consider allowing users to specify their serialization/deserialization strategy as well.

Copy link

Hi @ramya-rao-a, we deeply appreciate your input into this project. Regrettably, this issue has remained inactive for over 2 years, leading us to the decision to close it. We've implemented this policy to maintain the relevance of our issue queue and facilitate easier navigation for new contributors. If you still believe this topic requires attention, please feel free to create a new issue, referencing this one. Thank you for your understanding and ongoing support.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Mar 25, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Mar 25, 2024
@xirzec xirzec removed this from the Backlog milestone May 7, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Client This issue points to a problem in the data-plane of the library. Event Hubs
Projects
Status: Done
Development

No branches or pull requests

6 participants