Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AbstractJackson2Decoder doesn't support decoding NDJSON arrays into Flux<List> #32579

Closed
yuflyud opened this issue Apr 5, 2024 · 4 comments
Closed
Assignees
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) type: enhancement A general enhancement
Milestone

Comments

@yuflyud
Copy link

yuflyud commented Apr 5, 2024

Affects: 6.1.5

I'm attempting to implement a Reactive RestController that accepts NDJSON (Newline Delimited JSON) where each row may contain a JSON array. An example of such NDJSON payload is as follows:

[{ "recordType":"Patient", "name":"John Doe"},{ "recordType":"Visit", "state":"complete"}]
[{ "recordType":"Practitioner", "name":"Stephen Smith"}]

The desired behavior is to transform the request body into a Flux<List<Record>>, where each emitted list corresponds to the JSON arrays within each row. However, the current implementation, which is completely expected according to the documentation, only produces a Flux<Record>.

Below is the endpoint code I would like to be supported:

  @PostMapping(value = "/submit", consumes = {MediaType.APPLICATION_NDJSON_VALUE},
      produces = MediaType.APPLICATION_JSON_VALUE)
  public Mono<Void> submit(@RequestBody Flux<List<Record>> rows) {
    return rows.log()
        .then();
  }

Upon investigation, I've identified that the issue lies within the AbstractJackson2Decoder class. Specifically, the tokenizeArrays parameter is set to true in the following line:

Flux<TokenBuffer> tokens = Jackson2Tokenizer.tokenize(processed, mapper.getFactory(), mapper, true,
        forceUseOfBigDecimal, getMaxInMemorySize());

This configuration causes the NDJSON payload with nested arrays to be tokenized in a way that produces Flux<Record> instead of the desired Flux<List<Record>>.
At present, I cannot find a straightforward way to modify this behavior without resorting to copying and altering the entire source code of the AbstractJackson2Decoder class.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Apr 5, 2024
@jhoeller jhoeller added the in: web Issues in web modules (web, webmvc, webflux, websocket) label Apr 5, 2024
@rstoyanchev
Copy link
Contributor

Indeed, Jackson2Tokenizer.tokenize has a boolean flag tokenizeArrays that determines whether to flatten the array or not, and there are tests that parse the array both ways. However, we always pass true from AbstractJackson2Dedcoder and there is no way to change that.

It should be an easy change then, just a matter of how to decide when to set the flag. We could perhaps make it conditional on whether the media type is known as a streaming type. We have similar checks on the encoder side.

@rstoyanchev rstoyanchev added type: enhancement A general enhancement and removed status: waiting-for-triage An issue we've not yet triaged or decided on labels Apr 16, 2024
@rstoyanchev rstoyanchev added this to the 6.2.0-M2 milestone Apr 16, 2024
@rstoyanchev rstoyanchev self-assigned this Apr 16, 2024
@rstoyanchev
Copy link
Contributor

On second though, NDJSON implies a stream, and that should be streamed rather than collected to a List. So I'm not sure we can do anything transparently for your case. At best, we could provide some way to control it through a protected method.

Could you explain a little better your case and why you are collecting NDJSON stream to a list?

@rstoyanchev rstoyanchev removed this from the 6.2.0-M2 milestone Apr 16, 2024
@rstoyanchev rstoyanchev added status: waiting-for-feedback We need additional information before we can continue status: waiting-for-triage An issue we've not yet triaged or decided on and removed type: enhancement A general enhancement labels Apr 16, 2024
@yuflyud
Copy link
Author

yuflyud commented Apr 16, 2024

I'm not aggregating multiple NDJSON lines into a single list. Rather, each line represents a distinct item, which could be structured as a JSON array itself. This maintains the NDJSON format as a continuous stream of data.

In my use case, I need to handle multiple objects within the same NDJSON line as a single transaction. Therefore, I need the framework to automatically group these objects in the same way they are structured within an NDJSON format. For example:

[{"name": "Alice", "age": 25, "city": "San Francisco"}]
[{"name": "John", "age": 30, "city": "New York"},{"name": "Bob", "age": 35, "city": "Chicago"}]
[{"name": "Eve", "age": 28, "city": "Los Angeles"}]

I reviewed the NDJSON specification and did not find any explicit statement that prohibits a line from containing a JSON array instead of a single object. However, I wouldn't be surprised if this is indeed the case. Therefore, my request might deviate from the specification and may be considered invalid.

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Apr 16, 2024
@rstoyanchev
Copy link
Contributor

Okay, I see now. We can decide then whether to set tokenize based on the target type. If the element type is an array or collection, then false (don't flatten), or true otherwise.

@rstoyanchev rstoyanchev added type: enhancement A general enhancement and removed status: waiting-for-triage An issue we've not yet triaged or decided on status: feedback-provided Feedback has been provided labels Apr 17, 2024
@rstoyanchev rstoyanchev added this to the 6.2.0-M2 milestone Apr 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) type: enhancement A general enhancement
Projects
None yet
Development

No branches or pull requests

4 participants