Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TextDecoderStream leaks a native decoder resource if its stream errors #13142

Closed
h4l opened this issue Dec 19, 2021 · 6 comments · Fixed by #21074
Closed

TextDecoderStream leaks a native decoder resource if its stream errors #13142

h4l opened this issue Dec 19, 2021 · 6 comments · Fixed by #21074
Labels
bug Something isn't working correctly web related to Web APIs

Comments

@h4l
Copy link

h4l commented Dec 19, 2021

I've come across what seems to be a bug in TextDecoderStream which allows it to leak the native decoder used by its TextDecoder.

I've made a test module to demonstrate the issue (repro code is at the bottom of that page, under the output): https://gist.github.com/h4l/0199ab7cc24dd13536e01c5ea98b3ae7
The 3 tests trigger the test runner's resource leak detection (a really nice feature!).

What seems to be happening is:

  • TextDecoderStream uses a TextDecoder to decode its chunks
  • TextDecoder.decode() creates a native decoder resource, which it holds open when used in streaming mode. It closes the resource when a non-streaming decode() call is made.
  • TextDecoderStream makes streaming decode() calls in its transform() method, and makes a final non-streaming decode() call in its flush() method.
  • When a stream pipeline is errored, the flush() method of any Transformer in a TransformStream is not called, so in the case of TextDecoderStream it has no way to know its no longer in use, and keeps open its decoder.

I was looking through the streams spec when I encountered the leak (before I worked out the cause) to try to work out if I was misusing the streams in some way. It seems to me like an oversight in the spec that Transformers have no way to be told to close/clean up when a stream doesn't close cleanly. flush() is only called when the stream closes normally if I've not missed something, and there are no other lifecycle methods available for Transformers.

I can't see any idiomatic way to tell the Transformer to close, but one approach could be to wrap the readable and writable streams of the TransformerStream to watch for close/cancel/abort calls.

I've not made any previous contributions, but I'd be happy to help with a PR to fix this if it'd be useful.

@kitsonk kitsonk added bug Something isn't working correctly web related to Web APIs labels Dec 19, 2021
@h4l
Copy link
Author

h4l commented Jan 23, 2022

I played around with the Streams API a bit and came up with a fairly straightforward way to implement a TransformStream whose Transformer gets notified of stream aborts. Basically two parts:

Would you be interested in a PR for TextDecoderStream that fixes this issue by using a cut down/minimal version of the above to close the TextDecoder on aborts?

Or alternatively, it seems reasonable to me to suggest/request a change to the WHATWG Streams spec to give the controller of a TransformStream a property exposing an AbortSignal which could be used to handle stream aborts. WritableStreamDefaultController provides this already, but TransformStreamDefaultController doesn't. (And or to give Transformer objects an abort() method, like a WritableStream underlying sinks have.)

What do you think?

@crowlKats
Copy link
Member

crowlKats commented Jan 23, 2022

This seems to me like something that should be changed in the specification. It does seem to me like a good addition. My bad, flush should take care of that actually already

@h4l
Copy link
Author

h4l commented Jan 23, 2022

As I understand it, flush is only called when the stream closes normally, it's not called when the stream aborts: https://streams.spec.whatwg.org/#transform-stream-error
Or do you mean you think the spec should change to call flush when the stream aborts?

@h4l
Copy link
Author

h4l commented Jan 24, 2022

I'll bring it up with the Streams spec team.

h4l added a commit to h4l/issue_13142_text_decoder_stream that referenced this issue Jan 30, 2022
A TextDecoderStream that works around the Deno bug:
  denoland/deno#13142
@nkronlage
Copy link
Contributor

In case this helps anyone hitting this issue when breaking out of async iteration, I was able to work around it with a custom decoder transform stream:

  const MyTextDecoderStream = () => {                                                
    const textDecoder = new TextDecoder();                                           
    return new TransformStream({                                                     
      transform(chunk : Uint8Array, controller: TransformStreamDefaultController) {  
        controller.enqueue(textDecoder.decode(chunk));                               
      },                                                                                                                          
      flush(controller: TransformStreamDefaultController) {                          
        controller.enqueue(textDecoder.decode());                                    
      }                                                                              
    });                                                                              
  }; 

@nounder
Copy link

nounder commented May 10, 2023

Alternative solution with manual encoding from duplicated #19074

Deno.test("working alternative", async () => {
  const res = await fetch(
    "https://deno.land/std@0.186.0/json/testdata/test.jsonl"
  )

  const textDecoder = new TextDecoder()
  const reader = res.body!.getReader()
  const b = await reader.read()
  const t = await textDecoder.decode(b.value!)

  await reader.cancel()
})

mmastrac pushed a commit that referenced this issue Nov 12, 2023
…llation (#21074)

This PR uses the new `cancel` method of `TransformStream` to properly
clean up the internal `TextDecoder` used in `TextDecoderStream` if the
stream is cancelled.

Fixes #13142

Co-authored-by: Bartek Iwańczuk <biwanczuk@gmail.com>
kt3k pushed a commit that referenced this issue Nov 17, 2023
…llation (#21074)

This PR uses the new `cancel` method of `TransformStream` to properly
clean up the internal `TextDecoder` used in `TextDecoderStream` if the
stream is cancelled.

Fixes #13142

Co-authored-by: Bartek Iwańczuk <biwanczuk@gmail.com>
zifeo pushed a commit to metatypedev/deno that referenced this issue Nov 22, 2023
…llation (denoland#21074)

This PR uses the new `cancel` method of `TransformStream` to properly
clean up the internal `TextDecoder` used in `TextDecoderStream` if the
stream is cancelled.

Fixes denoland#13142

Co-authored-by: Bartek Iwańczuk <biwanczuk@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working correctly web related to Web APIs
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants