
JS side code expensive for HTTP server #3546

Closed
kevinkassimo opened this issue Dec 25, 2019 · 4 comments
Labels
perf performance related

Comments

@kevinkassimo
Contributor

kevinkassimo commented Dec 25, 2019

Using flamegraph, deno -A std/http/http_bench.ts yields the following graph:

[flamegraph screenshot: deno_http.svg]

Some discoveries (AsyncFunctionAwaitResolveClosure corresponds to our main for await loop):

  • A lot of time is still spent on TextEncoder.encode (center, 0x28236c3d6c48)

    • For our benchmark, related only to response header serialization (since we reuse body)
    • Creating a lot of small arrays, triggers frequent GC
    • Occasionally (not always) a nontrivial chunk of ArrayPrototypeReverse shows up. This happens in Stream in text_encoding.ts, which I don't think is necessary (along with the slicing); in many cases we never push new data to it
    • To match Node behavior, we added a Date header that is re-sampled every time a request comes in. This is not necessary: instead we can sample it at a fixed interval. However, this requires a timer that can be unref'd (Node uses an unref'd timer for this).
      • Unref-able ops are quite useful (as was also demonstrated in a previous attempt to bring signal handlers to Deno)
  • Uint8Array.subarray(...) / Builtins_TypedArrayPrototypeSubArray is quite expensive, and a lot of nontrivial chunks can be found spread across the graph (see Use %TypedArray%.prototype.subarray for Buffer.prototype.slice nodejs/node#17431)

  • Object.assign() on very common objects (especially in createResolvable(...)) is expensive.

  • Async generators themselves also add quite some overhead

  • On the rightmost part of the whole graph there is a large chunk corresponding to freeing ArrayBuffers. We might be over-allocating new TypedArrays, and in some internal places we could just cache and reuse them
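The Date-header point above can be sketched as follows. This is a hypothetical illustration, not Deno's actual code; a lazily refreshed cache like this one avoids needing an unref'd interval timer at all.

```typescript
// Hypothetical sketch (not Deno's actual code): cache the serialized Date
// header and refresh it at most once per second, instead of calling
// toUTCString() on every incoming request.
let cachedDateHeader = "";
let cachedAt = -Infinity;

function dateHeader(now: number = Date.now()): string {
  if (now - cachedAt >= 1000) {
    // Stale (or never sampled): re-serialize once, then reuse for ~1s.
    cachedDateHeader = new Date(now).toUTCString();
    cachedAt = now;
  }
  return cachedDateHeader;
}
```

The trade-off is that the header can lag real time by up to the refresh window, which is acceptable for an HTTP Date header.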

@kevinkassimo kevinkassimo changed the title JS side code too expensive for HTTP server JS side code expensive for HTTP server Dec 25, 2019
@bartlomieju
Member

Ref #2758

@bartlomieju
Member

Just FYI, flamegraphs look very alike when you run deno_core_http_bench; the same hot spots show up (Uint8Array.subarray, Object.assign in createResolvable, async generator overhead, and ArrayBuffer freeing).

I'd advise taking a look at the shared queue, which uses a lot of subarray(), as well as the dispatches (json and minimal), which convert typed arrays to objects.
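A minimal sketch of the pattern worth auditing there (the names and record layout are illustrative, not the real shared-queue format): read record fields through one Uint32Array view created once over the shared buffer, rather than allocating a fresh subarray per dispatched message.

```typescript
// Illustrative sketch only; this is not the actual shared-queue layout.
// Each record is two u32 fields (opId, length) read through a single reused
// Uint32Array view, so no per-record subarray views are allocated.
const shared = new Uint8Array(256);              // stand-in for the shared queue
const shared32 = new Uint32Array(shared.buffer); // one view, created once

function writeRecord(index: number, opId: number, len: number): void {
  shared32[index * 2] = opId;
  shared32[index * 2 + 1] = len;
}

function readRecord(index: number): { opId: number; len: number } {
  // Only the small result object is allocated here, not a typed-array view.
  return { opId: shared32[index * 2], len: shared32[index * 2 + 1] };
}
```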

@bartlomieju bartlomieju added the perf performance related label Dec 30, 2019
@kitsonk
Contributor

kitsonk commented Jan 2, 2020

@kevinkassimo we did a major decoding improvement in #3204, but we didn't touch encoding. Feels like we should certainly do that as well. #3204 increased performance to a point where JS clearly isn't the bottleneck.
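One way encoding could be tightened, sketched under the assumption that TextEncoder.encodeInto is available (it had just landed in V8 around this time; names here are illustrative): reuse a single encoder and scratch buffer for header serialization instead of letting encode() allocate a fresh Uint8Array per response.

```typescript
// Hypothetical sketch: reuse one encoder and one scratch buffer for header
// serialization, avoiding the per-response allocations (and GC pressure)
// that calling TextEncoder.encode() on every request causes.
const headerEncoder = new TextEncoder();
const headerScratch = new Uint8Array(4096);

function encodeHeaders(headers: string): Uint8Array {
  const { written } = headerEncoder.encodeInto(headers, headerScratch);
  // The returned view aliases the scratch buffer; the caller must write it
  // out before the next call overwrites it.
  return headerScratch.subarray(0, written ?? 0);
}
```

The aliasing constraint is the design trade-off: the write must complete before the next response is serialized, which holds if responses are serialized sequentially per connection.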

@bartlomieju
Member

Since this issue was opened we've introduced a lot of performance improvements, including:

  • single threaded isolates and Tokio runtime
  • optimize read/write operations
  • optimize encoding/decoding

That makes the presented flamegraph outdated, so I'm closing this issue.
(There's definitely more stuff to optimize, but ultimately we'll use a Rust HTTP server that is bound to JS.)
