
Protobuf seems like a lot of overhead for this use case? #269

Closed

kentonv opened this issue Jun 19, 2018 · 24 comments

@kentonv

kentonv commented Jun 19, 2018

Hello. I'm the guy who wrote Protobuf v2 and open sourced it at Google. (I also, just in the last year, built Cloudflare Workers, a non-Node JavaScript server runtime embedding V8, with a focus on secure sandboxing and high multi-tenancy.)

I was surprised by the choice of Protobuf for intra-process communications within Deno. Protobuf's backwards compatibility guarantees and compact wire representation offer no benefit here, while the serialize/parse round-trip on every I/O seems like it would be pretty expensive.

I suppose the main motivation for using Protobuf here is to get convenient code for transmitting data that is compatible across languages?

Have you considered Cap'n Proto for this? I created Cap'n Proto shortly after leaving Google. It works a lot like Protobuf, except that the generated getters and setters operate directly on a backing byte array (e.g. ArrayBuffer in JavaScript), using a C-struct-like layout. Since you can directly share that byte array between code in different languages, you can construct a message in one language and then consume it in another without even making a copy in between, much less a serialize/parse round trip. Communication between a sandbox and a supervisor was in fact a motivating use case for Cap'n Proto.

Cap'n Proto is well-supported in C++, Rust, and Go. There is also a mostly-complete TypeScript implementation. Admittedly the TypeScript implementation has not received a whole lot of real-world usage, but if there's interest, I may be willing to directly adopt and support it for you. (I expect to need it in my own work soon anyway.)

@ry
Member

ry commented Jun 19, 2018

@kentonv Thank you for pointing this out! To be honest, I was under the impression that protobuf did operate on the underlying ArrayBuffer. Are there any recent benchmarks comparing PB to Cap'n Proto in JS? I will investigate Cap'n Proto. While you're here, do you have any opinion on FIDL?
(For streaming data, we intend to send raw ArrayBuffers back and forth, avoiding serialization entirely. But better performance is always welcome...)

@MasterJames

Maybe a dumb question, but shouldn't SSL handle compression sufficiently in a modern world? The extra cycles of encoding, decoding, uglifying, etc. hardly seem worth it when SSL and HTTP will try to compress the data again anyway; compressing something that's already compressed is redundant, or even counterproductive.
With streaming you want to break everything into pieces in the end anyway, maybe even at the file level, but using an open socket also seems like a balance between efficiency (using the existing transport machinery) and avoiding redundancy.

@ry
Member

ry commented Jun 19, 2018

@MasterJames This is regarding internal communication between V8 and privileged controlling code - there's no SSL.

@whtiehack

@ry A bit of a digression: are you sure you want to develop Deno in C++?

@kentonv
Author

kentonv commented Jun 19, 2018

@ry With Protobuf you generally construct an object tree representing a message first, then you serialize it into an ArrayBuffer all at once, then you parse the buffer on the other side into another object tree, and then you consume that object tree. With Cap'n Proto, you directly construct the message in the ArrayBuffer. As in, when you call message.setFoo(123), it actually does something like uint32Array[offset] = 123 right then and there. Similarly, on the other end, you use the message directly from the backing buffer.
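
To make that concrete, here is a minimal sketch of the idea (invented class and field names, not actual capnp-ts generated code):

// Illustrative sketch only: a Cap'n Proto-style accessor stores
// straight into the backing buffer, so there is no separate object
// tree and no later serialize step.
class CapMsgBuilder {
  private view: DataView;

  constructor(buffer: ArrayBuffer, private offset = 0) {
    this.view = new DataView(buffer);
  }

  setFoo(value: number): void {
    // Equivalent to uint32Array[offset] = 123, right then and there.
    this.view.setUint32(this.offset, value, true); // little-endian
  }
}

const backing = new ArrayBuffer(64);
new CapMsgBuilder(backing).setFoo(123);
// `backing` is now the message; a reader's getFoo() is a single
// DataView load from the same bytes.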

More importantly, though, the encoding does not try to do any variable-width encoding like Protobuf does. For example, Protobuf encodes integers in a variable-width format so that small integers encode to one byte. This is great when sending messages over a network but is a waste when communicating within the same address space.
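
For reference, a simplified sketch of Protobuf-style varint encoding for unsigned values, which is where those per-byte savings (and per-byte branching) come from:

// Simplified sketch: 7 bits of payload per byte, with the high bit
// set on every byte except the last.
function encodeVarint(value: number): Uint8Array {
  const out: number[] = [];
  do {
    let byte = value & 0x7f;
    value >>>= 7;
    if (value !== 0) byte |= 0x80; // continuation bit
    out.push(byte);
  } while (value !== 0);
  return Uint8Array.from(out);
}

encodeVarint(1);   // [0x01]       -- small ints fit in one byte
encodeVarint(300); // [0xac, 0x02] -- but decoding must loop and branch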

Regarding benchmarks -- I've spent a lot of time benchmarking serialization systems, and unfortunately, my conclusion is that benchmark results are almost always meaningless. Different serialization schemes will tend to win on different kinds of input data due to their differing design trade-offs. Also, caching effects and working set size tend to be a huge factor in performance, which is near-impossible to capture in micro-benchmarks. To have a really meaningful benchmark, someone needs to write two versions of a real-world application using two different serialization systems and compare them... but that's an enormous amount of work that almost no one ever does. Cap'n Proto, for its part, tends to show up in microbenchmarks as "infinity times faster" than Protobuf because serialization benchmarks tend to measure specifically the parse/serialize step, which Cap'n Proto doesn't have -- but of course that's not really a meaningful result either.

@whtiehack Cap'n Proto does not require C++.

@MasterJames

Right, of course; I somehow knew that. Sorry, just tuning in. I'll try to keep quiet.

@matiasinsaurralde
Contributor

I've started to learn about Cap'n Proto after reading these comments, very interesting!
I couldn't continue my experiment because of some issues with capnp-ts; capnproto-js isn't an alternative, as it's pretty much tied to C++/V8 code.
A simplified "translation" of our current schema could look like this:

@0xac5d44cb6d8d6291;

struct CapMsg {
    channel @0 :Channel;
    command @1 :Command;

    # START
    startCmdCwd @2 :Text;
    startCmdArgv @3 :List(Text);
    startCmdDebugFlag @4 :Bool;
    startCmdMainJs @5 :Text;
    startCmdMainMap  @6 :Text;

    # CODE_FETCH
    codeFetchModuleSpecifier @7 :Text;
    codeFetchContainingFile @8 :Text;

    # CODE_FETCH_RES
    codeFetchResModuleName @9 :Text;
    codeFetchResFilename @10 :Text;
    codeFetchResSourceCode @11 :Text;
    codeFetchResOutputCode @12 :Text;

    # READ_FILE_SYNC
    readFileSyncFilename @13 :Text;
    readFileSyncData @14 :Data;

    enum Channel {
        start @0;
        os @1;
    }

    enum Command {
        startCmd @0;
        codeFetch @1;
        codeFetchRes @2;
        readFileSync @3;
        readFileSyncRes @4;
    }
}

@kentonv
Author

kentonv commented Jun 20, 2018

@matiasinsaurralde I recommend using a union here:

struct CapMsg {
    channel @0 :Channel;
    
    union {
        startCmd @1 :StartCmd;
        codeFetch @2 :CodeFetch;
        codeFetchRes @3 :CodeFetchRes;
        readFileSync @4 :ReadFileSync;
        readFileSyncRes @5 :ReadFileSyncRes;
    }
    
    struct StartCmd {
        cwd @0 :Text;
        argv @1 :List(Text);
        debugFlag @2 :Bool;
        mainJs @3 :Text;
        mainMap @4 :Text;
    }

    struct CodeFetch {
        moduleSpecifier @0 :Text;
        containingFile @1 :Text;
    }
    
    # ...
}
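
Conceptually, the union gives each command its own namespace rather than one flat, prefixed field list. In TypeScript terms it is roughly a discriminated union (an analogy only, not what capnp-ts actually generates):

// Analogy only: the schema above, viewed as a TypeScript
// discriminated union. Each variant carries just its own fields.
declare function fetchModule(spec: string, from: string): void; // hypothetical handler

type CapMsg =
  | { which: "startCmd"; cwd: string; argv: string[]; debugFlag: boolean;
      mainJs: string; mainMap: string }
  | { which: "codeFetch"; moduleSpecifier: string; containingFile: string }
  | { which: "readFileSync"; filename: string };

function handle(msg: CapMsg): void {
  switch (msg.which) {
    case "codeFetch":
      // Only codeFetch's fields are in scope here.
      fetchModule(msg.moduleSpecifier, msg.containingFile);
      break;
    // ...
  }
}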

@benjamingr
Contributor

Hey, at my previous job I switched a project from Protobuf to Cap'n Proto and saw a big performance boost; that said, it was a very serialisation-heavy project where the extra copy was proving to be the bottleneck.

I was under the impression that protobuf was used in deno explicitly to create a copy, but if that is not required, I definitely think Cap'n Proto is worth considering. Other alternatives I looked at at the time (like FlatBuffers) are also worth a look.

@kentonv
Author

kentonv commented Jun 21, 2018

I was under the impression that protobuf was used in deno to explicitly create a copy

If a copy is needed for security, adding in a memcpy() will still be dramatically faster than serializing/parsing a protobuf.

A copy might be needed to avoid TOCTOU bugs if JavaScript could be manipulating the buffer concurrently in another thread. If JS is blocked while the message is processed, or if it can be processed carefully to avoid TOCTOU, then you can avoid the copy.
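
To illustrate the hazard: if sandboxed JS on another thread can rewrite the buffer between a validity check and the later use of the checked value, the check is worthless. A minimal sketch of the copy-first mitigation (the length-prefix framing here is invented, purely for illustration):

// Sketch: avoid TOCTOU by copying the untrusted message into private
// memory once, then validating and using only the stable copy.
function readMessageSafely(shared: Uint8Array): Uint8Array {
  // After this line, the sandbox can no longer affect what we read.
  const privateCopy = shared.slice();

  const view = new DataView(privateCopy.buffer);
  const len = view.getUint32(0, true); // hypothetical length prefix
  if (len > privateCopy.byteLength - 4) {
    throw new Error("bad length"); // validated against stable data
  }
  return privateCopy.subarray(4, 4 + len);
}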

@benjamingr
Contributor

If a copy is needed for security, adding in a memcpy() will still be dramatically faster than serializing/parsing a protobuf.

Would love to see benchmarks on that @kentonv :)

@qti3e
Contributor

qti3e commented Jun 21, 2018

@kentonv In the initial days of deno, we were using a different struct for each message, but it was adding ~20kb per message to the final TypeScript declaration file (see d765300). What happens here if we try to use a union?

@kentonv
Author

kentonv commented Jun 21, 2018

@benjamingr You don't need a benchmark to prove that memcpy() is faster than any copying parser/serializer.

@qti3e I'm not sure exactly why Protobuf oneof would lead to code bloat, but I suspect you'll find that Cap'n Proto's generated code is much smaller than Protobuf's in general. Though, admittedly, I haven't tested the TypeScript implementation in particular, so this is something that should be verified.

@benjamingr
Contributor

You don't need a benchmark to prove that memcpy() is faster than any copying parser/serializer.

Yeah, but you would need one to show the impact of it on a similar app.

@jorangreef

Cap'n Proto is awesome. Great to see @kentonv involved here, recommending the switch himself!

@patlecat

Cap'n Proto was exciting when it came out, but it hasn't seen any updates in years; the dev has shifted his focus to his own products instead.

I would recommend Cereal instead: https://uscilab.github.io/cereal/index.html

@kentonv
Author

kentonv commented Jun 27, 2018

@patlecat Hi. I'm the dev of Cap'n Proto. Cap'n Proto is actively developed. Cereal is C++-only, which makes it not very useful here.

@matiasinsaurralde
Contributor

@kentonv I'm preparing a simplified project that will help you reproduce the issue I'm getting with capnp-ts; I'll share the link later this week.
Thanks!

@matiasinsaurralde
Contributor

matiasinsaurralde commented Jul 1, 2018

@kentonv I've been preparing a test project that uses Rust and capnp-ts (the TS version) but couldn't reproduce the issue again, so either: a) the TS version behaves differently (unlikely), or b) something was wrong in my initial implementation (very likely 😄)
Will ping you back in case I find anything, thanks.

@matiasinsaurralde
Contributor

I've been trying Capnp on my fork. These are some numbers (building in release mode) doing 5000 readFileSync calls, with no source code cache, so the runtime takes some time to start:

Using Capnp:

% time ./target/release/reno hello.ts 
elapsed =  325

real	0m6.714s
user	0m7.488s
sys	0m0.784s

Using PB:

% time ./target/release/reno hello.ts 
elapsed =  128

real	0m1.489s
user	0m1.897s
sys	0m0.219s

capnp-ts might not be as optimized as the pbjs we use; same for the Rust library (capnp)?

@kentonv
Author

kentonv commented Jul 1, 2018

@matiasinsaurralde It looks like you're using "packed" encoding. You shouldn't -- "packing" is basically a lightweight compression scheme tailored for Cap'n Proto, intended to make messages smaller on the wire at the cost of some CPU time, much like how Protobuf uses variable-width integer encoding. For intra-process communication, this wastes time (especially on the TS side, since JavaScript is pretty bad at byte twiddling). Try switching to regular encoding (remove "packed").
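
For context, packing works per 8-byte word: a tag byte records which bytes of the word are nonzero, and only those bytes are emitted. A simplified sketch of the encoder (omitting the format's special run-length handling for 0x00 and 0xff tags):

// Simplified sketch of Cap'n Proto packing. Bit i of the tag byte
// means "byte i of this word is nonzero and follows".
function packWords(words: Uint8Array): Uint8Array {
  const out: number[] = [];
  for (let w = 0; w + 8 <= words.length; w += 8) {
    let tag = 0;
    const nonzero: number[] = [];
    for (let i = 0; i < 8; i++) {
      const b = words[w + i];
      if (b !== 0) {
        tag |= 1 << i;
        nonzero.push(b);
      }
    }
    out.push(tag, ...nonzero);
  }
  return Uint8Array.from(out);
}
// Zero-heavy messages shrink a lot, but every byte costs a branch to
// unpack -- cheap in C++, comparatively expensive in JavaScript.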

Note that for the very best case, you'd want to go a little further. A Cap'n Proto message is normally composed of multiple "segments", each of which has a backing ArrayBuffer. When you call toArrayBuffer(), there's a step that concatenates them all and adds an index. Ideally you'd pass the underlying array-of-ArrayBuffers instead, then load them on the receiving end, avoiding any copies at all.
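
In other words, the hand-off would look roughly like this (hypothetical helper names; this is the shape of the idea, not the real capnp-ts API):

// Hypothetical sketch of the zero-copy hand-off described above.
// getSegments/loadFromSegments are invented names, not capnp-ts API.
declare function getSegments(msg: object): ArrayBuffer[];
declare function loadFromSegments(segments: ArrayBuffer[]): object;

function send(msg: object, post: (segs: ArrayBuffer[]) => void): void {
  // Pass the backing segments directly: no concatenation pass, no
  // framing index, no copy.
  post(getSegments(msg));
}

function receive(segments: ArrayBuffer[]): object {
  return loadFromSegments(segments);
}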

Let me know how it goes. I'm travelling for the next week but will try to peek at my e-mail from time to time.

@matiasinsaurralde
Contributor

@kentonv Thanks, I'm currently trying out the non-packed encoding (it works faster). I'm also reading more about how the segmentation works; it seems the Rust library offers some options for this.

@ry
Member

ry commented Jul 12, 2018

I evaluated a couple of options with similar goals: Cap'n Proto, FIDL, and FlatBuffers. Our requirements were that it support C++, TypeScript, and Rust, and build with GN.

The initial work of replacing Protobufs with Flatbuffers is complete.

We consider this choice tentative. @piscisaureus and I have an uneasy feeling about some of FlatBuffers' design decisions, particularly the vtable (I would prefer a fixed-size table, as Cap'n Proto uses) and how FlatBuffers starts encoding at the end of the buffer (@piscisaureus described this to me; hopefully I'm relaying it correctly).

@ry ry closed this as completed Jul 12, 2018
@rw

rw commented Jul 13, 2018

@ry FlatBuffers writes back-to-front so that reads can be front-to-back (children are written before their parents so that parents can be read before children).
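
A toy illustration of that layout (not the real FlatBuffers wire format): the writer fills the buffer from the end, so each parent already knows its children's final offsets, and the reader starts at the front and only ever jumps forward:

// Toy sketch of back-to-front writing. Children are written first,
// toward the end of the buffer, so parents can embed final offsets.
class BackwardBuilder {
  private buf = new Uint8Array(64);
  private cursor = this.buf.length; // grows downward

  write(bytes: number[]): number {
    this.cursor -= bytes.length;
    this.buf.set(bytes, this.cursor);
    return this.cursor; // the offset is final as soon as it's written
  }

  finish(rootOffset: number): Uint8Array {
    // The root's offset goes at the front, so a reader starts there
    // and walks forward through parents to children.
    new DataView(this.buf.buffer).setUint32(0, rootOffset, true);
    return this.buf;
  }
}

const b = new BackwardBuilder();
const child = b.write([1, 2, 3, 4]);      // child written first
const parent = b.write([child, 0, 0, 0]); // parent stores child's offset
const wire = b.finish(parent);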

piscisaureus pushed a commit to piscisaureus/deno that referenced this issue Oct 7, 2019
hardfist pushed a commit to hardfist/deno that referenced this issue Aug 7, 2024