Protobuf seems like a lot of overhead for this use case? #269
Comments
@kentonv Thank you for pointing this out! To be honest, I was under the impression that Protobuf did operate on the underlying ArrayBuffer. Are there any recent benchmarks comparing PB to Cap'n Proto in JS? I will investigate Cap'n Proto. While you're here, do you have any opinion on FIDL?
Maybe a dumb question here, but shouldn't SSL handle compression sufficiently in a modern world? The extra cycles of encoding, decoding, uglifying, etc. may not be worth the exercise, since SSL and HTTP would just try to compress something that's already compressed. Doing it twice seems not worth it, or even counterproductive.
@MasterJames This is regarding internal communication between V8 and privileged controlling code - there's no SSL.
@ry A digression: are you sure you want to develop Deno in C++?
@ry With Protobuf you generally construct an object tree representing a message first, then you serialize it into an ArrayBuffer all at once, then you parse the buffer on the other side into another object tree, and then you consume that object tree. With Cap'n Proto, you directly construct the message in the ArrayBuffer: when you call a setter, the value is written straight into the backing buffer.

More importantly, though, the encoding does not try to do any variable-width encoding like Protobuf does. For example, Protobuf encodes integers in a variable-width format so that small integers encode to one byte. This is great when sending messages over a network but is a waste when communicating within the same address space.

Regarding benchmarks -- I've spent a lot of time benchmarking serialization systems, and unfortunately my conclusion is that benchmark results are almost always meaningless. Different serialization schemes will tend to win on different kinds of input data due to their differing design trade-offs. Also, caching effects and working-set size tend to be a huge factor in performance, which is near-impossible to capture in micro-benchmarks. To have a really meaningful benchmark, someone needs to write two versions of a real-world application using two different serializations and compare them... but that's an enormous amount of work that almost no one ever does. Cap'n Proto, for its part, tends to show up in microbenchmarks as "infinity times faster" than Protobuf because serialization benchmarks tend to measure specifically the parse/serialize step, which Cap'n Proto doesn't have -- but of course that's not really a meaningful result either.

@whtiehack Cap'n Proto does not require C++.
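To illustrate the variable-width integer encoding mentioned above, here is a minimal sketch of Protobuf-style base-128 varints (this is the general wire-format idea, not code from Deno or the protobuf library):

```typescript
// Minimal sketch of Protobuf's base-128 varint encoding: each byte
// carries 7 bits of the value, and the high bit marks "more bytes follow".
function encodeVarint(value: number): number[] {
  const bytes: number[] = [];
  do {
    let byte = value & 0x7f;
    value = Math.floor(value / 128); // shift right 7 bits without 32-bit truncation
    if (value > 0) byte |= 0x80;     // set the continuation bit
    bytes.push(byte);
  } while (value > 0);
  return bytes;
}

// Small integers fit in one byte; larger ones grow.
console.log(encodeVarint(5));   // [5]
console.log(encodeVarint(300)); // [172, 2]  (0xac, 0x02)
```

This size saving is exactly what makes the format attractive on the wire, and exactly what costs extra decode work for intra-process messages where a fixed-width read would do.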
Right, of course - I somehow knew that. Sorry, just tuning in; I'll try to keep quiet.
I've started to learn about Cap'n Proto after reading these comments, very interesting!
@0xac5d44cb6d8d6291;
struct CapMsg {
channel @0 :Channel;
command @1 :Command;
# START
startCmdCwd @2 :Text;
startCmdArgv @3 :List(Text);
startCmdDebugFlag @4 :Bool;
startCmdMainJs @5 :Text;
startCmdMainMap @6 :Text;
# CODE_FETCH
codeFetchModuleSpecifier @7 :Text;
codeFetchContainingFile @8 :Text;
# CODE_FETCH_RES
codeFetchResModuleName @9 :Text;
codeFetchResFilename @10 :Text;
codeFetchResSourceCode @11 :Text;
codeFetchResOutputCode @12 :Text;
# READ_FILE_SYNC
readFileSyncFilename @13 :Text;
readFileSyncData @14 :Data;
enum Channel {
start @0;
os @1;
}
enum Command {
startCmd @0;
codeFetch @1;
codeFetchRes @2;
readFileSync @3;
readFileSyncRes @4;
}
}
@matiasinsaurralde I recommend using a union here:
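For context, here is one hypothetical way the schema above could be reshaped with an unnamed union, as suggested. The group and field names are illustrative only, not from the actual Deno code:

```capnp
@0xac5d44cb6d8d6291;

struct CapMsg {
  channel @0 :Channel;
  union {
    startCmd :group {
      cwd @1 :Text;
      argv @2 :List(Text);
      debugFlag @3 :Bool;
      mainJs @4 :Text;
      mainMap @5 :Text;
    }
    codeFetch :group {
      moduleSpecifier @6 :Text;
      containingFile @7 :Text;
    }
    codeFetchRes :group {
      moduleName @8 :Text;
      filename @9 :Text;
      sourceCode @10 :Text;
      outputCode @11 :Text;
    }
    readFileSync :group {
      filename @12 :Text;
    }
    readFileSyncRes :group {
      data @13 :Data;
    }
  }

  enum Channel {
    start @0;
    os @1;
  }
}
```

With a union, the separate Command enum becomes redundant: the union's discriminant already records which variant is set, so readers just switch on the active member.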
Hey, at my previous job I switched a project from Protobuf to Cap'n Proto and saw a big performance boost - that said, that was a very serialisation-heavy project where the extra copy was proving to be the bottleneck. I was under the impression that protobuf was used in deno to explicitly create a copy - but if that is not required, I definitely think Cap'n Proto is worth considering. Other alternatives I looked at at the time (like FlatBuffers) are also worth a look.
If a copy is needed for security, adding in a memcpy() will still be dramatically faster than serializing/parsing a protobuf. A copy might be needed to avoid TOCTOU bugs if JavaScript could be manipulating the buffer concurrently in another thread. If JS is blocked while the message is processed, or if it can be processed carefully to avoid TOCTOU, then you can avoid the copy.
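The defensive-copy idea above can be sketched as follows; this is a generic illustration of taking a snapshot of a shared buffer before acting on it, not Deno's actual code:

```typescript
// Sketch: take a defensive copy of a shared buffer before validating
// and acting on it, so a concurrent writer cannot change the bytes
// between the check and the use (time-of-check/time-of-use, TOCTOU).
function snapshot(shared: ArrayBuffer): Uint8Array {
  const copy = new Uint8Array(shared.byteLength);
  copy.set(new Uint8Array(shared)); // one memcpy-like pass, no parsing
  return copy;
}

const shared = new ArrayBuffer(4);
new Uint8Array(shared).set([1, 2, 3, 4]);
const safe = snapshot(shared);
new Uint8Array(shared)[0] = 99; // later mutation of the shared buffer...
console.log(safe[0]);           // ...does not affect the snapshot: prints 1
```

The point being made is that this single linear copy is far cheaper than a Protobuf serialize/parse round trip, which walks an object tree and re-encodes every field.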
Would love to see benchmarks on that @kentonv :) |
@kentonv In the initial days of deno, we were using a different struct for each message, but it was adding ~20kb per message to the final typescript declaration file (see d765300). What happens here if we try to use a union?
@benjamingr You don't need a benchmark to prove that memcpy() is faster than any copying parser/serializer. @qti3e I'm not sure exactly why Protobuf
Yeah, but you would need one to show the impact of it on a similar app.
Cap'n Proto is awesome. Great to see @kentonv involved here, recommending the switch himself!
Cap'n Proto was exciting when it came out but has not seen any updates for years, the dev has shifted his focus on his products instead. I would recommend Cereal instead: https://uscilab.github.io/cereal/index.html
@patlecat Hi. I'm the dev of Cap'n Proto. Cap'n Proto is actively developed. Cereal is C++-only, which makes it not very useful here.
@kentonv I'm preparing a simplified project that will help you reproduce the issue I'm getting.
@kentonv I've been preparing a test project that uses Rust.
I've been trying Capnp on my fork, these are some numbers (building in release mode) doing 5000 Using Capnp:
Using PB:
@matiasinsaurralde It looks like you're using "packed" encoding. You shouldn't -- "packing" is basically a lightweight compression scheme tailored for Cap'n Proto, intended to make messages smaller on the wire at the cost of some CPU time, much like how Protobuf uses variable-width integer encoding. For intra-process communication, this wastes time (especially on the TS side, since JavaScript is pretty bad at byte twiddling). Try switching to regular encoding (remove "packed").

Note that for the very best case, you'd want to go a little further. A Cap'n Proto message is normally composed of multiple "segments", each of which has a backing ArrayBuffer.

Let me know how it goes. I'm travelling for the next week but will try to peek at my e-mail from time to time.
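The "regular encoding" being recommended is the fixed-width, C-struct-like layout described earlier, where a getter is just a fixed-offset read. A toy sketch of that idea (the offsets and field sizes here are invented for illustration; real Cap'n Proto layout is determined by the schema compiler):

```typescript
// Hypothetical fixed layout: two little-endian u16 fields at fixed offsets.
const OFFSET_CHANNEL = 0;
const OFFSET_COMMAND = 2;

function makeMessage(channel: number, command: number): ArrayBuffer {
  const buf = new ArrayBuffer(8);
  const view = new DataView(buf);
  // "Setters" write directly into the buffer -- there is no separate
  // serialize step and no variable-width encoding to decode later.
  view.setUint16(OFFSET_CHANNEL, channel, true);
  view.setUint16(OFFSET_COMMAND, command, true);
  return buf;
}

function getCommand(buf: ArrayBuffer): number {
  // A "getter" is a single fixed-offset read -- no parse step.
  return new DataView(buf).getUint16(OFFSET_COMMAND, true);
}

console.log(getCommand(makeMessage(1, 3))); // 3
```

Packing (like varints) trades CPU for wire size, which is the wrong trade when both sides share an address space.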
@kentonv Thanks, I'm currently trying out the non-packed encoding (it works faster), and also reading more about how the segmentation works; it seems that the Rust library offers some options for this.
I evaluated a couple options which have similar goals: cap-n-proto, FIDL, and flatbuffers. Our requirements were that it supports C++, TypeScript, Rust, and builds with GN.
The initial work of replacing Protobufs with FlatBuffers is complete. We consider this choice tentative. @piscisaureus and I have an uneasy feeling about some of FlatBuffers' design decisions. Particularly the vtable (I would prefer a fixed-size table, as cap-n-proto does) and how FlatBuffers starts encoding at the end of the buffer (@piscisaureus described this to me, hopefully I'm relaying it correctly).
@ry FlatBuffers writes back-to-front so that reads can be front-to-back (children are written before their parents so that parents can be read before children). |
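A toy sketch of that back-to-front idea, with an invented layout purely for illustration (this is not the actual FlatBuffers wire format):

```typescript
// FlatBuffers-style construction: write toward the front from the end
// of the buffer, children first, so each parent already knows its
// children's offsets when it is written. Readers then start at the
// front and follow offsets forward.
const buf = new Uint8Array(32);
let cursor = buf.length; // write cursor moves backward

function writeBytes(bytes: number[]): number {
  cursor -= bytes.length;
  buf.set(bytes, cursor);
  return cursor; // offset where this value now lives
}

// Child written first, near the end of the buffer...
const childOffset = writeBytes([104, 105]); // the string "hi"
// ...then the parent, which records the child's (already known) offset.
const parentOffset = writeBytes([childOffset]);

console.log(buf[parentOffset] === childOffset); // true: parent points at child
```

This is why the write order is reversed: offsets to children must be known before the parent is emitted, while read order stays parent-first.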
Hello. I'm the guy who wrote Protobuf v2 and open sourced it at Google. (I also, just in the last year, built Cloudflare Workers, a non-Node JavaScript server runtime embedding V8, with a focus on secure sandboxing and high multi-tenancy.)
I was surprised by the choice of Protobuf for intra-process communications within Deno. Protobuf's backwards compatibility guarantees and compact wire representation offer no benefit here, while the serialize/parse round-trip on every I/O seems like it would be pretty expensive.
I suppose the main motivation for using Protobuf here is to get convenient code for transmitting data that is compatible across languages?
Have you considered Cap'n Proto for this? I created Cap'n Proto shortly after leaving Google. It works a lot like Protobuf, except that the generated getters and setters operate directly on a backing byte array (e.g. ArrayBuffer in JavaScript), using a C-struct-like layout. Since you can directly share that byte array between code in different languages, you can construct a message in one language and then consume it in another without even making a copy in between, much less a serialize/parse round trip. Communication between a sandbox and a supervisor was in fact a motivating use case for Cap'n Proto.
Cap'n Proto is well-supported in C++, Rust, and Go. There is also a mostly-complete TypeScript implementation. Admittedly the TypeScript implementation has not received a whole lot of real-world usage, but if there's interest, I may be willing to directly adopt and support it for you. (I expect to need it in my own work soon anyway.)