build: use clang to build release binary. #4138

PiotrSikora · 2018-08-14T00:51:10Z

Loading 1000 static listeners in a binary built with gcc takes ~51s,
but only ~1.5s (34x faster) in a binary built with clang.

Risk Level: Low
Testing: Manual
Docs Changes: n/a
Release Notes: n/a

Signed-off-by: Piotr Sikora <piotrsikora@google.com>

PiotrSikora · 2018-08-14T00:51:27Z

cc @lizan @htuch @mattklein123

htuch · 2018-08-14T01:00:37Z

I'm not opposed to this, just wondering if there's any due diligence that's feasible to ensure we're not regressing elsewhere. Is it commonly accepted that modern Clang now produces more performant code than gcc? I know there were gcc holdouts for a long time because of the reverse, but times change.

lizan · 2018-08-14T01:06:20Z

can you add a line into Release Notes?

mattklein123 · 2018-08-14T01:27:15Z

+1 to htuch. Can we do some more diligence here? I'm shocked it's this much faster. Do we know why? Can we profile the microbenchmark using cachegrind to see? In general I would like to switch to clang and also clang 6, but IMO we should understand a bit better the implications.

mattklein123 · 2018-08-14T01:33:11Z

Also, if we are going to tweak release build settings I would love to revisit the current state of flto (whole program optimization).

jmarantz · 2018-08-14T01:43:23Z

Quick sanity check: this is for a "-c opt" build?

PiotrSikora · 2018-08-14T02:28:17Z

@htuch / @mattklein123 the source of GCC slowdown is the construction of the LcTrie, and #4117 also fixes the issue, bringing loading time to 0.6s for both compilers, so I suspect that there is a difference in how std::vector::resize() is handled. @lizan has some other guesses, though.

@jmarantz yes.

htuch · 2018-08-14T02:29:50Z

Wouldn't std::vector be a libc++ vs. libstdc++ issue? But I think we use libstdc++ on Linux for both compilers, so that's probably not right.

mattklein123 · 2018-08-14T02:58:36Z

> I think we use libstdc++ on Linux for both compilers, so that's probably not right.

Yeah pretty positive this is true.

mattklein123 · 2018-08-14T02:59:24Z

Anyway, IMO we should close this and have a larger conversation about:

1. Switch to clang
2. Switch to clang 6 and clang-format 6
3. Enable flto/whole program optimization

PiotrSikora · 2018-08-14T03:09:50Z

re 1. What's the point of closing this PR if we want to discuss this?
re 2. See #3937, I have clang-6.0 and clang-format-6.0 PRs in my local tree, but only for Ubuntu.

mattklein123 · 2018-08-14T03:15:49Z

> re 1. What's the point of closing this PR if we want to discuss this?

Because IMO we should open an issue and discuss pros/cons there and an action plan. But it doesn't matter that much if we want to do it in the context of this issue. I do think that there should be some other perf tests run before we make the switch (not sure what they are, potentially we can smoke test a build at Lyft and @brian-pane @derekargueta @rgs1 can take a look at a clang build also)?

Also, per @lizan, this definitely should have a release note. I agree that moving to 6.0 and considering flto can happen in a different issue/PR.

Signed-off-by: Piotr Sikora <piotrsikora@google.com>

lizan · 2018-08-14T10:56:11Z

I did a bit more investigation on this. It looks like a bug in gcc: it didn't optimize LcNode to uint32_t. It seems to be fixed somewhere between gcc 6.3 and 8.1.

Wrote a benchmark and here is the result:

Run on (32 X 2200 MHz CPU s)
CPU Caches:
  L1 Data 32K (x16)
  L1 Instruction 32K (x16)
  L2 Unified 256K (x16)
  L3 Unified 56320K (x1)
-------------------------------------------------------------------------
Benchmark                                  Time           CPU Iterations
-------------------------------------------------------------------------
gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
StructVectorInitialization          24702330 ns   24687014 ns         28
Uint32VectorInitialization           1526238 ns    1526120 ns        456

clang version 5.0.1-2~bpo9+1 (tags/RELEASE_501/final)
StructVectorInitialization            376977 ns     376737 ns       1866
Uint32VectorInitialization            369947 ns     369918 ns       1917

gcc-8 (Ubuntu 8.1.0-5ubuntu1~16.04) 8.1.0
StructVectorInitialization            965249 ns     965177 ns        724
Uint32VectorInitialization           1523338 ns    1522336 ns        462

#4117 makes the difference unnoticeable by removing the fixed minimum 2000000 allocation. Perhaps a complicated listener with many filter chains will still see the difference?
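
For readers who want to try a comparison like this locally, here is a minimal sketch of a timing helper. It is not the linked benchmark: the `Node` struct is a hypothetical stand-in for LcNode (the field widths are an assumption, taken from the struct discussed in this thread), `timeResizeNs` is a made-up name, and plain `std::chrono` is used in place of a benchmark framework, so the numbers will be noisier than those in the table.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for LcNode: three bit-fields that add up to 32 bits.
struct Node {
  std::uint32_t branch_ : 5;
  std::uint32_t skip_ : 7;
  std::uint32_t address_ : 20;
};

// Returns the wall-clock time, in nanoseconds, spent value-initializing a
// std::vector<T> of n elements via resize().
template <typename T>
long long timeResizeNs(std::size_t n) {
  assert(n > 0);
  const auto start = std::chrono::steady_clock::now();
  std::vector<T> v;
  v.resize(n);
  // Read the last byte through a volatile to make it harder for the
  // compiler to discard the initialization as dead code.
  const unsigned char* bytes = reinterpret_cast<const unsigned char*>(v.data());
  volatile unsigned char sink = bytes[n * sizeof(T) - 1];
  (void)sink;
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling `timeResizeNs<Node>(2000000)` and `timeResizeNs<std::uint32_t>(2000000)` from an optimized build under each compiler should give a rough struct-vs-uint32_t comparison; a single run is only indicative, so repeat it several times before drawing conclusions.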

PiotrSikora · 2018-08-14T11:11:28Z

@lizan nice find, thanks! Since you already have the benchmark to reproduce this, what happens if you change the definition of LcNode to:

struct LcNode {
    unsigned int branch_ : 5;
    unsigned int skip_ : 7;
    unsigned int address_ : 20;
};

Does it fix the issue with older versions of GCC?

PiotrSikora · 2018-08-14T11:22:39Z

Never mind, I missed the linked benchmark. Using unsigned int doesn't fix the issue.

mattklein123 · 2018-08-14T16:58:11Z

@PiotrSikora we discussed in the community meeting and we would like to close this issue for now. Can you a) prep a PR to upgrade us to clang 6 (not controversial) and b) open an issue where we can discuss changing the default build to clang and a few other things like stack guards and flto? Thank you!

PiotrSikora · 2018-08-15T12:14:40Z

Done (#4157, with rest to follow), and done (#4158, #4159).

review: add release notes.

4ea7b83

Signed-off-by: Piotr Sikora <piotrsikora@google.com>

PiotrSikora mentioned this pull request Aug 14, 2018

Listener addition takes 0.3s istio/istio#7759

Closed

PiotrSikora mentioned this pull request Aug 15, 2018

Use clang to build release binary. #4158

Closed

PiotrSikora closed this Aug 15, 2018
