
Dedicated benchmark server for running Jenkins Job #55

Closed
AndreasMadsen opened this issue Jan 4, 2017 · 10 comments

Comments

@AndreasMadsen
Member

Running benchmarks from the benchmark/ directory without external interference is quite difficult and time consuming. There have already been a few issues where we got false positives because of external interference, e.g. nodejs/node#10205.

The ideal solution is to just have a Jenkins job for running benchmarks. The Jenkins script has already been developed – nodejs/benchmarking#58.

The Benchmarking WG already has a dedicated server, however it is used for the monitoring benchmarks. In theory it could also be used to run benchmarks from benchmark/, however it is unknown how much time those take, and the monitoring benchmarks need to be executed daily, which cannot be guaranteed if benchmarks from benchmark/ are executed on the same server.

The simple solution is thus to just have a dedicated server for benchmark/.

It is unclear who will be responsible for this server, as the Benchmarking WG mostly focuses on monitoring performance on a daily basis – https://benchmarking.nodejs.org.

/cc @nodejs/benchmarking, @mscdex, @gareth-ellis, @mhdawson

Original issue: nodejs/benchmarking#58

@gibfahn
Member

gibfahn commented Jan 4, 2017

@CurryKitten FYI

@addaleax
Member

addaleax commented Jan 4, 2017

Also, probably @nodejs/build too

@jbergstroem
Member

Afaik, that machine is very idle. As long as we can guarantee scheduling I'm all for using it more. @mhdawson probably knows most about it.

@AndreasMadsen
Member Author

As long as we can guarantee scheduling I'm all for using it more.

Unfortunately we can't.

@jbergstroem
Member

@AndreasMadsen if everything is run from Jenkins, it shouldn't be a problem. We could additionally have the job check for a lock file on the server (if the other jobs are controlled by cron)?
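Something like this rough sketch, where the lock path is just a placeholder and the cron-driven monitoring jobs would have to take the same lock:

```bash
# Rough sketch: a guard both the Jenkins benchmark job and the cron-driven
# monitoring jobs could run before starting. The lock path is a placeholder.
LOCK=/var/lock/nodejs-benchmark.lock
exec 9>"$LOCK"
if ! flock --nonblock 9; then
  echo "another benchmark run holds $LOCK, aborting" >&2
  exit 1
fi
# ... run the benchmarks while fd 9 holds the lock; it is released on exit ...
```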

@AndreasMadsen
Member Author

@jbergstroem Please read the discussion in nodejs/benchmarking#58. I know nothing about Jenkins, but as I understand it we can't guarantee that a benchmark job runs for less than 24h. We can do some approximate checks and time estimates before we start the job, but that won't guarantee anything and it will only work for default parameters.

@mhdawson
Member

mhdawson commented Jan 4, 2017

The concern I've expressed is that, in its current form, a single benchmark job could run for a very long time.

I may be interpreting the numbers wrong, but if I multiply 60 runs by 985 s I get roughly 16 hours for just the http group, and that is only a small subset of the overall suite. This is from:
https://gist.github.com/AndreasMadsen/9f9739514116488c27c06ebaff813af1
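(Spelled out: 60 runs × 985.64 s ≈ 59,140 s ≈ 16.4 hours.)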

@AndreasMadsen am I using your summary numbers correctly? I see 985.64 s for http and you mentioned needing 60 runs, but 16 hours seems even longer than I would have expected for a reasonable result.

If I extended that calculation to include the rest of the categories, the job would run for days, if not weeks, which does not seem reasonable.

@AndreasMadsen
Member Author

@mhdawson The calculation is correct. But it is very rare (I actually can't imagine why we would) that we run the entire benchmark suite or an entire category. Doing that is arguably bad science, since it means that you aren't testing a specific hypothesis, you are just testing everything. Statistically testing everything (or even just a category) is also difficult because of type 1 errors (the more benchmarks you compare at once, the more false positives you get by chance); to avoid that you need more statistical confidence, which may require even more repetitions.

When I optimize something, it typically involves:

  1. I have a performance issue.
  2. I create a fairly complex benchmark that highlights the issue (like a hello world http server).
  3. I profile it, using a high-level profiler.
  4. I find the bad code path.
  5. I create (or have in this case) a simple benchmark that highlights just the hot code path.
  6. I profile it, using a detailed profiler.
  7. I improve the code (I hope).
  8. I run the simple benchmark and statistically test that I improved the performance.
  9. Repeat steps 6-8 until some improvement is achieved.
  10. I run the complex benchmark and statistically test that I improved the performance.

During these steps I only use two benchmarks: the complex benchmark and the simple one.

This would "only" take 6-7h, which arguably may still be a long time, but that is also why we need the Jenkins job.
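For reference, steps 8 and 10 with the tooling already in core look roughly like this (the binaries and file names are placeholders, and the exact compare.js flags may have changed):

```bash
# Compare an old and a new build on the simple http benchmarks only,
# then feed the results to the statistical test.
node benchmark/compare.js --old ./node-old --new ./node-new \
  --runs 30 --filter simple http > compare-http-simple.csv
Rscript benchmark/compare.R < compare-http-simple.csv
```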

@mhdawson
Member

mhdawson commented Jan 4, 2017

OK, so running just a subset makes sense to me. We might be able to do the following:

  1. Look to allow at most 12 hours a day for these kinds of runs.
  2. Control the launch of the jobs through a node app, bot, or similar.
  3. Have the app/bot queue up the runs, only running them during the 12 hours allotted.
  4. Have the app/bot kill a job if it is still running at the end of the 12 hours and then run a job to clean up the machine (kill all node processes, etc.) -> this would be key.

This would require somebody to create the app/bot and integrate it with Jenkins, and it might require a bit more effort on the part of those wanting to do perf runs, but it might be a balance we can make workable with the existing h/w.
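A very rough sketch of that loop, whether it ends up as a node app or just a cron-driven wrapper (the paths and queue format here are made up for illustration):

```bash
#!/usr/bin/env bash
# Started at the beginning of the 12 hour window; drains a queue of requested
# runs and cleans up when the window closes. Placeholder sketch only.
set -u

QUEUE=/var/lib/benchmark-bot/queue          # one shell command per line
DEADLINE=$(( $(date +%s) + 12 * 3600 ))     # window closes 12 hours from now

while IFS= read -r job; do
  remaining=$(( DEADLINE - $(date +%s) ))
  if [ "$remaining" -le 0 ]; then
    echo "12 hour window is over; remaining jobs stay queued" >&2
    break
  fi
  # timeout(1) kills the run if it is still going when the window closes
  timeout "$remaining" bash -c "$job" || echo "job failed or was killed: $job" >&2
done < "$QUEUE"

# clean up so the daily monitoring benchmarks start from a quiet machine
pkill -x node || true
```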

It would be easier to have another dedicated machine, but those are more costly than the VMs we use for the rest of the jobs, and we are pretty much at our SoftLayer spend limit, which is where we got the first machine from.

@Trott
Member

Trott commented Sep 2, 2017

Closing this as a bit of cleanup for the CTC repo but feel free to re-open in the TSC repo or another appropriate repo.

Trott closed this as completed Sep 2, 2017