
Framework: Improve Travis build performance #14289

Closed · wants to merge 9 commits

Conversation

@aduth (Member) commented Mar 6, 2019

This pull request seeks to explore a few ideas for improved Travis build performance:

  • Remove postinstall script from package.json
    • Most of our Travis jobs ultimately call npm run build twice, wastefully, because of the combination of the postinstall triggered by an initial npm install, and a subsequent explicit build
    • npm install is run in many separate jobs, but npm run check-licenses only really needs to be run at most one time, not in each job
  • Remove the npm / JavaScript build from the "PHP Unit Tests" job
    • Avoids lengthy install / build process
    • As best I can tell, there should be no need to rely on JavaScript produced by npm run build. Previously it may have been needed to use the npm scripts for npm run test-php and npm run test-unit-php-multisite, but these resolve to docker-compose commands anyways, so it seems reasonable enough to call them directly
  • Run the "JS unit tests" job script tasks concurrently (a sketch of the invocation follows this list)
    • With immediate bail-out if any of the tasks fail
    • None of these tasks has any dependency on any of the others
    • Arguably, this could make the build results even more difficult to interpret, since concurrent output will be mixed from each of the underlying tasks (related Slack conversation).
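
A rough sketch of that concurrent invocation, assuming the concurrently package and its --kill-others-on-fail flag for the immediate bail-out (not necessarily the exact change in this PR):

    # Run the independent tasks in parallel and abort all of them on the first failure.
    npx concurrently --kill-others-on-fail \
      "npm run lint" \
      "npm run check-local-changes" \
      "npm run check-licenses" \
      "npm run test-unit -- --ci --maxWorkers=2 --cacheDirectory=\"$HOME/.jest-cache\""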

Testing instructions:

Verify a passing build.

In review, ensure that nothing which was previously tested has lost coverage.

@aduth added the Framework (Issues related to broader framework topics, especially as it relates to JavaScript) and [Type] Build Tooling (Issues or PRs related to build tooling) labels on Mar 6, 2019
@aduth requested review from ajitbohra and ntwb as code owners on March 6, 2019 16:17
"npm run lint" \
"npm run check-local-changes" \
"npm run check-licenses" \
"npm run test-unit -- --ci --maxWorkers=2 --cacheDirectory=\"$HOME/.jest-cache\""
Contributor

I'd be interested in seeing what putting npm run lint, npm run check-local-changes, and npm run check-licenses into their own job adds to the build times. I don't like the idea of running these concurrently here, because it means someone relying on Travis has no insight into whether the subsequent tasks would pass or not.

Member Author

I'd be interested in seeing what putting npm run lint, npm run check-local-changes, and npm run check-licenses into their own job adds to the build times.

Yeah, this is what prompted my most recent thread comments in Slack. I agree that ideally they'd be defined as separate jobs, but it's not clear how to do that without increasing build times (kinda contrary to the point of the pull request).

On the general topic of "making sense of the build output", there might be some other options to explore for organizing the results in a way which is easier to read. Right now, the concurrent output is a bit of a mess, since it just spews everything to stdout in the order it's received (i.e. parallel scripts each with their own outputs are intermixed). I think it comes down to: can we leverage an existing tool, find a new one, or build our own to organize the output in a more readable fashion?

I don't like the idea of running these concurrently here, because it means someone relying on Travis has no insight into whether the subsequent tasks would pass or not.

I guess it depends on whether it's more in the developer's interest to have faster build times, or a more thorough report of every issue in the build when there are multiple issues. I don't really have a strong leaning one way or the other, but I'd be inclined to think the former would benefit more people in the vast majority of cases.

Member

Have you tested with --maxWorkers=4? Travis VMs have two CPUs available, so it's usually a good idea to have more workers than that, so the CPUs are fully utilised: the workers will likely have I/O wait time to deal with.
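
For illustration only (not taken from this PR), the Jest invocation with a higher worker count would be:

    npm run test-unit -- --ci --maxWorkers=4 --cacheDirectory="$HOME/.jest-cache"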

@pento (Member) commented Mar 6, 2019

I added check-licenses to post-install, so folks couldn't accidentally add a new package, start developing against it, then only discover that it caused licensing issues after they pushed a branch to GitHub.

If it's more trouble than it's worth, I'm fine with moving it somewhere that isn't run as often. It really only needs to run once, in a single Travis job.

@aduth (Member, Author) commented Mar 6, 2019

I added check-licenses to post-install, so folks couldn't accidentally add a new package, start developing against it, then only discover that it caused licensing issues after they pushed a branch to GitHub.

Yeah, I thought it might be the case. Ideally there would be a separate script hook specifically for installing with arguments (i.e. npm install some-package vs. just npm install). I'd looked for this, but couldn't find much. I assume the lack of a distinction is partly intentional.

There are quite a few places in the setup that optimize for developer experience, which is great, but I think they ought to be isolated from what runs in Travis. Another one I've noticed is that we still run npm install in the PHP unit tests because it's part of the setup-local-env.sh script pulled in by run-wp-unit-tests.sh.
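
One possible workaround, sketched purely for illustration (the script name is hypothetical, and this is not part of the PR): keep the developer-facing postinstall hook, but have it bail out early when the CI environment variable is set (Travis sets CI=true for every job), and rely on a single dedicated job to run the check there.

    # bin/maybe-check-licenses.sh (hypothetical), wired up as the package.json postinstall script.
    # Skip the license check in CI, where a single dedicated job would run it once instead.
    if [ -z "$CI" ]; then
      npm run check-licenses
    fi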

@aduth (Member, Author) commented Mar 6, 2019

Analyzing the current results using another build as baseline:

Before: https://travis-ci.com/WordPress/gutenberg/builds/103420242
After: https://travis-ci.com/WordPress/gutenberg/builds/103415128

  • The difference in "PHP unit tests (PHP 5.2)" is significant, from 3m49s to 51s (77.7% decrease)
  • "JS unit tests", "PHP unit tests (Docker)", "E2E tests (Admin with plugins) (1/2)", "E2E tests (Author without plugins) (1/2)" are each about a minute shorter
    • This appears consistent with my own measurement of npm run build taking approximately one minute, if we've now removed the duplication of npm run build
    • It's not immediately clear to me why "E2E tests (Admin with plugins) (2/2)" and "E2E tests (Author without plugins) (2/2)" are not impacted
  • None of this really matters † when the build takes as long to complete as its lengthiest job, and the longest E2E test job still takes the same 13 minutes it did previously

† Expanding on this last point:

  • It may matter in the sense of freeing up containers, if we're constrained by some limited pool of workers available in Travis (I admit to not knowing too well how this works).
  • E2E tests are already split between two jobs. Should this be reevaluated?
    • If it takes 12 minutes per job when split across two jobs, is it a fair approximation that it'd take 8 minutes per job when split across three jobs, or 6 minutes per job when split across four (i.e. roughly 24 minutes of total E2E time divided evenly)? A sketch of one way to express such a split follows this list.
    • Is there any reason to be conservative with how we allocate new jobs? (Relating to the earlier point about available Travis workers)
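
As a hedged sketch only, one way to express such a split in a job's script would use Jest's --listTests and --runTestsByPath options; the config path and the three-way partitioning below are illustrative, not taken from this PR:

    # List all E2E spec files once, then run only this job's share (job 1 of 3 here).
    npx jest --config=test/e2e/jest.config.json --listTests > ~/.jest-e2e-tests
    npx jest --config=test/e2e/jest.config.json --runTestsByPath \
      $( awk 'NR % 3 == 1' < ~/.jest-e2e-tests )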

I'm also a bit curious if these need to be run as separate containers, or if we could instead parallelize the end-to-end tests within a single container:

  • An advantage being that we only need to run npm install and npm run build once for all end-to-end tests, rather than per forked job
  • Puppeteer advertises parallelization as a key feature (reference)
    • Puppeteer v1.5.0 also exposes browser contexts, making it possible to efficiently parallelize test execution.

  • A problem with parallelizing E2E test execution is the potential for test suites to conflict with one another (e.g. one suite asserts with the assumption that it creates the only post on a site, while in fact another test might create its own at the same time)
    • Maybe worth exploring leveraging multi-site to create isolated environments for each test suite, where each suite corresponds to its own site in a network of sites running in the same container
    • Or: Consider these types of assumptions an anti-pattern which should be avoided (though I sense this could be prone to future test fragility)

@gziolo (Member) left a comment

I'm also a bit curious if these need to be run as separate containers, or if we could instead parallelize the end-to-end tests within a single container:

I don't know the architecture of Travis containers, but I would expect that each container has at most two cores available. I assume this based on the unit tests setup, which uses two workers at the moment. I also know that Circle CI had a similar limitation in the past.

@@ -32,8 +29,9 @@ fi

# Run PHPUnit tests
if [[ $DOCKER = "true" ]]; then
npm run test-php || exit 1
npm run test-unit-php-multisite || exit 1
docker-compose $DOCKER_COMPOSE_FILE_OPTIONS run --rm composer run-script lint
Member

What's the reason for calling those commands directly rather than continuing to use npm run *?

Member Author

What's the reason for calling those commands directly rather than continuing to use npm run *?

I guess I assumed that if we weren't installing dependencies, we shouldn't run npm scripts. In retrospect, it's probably not strictly a problem. Overall, it's part of a move toward "we don't need npm (or even Node) at all here", maybe even opening up the future possibility of not having it installed in the environment at all.

Member

Sounds good. Should we keep those npm scripts moving forward? There might be value in them, but I'm afraid that at some point they will diverge from what is added to the Travis config.

@aduth (Member, Author) Mar 11, 2019

Sounds good. Should we keep those npm scripts moving forward? There might be value in them, but I'm afraid that at some point they will diverge from what is added to the Travis config.

Agreed on the concern. I know I've used the npm script variants, likely out of convenience / familiarity over what would be the docker-compose equivalent, but I'm not strongly compelled to keep them. There's some nice uniformity on the developer-experience front in having all test commands available in a single location; the Travis environment is the exception here.

@aduth (Member, Author) commented Mar 7, 2019

I don't know the architecture of Travis containers, but I would expect that each container has at most two cores available.

You're right, it is two cores:

https://docs.travis-ci.com/user/reference/overview/#virtualisation-environment-vs-operating-system

To be embarrassingly candid, I'm not sure I understand how the upper bound of cores translates to the types and extent of parallelization we can implement. For example, whether it's possible to run multiple Puppeteer tabs in parallel in a single Node process (single core?).

Regardless, there may be some benefit to parallelizing inside the container, albeit maybe not as much as we might have hoped. It may not be worth the effort, though that could also depend on how easy or difficult it is to implement. For example, I found some existing projects in the ecosystem which aim to simplify it:

https://github.com/thomasdondorf/puppeteer-cluster

In any case, would you think a more immediate solution might be to further fork the end-to-end tests across 3 or 4 tasks, vs. the current two?

@pento (Member) commented Mar 7, 2019

Apart from testing a higher number of workers, it's quite reasonable to split testing over multiple jobs. We can have up to 15 concurrent jobs at one time across the entire WordPress org; we rarely hit that limit.

It's a pity build stages don't let us do all of the environment setup first, then copy that across multiple VMs, so we can avoid re-running npm install / npm run build. That's 4-5 minutes that we can't avoid at the start of each e2e job.

Something we could explore is setting up a custom Docker image that does all of that setup work. Then it'd just be rebuilt whenever packages change.

@aduth (Member, Author) commented Mar 8, 2019

It's a pity build stages don't let us do all of the environment setup first, then copy that across multiple VMs, so we can avoid re-running npm install / npm run build. That's 4-5 minutes that we can't avoid at the start of each e2e job.

Yeah, there was some related conversation about this on Slack. It's certainly not as straightforward as I wish it were, but Travis does promote an option of publishing files to S3 in an early stage which are then pulled into later stages for reuse. It's not entirely clear whether, or how much, this would help, particularly considering the overhead involved in archiving and transmitting a folder containing node_modules. I imagine we'd get the most benefit from using it only for the result of npm run build.

@ntwb (Member) commented Mar 8, 2019

Yesterday I also checked out the stages docs once again, and yeah, it's a shame we can't use them to host our environment.

That said, from a quick look at the x.1 job, npm ci is taking ~24 seconds.

I'm wondering if we can use Travis CI's cache functionality here somehow, maybe archiving that folder and adding it to the Travis CI cache section...
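
A minimal sketch of what that might look like in .travis.yml, assuming Travis CI's built-in npm cache plus an extra cached directory (the directory choice is illustrative, not from this PR):

    cache:
      npm: true                  # caches $HOME/.npm between builds
      directories:
        - $HOME/.jest-cache      # example of an additional directory to persist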

@gziolo (Member) commented Mar 8, 2019

There is also previous work started by @noisysocks, who tries to add caching for built packages in #13124. With the changes proposed in this PR, it seems like caching packages might be less important for the initial Travis run. However, it might still be beneficial for follow-up runs where new commits are added to PRs.

@aduth (Member, Author) commented Mar 13, 2019

FYI: postinstall removal landed separately in #14353

I'm considering this more a directional / experimental pull request, though perhaps it can land after a few iterations and review.

Following up on E2E parallelization: I think it'd be enough in the short term to split it into more containers. I'm unsure if I want to explore it here, but I think we should also:

  • Leverage the two cores per container to the extent possible, parallelizing E2E within each job
  • Reuse the build distributable across all the E2E jobs. It's wasted time to be running npm run build in each of what will soon become 6 (or 8) separate E2E jobs.

I'm working on a side project which, while not directly related to this effort, could provide a plugin ZIP distributable for a given pull request at its latest HEAD SHA, to be used for this purpose. Alternatively, since I think we lack some coverage in verifying that our "package plugin" step works correctly, it could be a good opportunity to implement this as a startup job for the build, serving the dual purpose of providing packaging coverage and making the build available for each of the subsequent jobs. This does require an S3 (or equivalent) integration, however.

I want to tinker as well with splitting "JS unit tests" into individual jobs for what currently comprises npm run lint, npm run check-local-changes, npm run check-licenses, and npm run test-unit. It's a tricky balance if each will continue to need to run its own npm ci independently, but aside from the obvious developer-experience gain of cleaner output, there may actually be some performance benefit in freeing up containers for the tasks which don't require npm run build to have occurred (lint, check-local-changes, check-licenses).
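
For reference, a rough sketch of what such a split might look like in .travis.yml; the job names and structure are illustrative, not taken from this PR:

    jobs:
      include:
        - name: Lint
          script: npm run lint
        - name: Local changes check
          script: npm run check-local-changes
        - name: License check
          script: npm run check-licenses
        - name: JS unit tests
          script: npm run test-unit -- --ci --maxWorkers=2 --cacheDirectory="$HOME/.jest-cache"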

@nerrad (Contributor) commented Mar 13, 2019

could provide a plugin ZIP distributable for a given pull request at its latest HEAD SHA, to be used for this purpose

This would also be good because the e2e tests will be aligning more closely with a "traditional" plugin installation; having them all start from the same plugin ZIP would help with that.

@aduth (Member, Author) commented Mar 13, 2019

Oh, and one more thing: this and my side project have made me consider what we can do to reduce the time it takes for npm run build to complete.

A few high-level thoughts occur to me:

  • The build encompasses separate Babel and webpack processing steps (source). Ideally these could be consolidated. I also thought that, at least for the purpose of generating the plugin, we could run only the webpack step. However, this doesn't work because the webpack build itself relies on some packages having been built (source). It may be an option to point the built dependencies at the ones published to npm, though this seems like it could introduce fragility by breaking the expectation that these modules are sourced from the project directory.
  • Babel caching. I've not yet explored it deeply (though I have vague recollections of encountering it in the past), but the webpack Babel loader provides a cacheDirectory option which can be used to help speed up builds (reference). It may be enough to add this under cache.directories in the Travis configuration, pointing the Babel loader at it via some environment variable (if present, i.e. not for typical developer environments unless opted in). A sketch of the Travis side follows this list.
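
As a sketch only: the environment variable name below is hypothetical, and the webpack configuration would separately need to read it and pass it to babel-loader's cacheDirectory option:

    cache:
      directories:
        - $HOME/.babel-cache                 # persisted between builds by Travis
    env:
      global:
        # Hypothetical variable the webpack config could read and hand to babel-loader.
        - BABEL_CACHE_DIRECTORY=$HOME/.babel-cache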

I should also say that these notes are a brain dump; with other priorities, I've not yet been able to dedicate time to revisiting this in more detail.

@aduth (Member, Author) commented Mar 13, 2019

This would also be good because the e2e tests will be aligning more closely with a "traditional" plugin installation; having them all start from the same plugin ZIP would help with that.

Yes, definitely. It's a bit scary to think that we have all this testing in place, but what's distributed to users is whatever is produced by this packaging shell script, which is not necessarily the same as what runs in the testing environments (notably because of the last step: choosing which files to include in the ZIP).

- npm run lint
- npm run check-local-changes
- npm run test-unit -- --ci --maxWorkers=2 --cacheDirectory="$HOME/.jest-cache"
- npm run build
@gziolo (Member) Mar 14, 2019

In #14432 I'm proposing changes which would ensure that we always use the source code to run unit tests. If that change lands, we could do one of the following:

  1. remove npm run build altogether from this job
  2. add a flag which would allow the Jest config to remove the override which ensures that build folders aren't used

As far as I remember, if we do (2) we would have quite good coverage for our whole codebase, both the code transpiled with Babel and the original source code. In development we would always use the source code, and on Travis we would use a setup closer to what you have when using the code from npm.

@aduth (Member, Author) commented Apr 24, 2019

While I do plan to continue the effort here, I don't plan to use this pull request itself, and discussion would be better served and tracked in an issue anyway. With that in mind, I'll close this; I've opened #15159 to track the ongoing work.
