Improve testing infrastructure #366

matt-gretton-dann · 2020-06-05T17:06:04Z

This PR improves the testing infrastructure as follows:

Uses CTest more effectively so that we can run tests in parallel.
Updates the GitHub Actions CI loop to run tests in parallel.
Adds a test-accept build target, which enables test changes to be accepted using the build system.
Improves documentation.

Importantly, these tests can now be run from within Visual Studio (2019 v16.6 tested), and test changes can be accepted using the Visual Studio GUI (although I recommend using the command line, as it runs the tests faster). Documentation here: https://github.com/mrc-ide/covid-sim/blob/1221aac5772aed8d7ba4b2a8410d809841fb1361/docs/build.md

@dlaydon, @NeilFerguson: This should hopefully improve your experience, but I would appreciate any comments or questions you have.

ozmorph

Looks great! Are there any theories as to why the Windows regression tests take 2.1-2.4 times longer than any of the Linux or MacOS tests?

For instance, this pull request's regression tests on Windows took about 1 hour 24 minutes. Ubuntu on the other hand took 36 minutes.

If we're sure it isn't Azure/VM related, then I may try diagnosing the problem further.

matt-gretton-dann · 2020-06-17T16:25:02Z

> Looks great! Are there any theories as to why the Windows regression tests take 2.1-2.4 times longer than any of the Linux or MacOS tests?
>
> For instance, this pull request's regression tests on Windows took about 1 hour 24 minutes. Ubuntu on the other hand took 36 minutes.
>
> If we're sure it isn't Azure/VM related, then I may try diagnosing the problem further.

I'm not sure whether it is VM related or not. I experience similar slow-downs when running on the Windows VM on my dev machine, and I don't have access to a similarly specced bare-metal Windows machine to compare it against.

The first thing I would check is whether removing all *printf() calls speeds us up (these turn into file writes, and that's always my first suspect). I also wonder if there are any OpenMP performance issues on Windows (I haven't researched this).

ozmorph · 2020-06-17T16:47:59Z

@matt-gretton-dann Thanks for responding!

I thought the same thing initially. I haven't ruled out *printf() calls being the culprit, but I'm doing most of my development on Windows as well and frequently test both on an ext4 partition in WSL2 and on an NTFS filesystem, using either Clang 10 or MSVC. So far I haven't seen a noticeable difference in run-time between the two, which leads me to believe that it's not a file buffering or file-system issue. But I have no evidence to point to the contrary yet.

As you may or may not have seen in #388, I've discovered several locations in the code that benefit greatly (43% run-time reduction for the US test) from some minor alterations to memory storage and access. These have greatly improved the CPU caching behavior for AssignPeopleToPlaces(), at least on my workstation. I've also discovered another great opportunity within InfectSweep() in the past 24 hours that I hope will bring down the run-time even more.

Matthew Gretton-Dann added 8 commits June 5, 2020 11:28

Raise level of parallelism to test invocations

f09f60e

This improves the parallelism of the testing infrastructure: if the tests are invoked with -j2, up to two tests will run at once. We also note which tests are multi-processor and ensure they request the correct resources.
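To illustrate why declaring per-test processor counts matters once tests run in parallel, here is a minimal Python sketch of a slot-based scheduler. It is not CTest's actual scheduler: the function name `simulate`, the test names, the durations, and the rule of capping an oversized test at the slot budget are all assumptions made for the example.

```python
import heapq


def simulate(tests, slots):
    """Simulate running tests under a slot budget (the -j value).

    tests: list of (name, processors, duration) tuples (hypothetical data).
    Returns (start_times, makespan).
    """
    pending = list(tests)
    running = []  # min-heap of (finish_time, slots_held)
    free = slots
    now = 0.0
    starts = {}
    while pending or running:
        still_waiting = []
        for name, procs, duration in pending:
            # Assumption: a test asking for more than the budget runs alone.
            need = min(procs, slots)
            if need <= free:
                free -= need
                starts[name] = now
                heapq.heappush(running, (now + duration, need))
            else:
                still_waiting.append((name, procs, duration))
        pending = still_waiting
        # Advance time to the next test that finishes and release its slots.
        finish, need = heapq.heappop(running)
        now = finish
        free += need
    return starts, now


tests = [("single-a", 1, 10), ("single-b", 1, 10), ("multi", 2, 5)]
print(simulate(tests, slots=2)[1])  # 15.0
print(simulate(tests, slots=1)[1])  # 25.0
```

With two slots, the two single-processor tests overlap and the two-processor test waits until both slots are free, so it never competes with them for cores.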

Check multiple locations for CovidSim executable

09a4c5a
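A rough sketch of what checking several locations for the executable could look like. The helper name `find_executable` and the candidate directories are hypothetical and not taken from the commit; the `.exe` variant is included on the assumption that Windows builds need it.

```python
import os


def find_executable(candidate_dirs, name="CovidSim"):
    """Return the first existing path to the executable, or None.

    candidate_dirs is a hypothetical search list, e.g. a build directory
    plus per-configuration subdirectories.
    """
    for directory in candidate_dirs:
        for filename in (name, name + ".exe"):
            path = os.path.join(directory, filename)
            if os.path.isfile(path):
                return path
    return None


# Hypothetical usage: look in the build root, then in per-config folders.
exe = find_executable(["build", "build/Debug", "build/Release"])
print(exe if exe else "CovidSim executable not found")
```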

Treat checksum file as text

55968ec
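The commit title does not say why the checksum file is treated as text. One plausible motivation, assumed here, is that line endings differ between platforms, so a byte-for-byte comparison could fail even when the checksums agree. A minimal sketch of a text-mode comparison follows; the function names are hypothetical.

```python
def read_checksums(path):
    """Read a checksum file as text, ignoring line-ending differences."""
    # newline=None enables universal newlines, so \r\n and \n read the same.
    with open(path, "r", newline=None) as handle:
        return [line.strip() for line in handle if line.strip()]


def checksums_match(expected_path, actual_path):
    """Compare two checksum files line by line as text."""
    return read_checksums(expected_path) == read_checksums(actual_path)
```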

Make CI loop use multi-cores for testing

fc85ed5

Add test-accept target

6cbd793

This enables us to run `make test-accept` to accept test changes, and means that all testing can be done by invocations of cmake or the underlying build system.
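Conceptually, accepting test changes means promoting freshly produced results to be the new reference. The sketch below shows that idea only. The directory layout, the `*.checksums.txt` pattern, and the function name `accept_changes` are assumptions, and the real target may work differently.

```python
import shutil
from pathlib import Path


def accept_changes(output_dir, reference_dir, pattern="*.checksums.txt"):
    """Copy changed result files over the reference copies.

    Returns the names of the files that were accepted. The paths and the
    pattern are hypothetical.
    """
    accepted = []
    for produced in sorted(Path(output_dir).glob(pattern)):
        target = Path(reference_dir) / produced.name
        if not target.exists() or target.read_text() != produced.read_text():
            shutil.copyfile(produced, target)
            accepted.append(produced.name)
    return accepted
```

Returning the list of accepted files makes it easy to review exactly which references changed before committing them.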

Remove legacy regression testing scripts

85e68c1

These are now part of `make test` and `make test-accept`.

Improve documentation of testing

828d412

Update build instructions

1221aac

matt-gretton-dann requested review from igfoo, dlaydon and NeilFerguson June 5, 2020 17:06

ozmorph approved these changes Jun 10, 2020

ozmorph mentioned this pull request Jun 15, 2020

US Regression Test Optimization #388

Merged

Merge remote-tracking branch 'origin/master' into matt-gretton-dann/further-parallelise-testing

6c2fe3d

weshinsley self-requested a review June 18, 2020 10:15

weshinsley approved these changes Jun 18, 2020

matt-gretton-dann merged commit effd349 into master Jun 18, 2020
