
Overarching design for new functional testing #38

Open
fcooper8472 opened this issue Jan 26, 2021 · 14 comments

Comments

@fcooper8472
Member

Some thoughts, very much open to discussion. Most relevant to @MichaelClerx @martinjrobins @ben18785.

  • Move the actual functional tests to the main PINTS repo. This will ensure all PINTS specific code is version controlled with PINTS, not somewhere else. I would imagine this being a new top level directory in the PINTS repo. This also means a single commit hash gives you a complete view into PINTS + functional testing state at once.
  • Functional testing repo would just contain code for running those tests, helping to separate code that uses PINTS from the infrastructure that runs the functional test.
  • Use GitHub pages to host a new (hugo) website for showing results, using this template: each model (mcmc_banana, nested_normal, etc) would have top-level navigation sections, with specific tests (mcmc_banana_DifferentialEvolutionMCMC, mcmc_banana_DreamMCMC, ...) being individual pages containing plots.
  • Frontpage would include an overview of any currently failing tests (with links).
  • Outputs could be customised on a per-test basis because each test would write to a dedicated table in the sqlite database: presumably functional testing would pass a database connection to the test. This eliminates the need for a hacky random JSON object, and instead each test would have a nice simple flat data table containing all relevant info.
  • Plots would use altair, would be interactive, and allow clicking through to commits, hovering over points, etc.
  • Each test would be responsible for plotting: for instance by returning a list of vega-lite specifications.
  • The tests could be run on Skip, via GitHub actions: each run would cause a fresh checkout of functional testing and PINTS, and would refer to a hardcoded database location (which will grow and is not suitable for versioning).
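To make the per-test table idea concrete, here's a rough sketch of what a test writing to its own dedicated sqlite table might look like (the function name, table name, columns, and result values are all purely illustrative, not actual PINTS code):

```python
import sqlite3

def run_mcmc_banana_demcmc(conn, seed):
    """Hypothetical test: writes its results to its own dedicated table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mcmc_banana_DifferentialEvolutionMCMC"
        " (commit_hash TEXT, seed INTEGER, kld REAL, ess REAL)"
    )
    # ... the actual PINTS run would happen here ...
    kld, ess = 0.02, 150.0  # placeholder results
    conn.execute(
        "INSERT INTO mcmc_banana_DifferentialEvolutionMCMC VALUES (?, ?, ?, ?)",
        ("abc1234", seed, kld, ess),
    )
    conn.commit()

# The functional testing runner would own the connection and pass it in
conn = sqlite3.connect(":memory:")
run_mcmc_banana_demcmc(conn, seed=1)
```

Each test then gets a simple flat table with exactly the columns it needs, rather than a shared JSON blob.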

Anyone have any major thoughts/comments on this as a basic structure?

@ben18785

Thanks @fcooper8472 -- that all sounds good to me. The one thing I'm less sure about is having a separate table for each test, since I see a lot of overlap between their outputs. That said, if having separate tables means it'll be easier to add tests that return quite different measures, then perhaps this is easiest.

@fcooper8472
Member Author

Yep, there's a lot of overlap, e.g. the git commit hash, but at the moment everything's just shoved into a single table as extra lines, so you're not storing any extra data by having different tables.

I would imagine having some kind of payload object with all the overlapping fields filled in by the base class to enforce some kind of extensible uniformity. Haven't thought through all the details yet though.
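For illustration, a minimal sketch of what that payload/base-class idea could look like (all class names, methods, and fields here are hypothetical):

```python
import datetime

class FunctionalTest:
    """Hypothetical base class: fills in the fields shared by all tests."""

    def run(self, seed):
        payload = {
            "commit_hash": self._get_commit_hash(),
            "seed": seed,
            "date": datetime.datetime.now().isoformat(),
        }
        payload.update(self._run(seed))  # add the test-specific fields
        return payload

    def _get_commit_hash(self):
        # Placeholder: a real implementation would ask git for the PINTS hash
        return "abc1234"

    def _run(self, seed):
        raise NotImplementedError

class BananaTest(FunctionalTest):
    def _run(self, seed):
        # Only the test-specific results; shared fields come from the base
        return {"kld": 0.02}
```

The base class enforces the uniform part of the schema, while each subclass stays free to return whatever measures make sense for that test.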

@MichaelClerx
Member

Thanks @fcooper8472 ! Does sound very good!

The only bit I'm unsure about is the first point:

  • Won't the tests require packages that aren't included in PINTS proper? E.g. vega, or some form of db access? Or, if they don't do any db writing, presumably they need to implement an interface from the functional testing package?
  • Some other things look pints-specific to me, e.g. knowing where to look for tests, knowing the mcmc_ naming scheme, knowing what the table looks like?

Or are you thinking these will all be "settings" in the PINTS installation of the FT project?

@MichaelClerx
Member

MichaelClerx commented Jan 26, 2021

(Incidentally, I wouldn't do away with storing the commit hash of FT. Still keep it as meta-data. But yeah having it as a dual key system was probably overkill.)

@MichaelClerx
Member

It would greatly add to the value of FT, I agree, if it were something you could readily add to any project :D

@fcooper8472
Member Author

Having thought about this some more, here is my next iteration of thoughts:

  • Let's keep the tests separate from PINTS
  • Functional testing needs to be run from the PINTS repo, or there is no sane way to run it only on pushes to master of PINTS. This would mean writing a functional testing GitHub workflow on PINTS that just checks out functional-testing, with a token if necessary to push changes
  • I want to try and let GitHub run the functional tests (rather than running them on Skip). Each run can go for up to 6 hours on GH and I think if we're not well inside that we're probably doing something wrong!
  • Let's stop over-thinking how we store results. Instead of having a database that sits somewhere that we have to interact with, let's just keep a data directory in functional testing, with the results in plain text csv files that get versioned with functional testing. Each test will have a csv file with whatever columns make sense for that test. Common information between tests (commit hash, seed etc) can go in a main csv, and cross-referencing between files can be via commit hash.
  • Each run of functional testing, run from PINTS, will checkout functional-testing, run the tests, add (currently 4) rows to the csv files, rebuild the hugo website, and push all the changes.
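A rough sketch of how that csv layout could be joined up, with cross-referencing via commit hash (file names, columns, and values are just illustrative; using pandas here for the join):

```python
import pandas as pd

# main.csv: one row per run, holding the columns common to all tests
main = pd.DataFrame({"commit": ["abc1234", "def5678"], "seed": [1, 2]})

# mcmc_banana_DreamMCMC.csv: whatever columns make sense for this test
banana = pd.DataFrame({"commit": ["abc1234", "def5678"], "kld": [0.02, 0.03]})

# Cross-reference the two files via the commit hash
results = banana.merge(main, on="commit")
```

In practice each DataFrame would be read from its versioned csv with `pd.read_csv`, but the join works the same way.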

The main problem I'm having now is navigating the existing functional testing code. I just cannot understand how it's supposed to work. I think you'll have to give me a tutorial @MichaelClerx.

@ben18785

ben18785 commented Jan 31, 2021 via email

@MichaelClerx
Member

Thanks Fergus!
I'm a bit hesitant to go back to CSVs, after we changed from CSV (one per test run) to a DB, but perhaps one file per test is more workable. Ideally we'd still be able to combine results from multiple nodes, though; maybe we could just include a hostname in each file or something, and then load multiple files in (if available) during analysis?

Re: tour. Happy to! But the current code needs some heavy reworking anyway, I find any time I touch it :D

@MichaelClerx
Member

I'm free after 11 today or else we can do it tomorrow morning?

@iamleeg
Contributor

iamleeg commented Feb 1, 2021

Let's stop over-thinking how we store results.

In defence of my original level of thinking: the point of the database was to be able to correlate test results longitudinally, so that statistical measures could be obtained for tests that rely on some random input. You could achieve that with a collection of CSVs keyed by git hash, though it'd be more work to use the git tree and CSV files to reconstruct the history.
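For example, a hypothetical reconstruction of longitudinal statistics from per-run results keyed by git hash, using pandas (the data and column names are made up):

```python
import pandas as pd

# Two hypothetical runs at the same commit with different random seeds
runs = [
    pd.DataFrame({"commit": ["abc1234"], "seed": [1], "kld": [0.02]}),
    pd.DataFrame({"commit": ["abc1234"], "seed": [2], "kld": [0.04]}),
]
history = pd.concat(runs, ignore_index=True)

# Longitudinal statistics per commit, pooled across random seeds
stats = history.groupby("commit")["kld"].agg(["mean", "std"])
```

With a database this is a single GROUP BY query; with CSVs you first have to gather and concatenate the right files, which is the extra work referred to above.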

@fcooper8472
Member Author

The main problem I'm suggesting we try to solve is that we currently have a database that needs to exist somewhere.

If we want to run a test on GitHub Actions, the database has to be somewhere that we can read from and write to. And every physical machine that might want to run tests will need access to wherever the database is kept.

We could stick the database in GitHub, but it will change every run and need to be stored entirely every time. So it seems like some plaintext format is the way to go: we would only be versioning the next set of results.

I'm not very familiar at all with databases: what kind of operations are you thinking of that are better suited to a database than, say, csv + pandas?

@iamleeg
Contributor

iamleeg commented Feb 1, 2021

Like I say, you'll be able to do it either way, but if you want to correlate test results across runs you'll need a location for writable storage that's available between runs, whatever format you're storing.

@martinjrobins
Member

It doesn't even have to be a fancy database; it could be a Google/MS spreadsheet(s)?

@fcooper8472
Member Author

The problem isn't whether it's SQLite vs Excel vs Google Sheets; it's whether we can easily read from and write to whatever that source is.

I'm suggesting versioning the data with the functional testing repo is the obvious (and simplest) solution. But I'm very much open to suggestions if there's a simple (& free) way of doing it another way.
