At the moment, some of our unit & integration test suites are not fit to run on the default CI VMs, which have relatively little memory (7 GB). These are usually tests that run our example notebooks end-to-end with large datasets (ranging from 100k to 20 million rows). As a result, we can only run them on self-hosted VMs with more memory. Does the team think there is a need to redesign these tests/notebooks so they are runnable on GitHub-hosted VMs? @miguelgfierro, @gramhagen?
I think there are some solutions we can go after, besides optimizing the content in the notebook:
Run incomplete training on a smaller dataset and relax the assertion checks.
assert results["rmse"] == pytest.approx(0.8621, rel=TOL, abs=ABS_TOL)
for example, instead of checking whether RMSE is within some tolerance of a target value, we can check that our notebook runs successfully with a smaller dataset and that RMSE "improves" or is not None.
Avoid running the full notebook multiple times for different data sizes.
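As a sketch of the first option (the names `check_smoke_metrics`, `results`, and `baseline_rmse` are illustrative, not the repo's actual fixtures), the relaxed check could look like:

```python
def check_smoke_metrics(results, baseline_rmse):
    """Relaxed smoke-test check: instead of pinning RMSE to 0.8621 within a
    tight tolerance, only require that the notebook produced a metric at all
    and that training beat a do-nothing baseline."""
    rmse = results["rmse"]
    assert rmse is not None, "notebook did not report RMSE"
    assert rmse < baseline_rmse, "training did not improve over the baseline"
    return rmse
```

This still catches the two failure modes that matter most to users (the notebook crashing, or the model learning nothing), while being insensitive to the metric shifts that come from shrinking the dataset.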
Expected behavior with the suggested feature
Tests should fit within the resource constraints of a Standard_DS2_v2 SKU so they are runnable on such a machine.
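One way to meet that constraint without duplicating tests (a sketch; the variable name `TEST_MOVIELENS_SIZE` and the size strings are assumptions for illustration) is to pick the dataset size from the environment, so the same test runs with the small dataset on GitHub-hosted VMs and the large one on self-hosted nightly machines:

```python
import os

def dataset_size(default="100k"):
    # GitHub-hosted VMs (7 GB RAM) get the small dataset by default;
    # the self-hosted nightly pipeline would export TEST_MOVIELENS_SIZE=20m.
    return os.environ.get("TEST_MOVIELENS_SIZE", default)
```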
This has the following benefits:
CI is less dependent on self-hosted resources, so we can scale up better when we support more Python versions
Reduce build time significantly
Faster iteration
The devil's advocate:
Do we care how long our nightly build takes to run?
Are we comfortable with relaxing some of the assertions in the tests and risking the notebooks failing for users?
Is it even possible to run some of the notebooks e2e under 7 GB of memory (minus whatever is needed to keep the OS running)?
Other Comments
This method is a more complete way of testing ML pipelines than just checking that the code works: we also want to make sure that the ML models provide reasonable metrics. That's why we wanted to use a relatively big dataset in the tests.
the test configuration is very thorough, but I think we can restrict the smoke and integration tests to running less frequently (nightly, weekly?).
also, the notebook tests are not really unit tests, since they involve multiple components interacting with each other.
I think we should be aiming for unit tests that exercise individual components and don't use external data; this will make them very fast and less likely to fail if a website goes down. all the other types of tests can be moved to smoke/integration and run nightly.
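A minimal sketch of that split (the marker name `integration` is an assumption, not necessarily what the repo registers): tag the heavy notebook tests so the default CI run can deselect them with `pytest -m "not integration"`, leaving only the fast, data-free unit tests.

```python
import pytest

@pytest.mark.integration  # nightly only; deselected with: pytest -m "not integration"
def test_notebook_end_to_end():
    ...

def test_single_component():
    # pure unit test: no external data, runs in milliseconds
    assert 1 + 1 == 2

def has_marker(test_fn, name):
    # helper for this sketch: check which pipeline a test belongs to
    return any(m.name == name for m in getattr(test_fn, "pytestmark", []))
```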
Link to integration test failures on Github-hosted VM
Link to unit test failures on Github-hosted VM