
Improved test system to cover ActivitySim use cases


Work-in-progress

Purpose and Need

The purpose of this improvement to ActivitySim is to develop a solution that provides additional assurance that future updates to ActivitySim will continue to work for existing users and their use cases. This work is in response to Task 6, the prototype multiple models test system.

Examples

There are two types of examples:

  • Test examples - official examples maintained and tested by the ActivitySim project. The current test examples are mtc, estimation, and multizone.
  • Agency examples - agency partner model implementations registered with ActivitySim as examples. The current agency examples are psrc, arc, and semcog. Each agency example comes in two versions:
    • Cropped - a subset of households and zones for efficient, portable runs. This setup can really only exercise the software, since results from such a small sample are difficult to compare to observed/known answers. A sketch of a cropping script follows this list.
    • Full - the full scale data setup. This setup can be used to test the model itself, since model results can be compared to observed/known answers.
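To make the cropping idea concrete, here is a minimal sketch in Python, assuming pandas-readable land_use.csv, households.csv, and persons.csv inputs with TAZ and household_id columns; all file and column names here are illustrative assumptions, not the actual example schema:

    import os

    import pandas as pd

    # Keep only a small zone subset (the zone count here is arbitrary).
    KEEP_ZONES = set(range(1, 26))

    land_use = pd.read_csv("land_use.csv")
    households = pd.read_csv("households.csv")
    persons = pd.read_csv("persons.csv")

    # Crop zones, then households in those zones, then persons in those
    # households, so the cropped inputs stay internally consistent.
    land_use = land_use[land_use["TAZ"].isin(KEEP_ZONES)]
    households = households[households["TAZ"].isin(KEEP_ZONES)]
    persons = persons[persons["household_id"].isin(households["household_id"])]

    os.makedirs("cropped", exist_ok=True)
    land_use.to_csv("cropped/land_use.csv", index=False)
    households.to_csv("cropped/households.csv", index=False)
    persons.to_csv("cropped/persons.csv", index=False)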

Testing

The testing plans for test examples and agency examples differ:

  • Test examples test ActivitySim features, stability, components, etc. This test suite is run by our Travis CI system and is a central part of our software development process.
  • Agency examples include two simple tests:
    • Run the cropped version from start to finish to ensure it runs and that the results match the stored expected results (a regression test).
    • Run the full scale example and produce summary statistics of model results to validate the model. A good starting point for the summary statistics validation script is trips by mode and zone district, as sketched below.
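A minimal sketch of such a validation script, assuming the model writes a final_trips.csv with trip_mode and origin columns and that a zone-to-district lookup table is available; the file names, column names, and 5% tolerance are illustrative assumptions:

    import pandas as pd

    trips = pd.read_csv("output/final_trips.csv")
    zones = pd.read_csv("data/taz_district.csv")  # assumed columns: TAZ, district

    # Attach each trip's origin district, then count trips by mode and district.
    trips = trips.merge(zones, left_on="origin", right_on="TAZ", how="left")
    summary = trips.pivot_table(
        index="trip_mode", columns="district", aggfunc="size", fill_value=0
    )
    summary.to_csv("output/trips_by_mode_and_district.csv")

    # Compare against observed/known targets within a relative tolerance.
    targets = pd.read_csv("validation/targets.csv", index_col="trip_mode")
    summary.columns = summary.columns.astype(str)  # align labels with the CSV
    rel_diff = (summary - targets).abs() / targets.clip(lower=1)
    assert (rel_diff < 0.05).all().all(), "trips by mode and district off target"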

Storage

Both types of examples are stored in GitHub repositories for version control and collaborative maintenance. There are two storage locations:

  • The activitysim package example folder - this stores the example setup files, cropped data, regression test script, expected results, example cropping script, change log, etc.
  • The activitysim_resources repository - this stores just the full scale example data inputs, using Git LFS. This two-part solution keeps the main activitysim repo relatively lightweight while providing organized, accessible storage for the full scale example data. The example_manifest.yaml file maintains a dictionary of all the examples, how to get them, and how to run them; a sketch of reading it follows this list.
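For illustration, reading the manifest might look like the following sketch; the entry keys shown (name, description) are assumptions about the manifest structure rather than its authoritative schema:

    import yaml

    with open("example_manifest.yaml") as f:
        examples = yaml.safe_load(f)

    # List each registered example with its description.
    for example in examples:
        print(example["name"], "-", example.get("description", ""))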

Updates

When a new version of the code is pushed to develop:

  • The core test system is run, and code/examples are updated as needed to ensure the tests pass
  • If an agency example previously ran without future warnings (i.e. is up-to-date), then we will ensure it remains up-to-date
  • If an agency example previously threw future warnings (i.e. is not up-to-date), then we will not update it; a sketch of how the future-warning check could work follows this list
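A minimal sketch of the future-warning check in Python, where run_example is a hypothetical stand-in for launching the cropped model run:

    import warnings

    def runs_without_future_warnings(run_example) -> bool:
        """Run a callable and report whether it avoided FutureWarnings."""
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")  # capture every warning raised
            run_example()
        return not any(issubclass(w.category, FutureWarning) for w in caught)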

When an agency wants to update their example:

  • It is important to keep the agency examples up to date to minimize the cost/effort of updating to new versions of ActivitySim
  • Agencies have a window of time (roughly 3-6 months) to update their example through a pull request.
  • This pull request changes nothing outside their example folder.
  • The test/cropped example must run without warnings.
  • The full scale version is run elsewhere and must pass the validation script.

When an agency example includes new submodels and/or contributions to the core that need to be pulled/accepted:

  • The agency example must be up-to-date with the latest develop version of the code
  • The agency example must include a test/cropped example that implements the two tests and the tests must pass
  • The full scale version must be run elsewhere and must pass the validation script
  • The new submodels and/or contributions to the core will be reviewed by the repository manager (and it's likely some revisions will be required for acceptance)
  • Key items in the review include python code, documentation, and testable examples for all new components

ARC example in more detail

Running the System

The system is currently run by hand since it may involve getting and running several large examples, each of which can take many hours. The system could be fully automated and run either in the cloud (on AWS, for example) or on a local server (a bench contractor's server, for example). A sketch of what an automated runner might look like follows.
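The runner below iterates over the registered examples and records pass/fail for each; it assumes the activitysim create and run command line subcommands and a manifest of name entries, so treat the flags and paths as assumptions:

    import subprocess

    import yaml

    def run_all(manifest_path="example_manifest.yaml"):
        with open(manifest_path) as f:
            examples = yaml.safe_load(f)
        results = {}
        for example in examples:
            name = example["name"]
            try:
                # Fetch the example's setup files and data, then run it
                # end to end via the command line interface.
                subprocess.run(
                    ["activitysim", "create", "-e", name, "-d", name],
                    check=True,
                )
                subprocess.run(
                    ["activitysim", "run", "-c", f"{name}/configs",
                     "-d", f"{name}/data", "-o", f"{name}/output"],
                    check=True,
                )
                results[name] = "pass"
            except subprocess.CalledProcessError as err:
                results[name] = f"fail: {err}"
        return results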

System Costs

There are non-trivial costs associated with multiple aspects of developing and supporting agency examples:

  • Computing time and persistent storage costs
  • Labor costs to develop the automated system
  • Labor costs to manually run the system until an automated version has been deployed

How should support for agency examples be paid for? Some options are:

  • Included with ActivitySim membership
  • An additional optional fee beyond ActivitySim membership
  • A third-party vendor supplies the service