Thoughts on Continuous Integration

Javier Gonzalez edited this page Jan 10, 2020 · 11 revisions

The "pedestrian" way of doing continuous integration (CI) would be to have cron jobs on several machines that continuously check the repositories (or handle a signal from a GitHub action) to trigger the tests and builds of individual conda packages, build the documentation, and copy the resulting files to a central repository. However, there are multiple tools to make our lives easier.
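
For reference, the polling part of the cron-based approach could look roughly like the sketch below. It assumes `git` is on the path and that the last-seen commit hashes are stored somewhere between runs; all function names are made up for illustration. It only decides which repositories changed and leaves the build itself out.

```python
import subprocess


def parse_ls_remote(output: str) -> str:
    """Extract the first commit hash from `git ls-remote` output."""
    # Each output line has the form "<sha>\t<ref>".
    for line in output.splitlines():
        sha, _, _ref = line.partition("\t")
        if sha:
            return sha
    raise ValueError("no refs found in ls-remote output")


def remote_head(repo_url: str, ref: str = "HEAD") -> str:
    """Return the commit hash that `ref` points to on the remote."""
    result = subprocess.run(
        ["git", "ls-remote", repo_url, ref],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_ls_remote(result.stdout)


def repos_to_build(last_seen: dict, current: dict) -> list:
    """Return the names of repositories whose head changed since the last poll."""
    # A repository that was never seen before also counts as changed.
    return sorted(
        name for name, sha in current.items() if last_seen.get(name) != sha
    )
```

A cron job would call `remote_head` for each repository, pass the results to `repos_to_build`, and trigger a build for whatever comes back. Logging, retries and dashboards are all still missing, which is the maintenance burden the tools below try to remove.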

Using GitHub

What I am trying to do is to offload as much of that work as possible to GitHub. I started trying to use GitHub Actions and Docker for continuous integration. The proof of concept is here: https://github.com/sot/ska_pkg_builder_action. This uses a CentOS 5 Docker image for building a specific ska3 Conda package. It uses the Conda channels in http://cxc.cfa.harvard.edu/mta/ASPECT/ska3-conda-test

The current version has the following issues:

  • The GitHub action needs to be public to access multiple repositories
  • It currently builds one specific package, but it should build just the current one
  • It should use a GitHub token for authentication (need to make sure this works in all our cases)
  • I don't know yet how the build would work on macOS
  • After building, the package needs to be placed somewhere. Where? (The artifact mechanism is useless: https://github.community/t5/GitHub-Actions/Delete-artifacts/td-p/38188, https://github.com/actions/download-artifact/issues/3)
  • And how does testing proceed after that?
  • The same setup could be used for testing, but then one needs a full ska3-flight image, which is ~4GB in size.
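
On the token point in the list above, here is a minimal sketch of what a token-authenticated call to the GitHub REST API looks like, using only the standard library. It assumes the token reaches the script through some variable supplied by the workflow. The function names are hypothetical, and whether such a token has enough scope for all our repositories is exactly the open question.

```python
import json
import urllib.request


def github_request(url: str, token: str) -> urllib.request.Request:
    """Build a GitHub REST API request authenticated with a token."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"token {token}",
            "Accept": "application/vnd.github.v3+json",
        },
    )


def fetch_repo_info(owner: str, repo: str, token: str) -> dict:
    """Fetch repository metadata; fails with an HTTP error if the token lacks access."""
    request = github_request(f"https://api.github.com/repos/{owner}/{repo}", token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `fetch_repo_info("sot", "ska_pkg_builder_action", token)` against each repository we care about would be a quick way to check that a given token works in all our cases.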

Using Google Drive

One of our issues is how to share the output of GitHub Actions (or any automated build) with the workers downstream. As mentioned above, GitHub Actions artifacts are pretty useless right now. One option is to use Google Drive, in which case the main issue is authentication. There are two options:

To do the latter, I did the following:
  • created a project called "cxc-ska3-ci" on Google Cloud Platform.
  • created a service account "ska-builder" within the project
  • created a folder within my own Google Drive
  • shared the folder with the service account. NOTE: I created the project within the "cfa.harvard.edu" organization, and used "cfa.harvard.edu" as the location. Still, when I shared the folder with the service account, I got the following message:
You are sharing to ska-builder@cxc-ska3-ci.iam.gserviceaccount.com who is not in the G Suite organization that this item belongs to
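
With the folder shared, uploading a built package from a worker could look roughly like the sketch below. It assumes the `google-auth` and `google-api-python-client` packages are installed and that a JSON key for the service account has been downloaded. The key file path, the folder ID and the function names are placeholders, and I have not checked how the organization warning above affects this.

```python
import os


def drive_file_metadata(filename: str, folder_id: str) -> dict:
    """Metadata for a Drive v3 files.create call that places the file in a folder."""
    return {"name": filename, "parents": [folder_id]}


def upload_package(path: str, folder_id: str, key_file: str) -> str:
    """Upload `path` to the shared folder and return the ID of the new Drive file."""
    # Imported here so the module still loads where the Google client
    # libraries are not installed.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build
    from googleapiclient.http import MediaFileUpload

    credentials = service_account.Credentials.from_service_account_file(
        key_file, scopes=["https://www.googleapis.com/auth/drive"]
    )
    service = build("drive", "v3", credentials=credentials)
    created = (
        service.files()
        .create(
            body=drive_file_metadata(os.path.basename(path), folder_id),
            media_body=MediaFileUpload(path, resumable=True),
            fields="id",
        )
        .execute()
    )
    return created["id"]
```

The folder ID is the last component of the folder's URL in the Drive web interface. Downstream workers would need the same key file (or another account with access to the folder) to fetch the package.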

Options for Continuous Integration

Custom
  Pros:
  • on-premises
  Cons:
  • Maintenance is involved (multiple crontabs, test platforms/nodes, etc.)
  • Need to regularly query GitHub for news, establish a workflow, logging, dashboards, etc.
    (basically reinvent Jenkins)
Buildbot
  Pros:
  • on-premises
  • waterfall dashboard
  Cons:
  • Maintenance: need to maintain server and workers
Jenkins
  Pros:
  • on-premises
  • Easily configurable and customizable
  • workers for arbitrary platforms possible
  Cons:
  • Need to maintain server and workers
  • server and workers need Java
GitHub + custom
  Pros:
  • some on-premises (unit test, build, docs)
  • some hosted (integ. test)
  • docs included
  Cons:
  • Need to define how GitHub interacts with the on-premises solution
  • no dashboard
GitHub self-hosted
  Pros:
  • Integrated with GitHub
GitHub
  Pros:
  • Simpler, almost no maintenance
  • docs included
  Cons:
  • cost (anything above 2 GB costs money)
  • skare3 builder image is 1.7 GB, full ska3-flight is 4 GB
  • not sure how it will be on macOS/Windows