
RFC: tracking array API test suite compliance #462

Open
kgryte opened this issue Jul 7, 2022 · 1 comment


kgryte commented Jul 7, 2022

This RFC is a follow-on proposal to #402 and seeks to propose a process for tracking array API test suite compliance.

Overview

Similar to the motivations for #402, consumers of array libraries currently lack a centralized mechanism for determining whether any given array library passes the array API test suite.

While #402 seeks to statically record which libraries implement which APIs starting in which version, this proposal seeks to track behavioral compliance vis-à-vis the test suite, and to do so over time.

And similar to #402, the end goal is to provide a mechanism for publicly displaying test suite compliance so that users have a centralized resource for determining which libraries satisfy behavioral requirements as specified by the array API specification.

Proposal

This RFC proposes the following workflow:

  1. Configure CI: individual projects should configure their CI with the necessary secrets/metadata for reporting their test results.

    • url: server endpoint for reporting test results.
    • secret: unique secret provided to each array library which must accompany each test report and will be used to validate test payloads.
  2. Test report generation: individual array libraries must run the array API test suite and use pytest-json-report to generate a JSON file containing test results (ref: JSON reporting array-api-tests#131).

    • Recommendation: run the test suite and generate the JSON output for each release. While array libraries are encouraged to continually test against the array API test suite, explicitly generating the JSON output and sending it to the server endpoint on every commit and/or nightly is unnecessary.

      ```mermaid
      graph LR;
          A[Release] --> B[Test suite];
          B -.-> C[JSON report];
      ```
  3. Post Results: upon generating a JSON report, individual array libraries must POST the test results to a server hosted under data-apis.org (e.g., dev.data-apis.org/test/results).

    • Similar to GitHub webhooks, array libraries must generate a hash signature using the unique secret provided to that library. This signature will be used to verify the request payload once received.

    • The following header fields must be set:

      • X-Hub-Signature: hash signature.

        For array libraries using GitHub actions, see the Workflow Webhook Action for an action which automatically computes the X-Hub-Signature header field.

    • The JSON object must have the following fields:

      • schema: schema version for the test results object.
      • name: library name (e.g., numpy).
      • version: library version (e.g., v1.22.1).
      • data: test results.
      • platform: host platform (e.g., macOS-12.4-x86_64-i386-64bit).
      • python: Python version (e.g., v3.9.9).
      • timestamp: ISO 8601 date string for library release/commit (e.g., 2022-06-09T12:47:28Z).
      • test_suite: commit SHA for the Array API test suite.
  4. Server Endpoint: the server receiving test results must:

    • Validate: validate request payloads according to the unique secret for the array library associated with the request payload.
    • Persist: store the results in a database.
    • Confirm: confirm receipt of test results and report any errors.

    If the server endpoint already has test results for a particular version, the server endpoint should discard the previous results and only keep the latest results.

    As array library releases are intermittent and relatively infrequent, the server should not experience high load or require significant compute. Accordingly, the server can be readily deployed to Digital Ocean, Linode, or some other hosting service for minimal monthly expense (~$10/month).

  5. Post-processing: test results must be regularly processed and transformed into a form suitable for public consumption.

    • The main goal is to have a post-processing step to filter and transform the "raw" test results into a high-level summary which can be consumed by a web application displaying the results.

    • Processed results could potentially be uploaded to a public repository, if this is deemed useful.

    As test data should be relatively limited, a nightly cron job running on a minimal compute node should be sufficient for processing "raw" test results.

  6. Public Consumption: summarized test results will be displayed on a web page accessible from the array API specification.
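
To make the payload fields concrete, the sketch below assembles an example request body in Python. Every value is an illustrative placeholder (the commit SHA and results data in particular are stand-ins, not real output):

```python
import json

# Illustrative request payload; all values below are placeholders.
payload = {
    "schema": "v1",                                # test results schema version
    "name": "numpy",                               # library name
    "version": "v1.22.1",                          # library version
    "platform": "macOS-12.4-x86_64-i386-64bit",    # host platform
    "python": "v3.9.9",                            # Python version
    "timestamp": "2022-06-09T12:47:28Z",           # ISO 8601 release timestamp
    "test_suite": "<array-api-tests commit SHA>",  # test suite revision
    "data": {},                                    # pytest-json-report output goes here
}

# Serialize to the JSON body which is signed and POSTed:
body = json.dumps(payload)
print(sorted(payload.keys()))
```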

Schematically, the workflow is as follows:

```mermaid
graph LR;
    A[Release] --> B[Test suite];
    B --> D[Post to server];
    D --> E[Persist];
    E --> F[Post-process];
    F --> G[Web application];
```
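
As a sketch of the post-processing step (5), the function below reduces a pytest-json-report payload to a small summary suitable for a web application. It assumes the report's top-level `summary` field, which pytest-json-report emits with per-outcome counts; the output field names here are assumptions, not a settled schema:

```python
def summarize(report: dict) -> dict:
    """Reduce a pytest-json-report payload to a compliance summary."""
    summary = report.get("summary", {})
    total = summary.get("total", 0)
    passed = summary.get("passed", 0)
    return {
        "passed": passed,
        "failed": summary.get("failed", 0),
        "total": total,
        # Fraction of the suite passing, as a headline compliance number:
        "compliance": round(passed / total, 3) if total else 0.0,
    }

# Illustrative report fragment (numbers are made up):
report = {"summary": {"passed": 950, "failed": 50, "total": 1000}}
print(summarize(report))  # {'passed': 950, 'failed': 50, 'total': 1000, 'compliance': 0.95}
```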

Example Workflow

An example GitHub workflow which generates a test suite report for NumPy may be found below.

name: test_suite_reporting

on:
  release:
    types: [published]

jobs:
  test:

    runs-on: ubuntu-latest

    env:
      NUMPY_VERSION: '1.22.1'

    strategy:
      matrix:
        python-version: [3.8, 3.9]

    steps:

      - name: 'Checkout test suite'
        uses: actions/checkout@v1
        with:
          submodules: 'true'

      - name: 'Setup Python ${{ matrix.python-version }}'
        uses: actions/setup-python@v1
        with:
          python-version: ${{ matrix.python-version }}

      - name: 'Install dependencies'
        run: |
          python -m pip install --upgrade pip
          python -m pip install numpy==${{ env.NUMPY_VERSION }}
          python -m pip install pytest-json-report
          python -m pip install -r requirements.txt

      - name: 'Run test suite'
        env:
          ARRAY_API_TESTS_MODULE: numpy.array_api
        run: |
          pytest -v -rxXfE --ci --json-report

      - name: 'Load results'
        id: load-results
        run: |
          results=`cat ./.report.json`
          echo "::set-output name=results::$results"

      - name: 'Post data'
        uses: distributhor/workflow-webhook@v2
        env:
          webhook_url: ${{ secrets.DATA_APIS_WEBHOOK_URL }}
          webhook_secret: ${{ secrets.DATA_APIS_WEBHOOK_SECRET }}
          data: '{ "schema": "v1", "name": "numpy", "version": "v${{ env.NUMPY_VERSION }}", "timestamp": "${{ github.event.release.created_at }}", "platform": "macOS-12.4-x86_64-i386-64bit", "python": "v${{ matrix.python-version }}", "data": ${{ steps.load-results.outputs.results }}, "test_suite": "..." }'

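The RFC models the signature on GitHub webhooks. Assuming a GitHub-style HMAC-SHA1 scheme (the exact algorithm is not fixed by this proposal), the client-side signing performed by the Workflow Webhook Action and the server-side validation in step 4 might look like the following sketch (the secret value is hypothetical):

```python
import hashlib
import hmac

def sign(secret: bytes, body: bytes) -> str:
    """Compute an X-Hub-Signature value over the raw request body."""
    return "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()

def verify(secret: bytes, signature: str, body: bytes) -> bool:
    """Server side: recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign(secret, body), signature)

secret = b"example-per-library-secret"  # hypothetical secret issued to a library
body = b'{"schema": "v1", "name": "numpy"}'

sig = sign(secret, body)
print(verify(secret, sig, body))              # True
print(verify(secret, "sha1=deadbeef", body))  # False
```

Using `hmac.compare_digest` rather than `==` avoids leaking signature prefixes through timing differences on the server.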
Considerations

  • Why do the individual array libraries need to be involved at all? Can't the consortium set up, maintain, and run CI jobs for tracking test suite compliance?

    • Individual array libraries should be responsible for generating and reporting test suite results because (1) the consortium cannot reasonably be expected to maintain and host CI infrastructure for all current and future array libraries and (2) many individual array libraries are likely to require specialized infrastructure, which they, and they alone, know how to configure, manage, and maintain. Accordingly, individual array libraries are best equipped to generate test results.

      As can be observed in the example workflow above, generating and sending a JSON report imposes minimal requirements on individual array libraries which should already be testing against the array API test suite. This RFC proposes that array libraries (1) invoke pytest with a --json-report flag and (2) POST results to an external endpoint.

Questions

  • Is the workflow reasonable?
  • Should the request payload include additional data? E.g., for libraries capable of running on both CPUs and GPUs, the processor type?
@kgryte kgryte added RFC Request for comments. Feature requests and proposed changes. Deployment Specification deployment (e.g., to a website). labels Jul 7, 2022
@kgryte kgryte added this to the v2022 milestone Jul 7, 2022
leofang commented Jul 7, 2022

cc: @kmaehashi @emcastillo for vis
