
RFC: tracking array API test suite compliance #462

Open
kgryte opened this issue Jul 7, 2022 · 1 comment


kgryte commented Jul 7, 2022

This RFC is a follow-on proposal to #402 and seeks to propose a process for tracking array API test suite compliance.

Overview

Similar to the motivations for #402, consumers of array libraries currently lack a centralized mechanism for determining whether any given array library passes the array API test suite.

While #402 seeks to statically record which libraries implement which APIs starting in which version, this proposal seeks to track behavioral compliance vis-à-vis the test suite, and to do so over time.

And similar to #402, the end goal is to provide a mechanism for publicly displaying test suite compliance so that users have a centralized resource for determining which libraries satisfy behavioral requirements as specified by the array API specification.

Proposal

This RFC proposes the following workflow:

  1. Configure CI: individual projects should configure their CI with the necessary secrets/metadata for reporting their test results.

    • url: server endpoint for reporting test results.
    • secret: unique secret provided to each array library which must accompany each test report and will be used to validate test payloads.
  2. Test report generation: individual array libraries must run the array API test suite and use pytest-json-report to generate a JSON file containing test results (ref: JSON reporting array-api-tests#131).

    • Recommendation: run the test suite and generate the JSON output for each release. While array libraries are encouraged to continually test against the array API test suite, explicitly generating the JSON output and sending it to the server endpoint on every commit and/or nightly is unnecessary.

      ```mermaid
      graph LR;
          A[Release] --> B[Test suite];
          B -.-> C[JSON report];
      ```
  3. Post Results: upon generating a JSON report, individual array libraries must POST the test results to a server hosted under data-apis.org (e.g., dev.data-apis.org/test/results).

    • Similar to GitHub webhooks, array libraries must generate a hash signature using the unique secret provided to that library. This signature will be used to verify the request payload once received.

    • The following header fields must be set:

      • X-Hub-Signature: hash signature.

        For array libraries using GitHub actions, see the Workflow Webhook Action for an action which automatically computes the X-Hub-Signature header field.

    • The JSON object must have the following fields:

      • schema: schema version for the test results object.
      • name: library name (e.g., numpy).
      • version: library version (e.g., v1.22.1).
      • data: test results.
      • platform: host platform (e.g., macOS-12.4-x86_64-i386-64bit).
      • python: Python version (e.g., v3.9.9).
      • timestamp: ISO 8601 date string for library release/commit (e.g., 2022-06-09T12:47:28Z).
      • test_suite: commit SHA for the Array API test suite.
  4. Server Endpoint: the server receiving test results must:

    • Validate: validate request payloads according to the unique secret for the array library associated with the request payload.
    • Persist: store the results in a database.
    • Confirm: confirm receipt of test results and report any errors.

    If the server endpoint already has test results for a particular version, the server endpoint should discard the previous results and only keep the latest results.

    As array library releases are intermittent and relatively infrequent, the server should not experience high load or require significant compute. Accordingly, the server can be readily deployed to Digital Ocean, Linode, or some other hosting service for minimal monthly expense (~$10/month).

  5. Post-processing: test results must be regularly processed and transformed into a form suitable for public consumption.

    • The main goal is to have a post-processing step to filter and transform the "raw" test results into a high-level summary which can be consumed by a web application displaying the results.

    • Processed results could potentially be uploaded to a public repository, if this is deemed useful.

    As test data should be relatively limited, a nightly cron job running on a minimal compute node should be sufficient for processing "raw" test results.

  6. Public Consumption: summarized test results will be displayed on a web page accessible from the array API specification.
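
To make the payload fields concrete, the sketch below assembles an example request body in Python. Every value is an illustrative placeholder (the commit SHA and results data in particular are stand-ins, not real output):

```python
import json

# Illustrative request payload; all values below are placeholders.
payload = {
    "schema": "v1",                                # test results schema version
    "name": "numpy",                               # library name
    "version": "v1.22.1",                          # library version
    "platform": "macOS-12.4-x86_64-i386-64bit",    # host platform
    "python": "v3.9.9",                            # Python version
    "timestamp": "2022-06-09T12:47:28Z",           # ISO 8601 release timestamp
    "test_suite": "<array-api-tests commit SHA>",  # test suite revision
    "data": {},                                    # pytest-json-report output goes here
}

# Serialize to the JSON body which is signed and POSTed:
body = json.dumps(payload)
print(sorted(payload.keys()))
```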

Schematically, the workflow is as follows:

```mermaid
graph LR;
    A[Release] --> B[Test suite];
    B --> D[Post to server];
    D --> E[Persist];
    E --> F[Post-process];
    F --> G[Web application];
```
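
As a sketch of the post-processing step (5), the function below reduces a pytest-json-report payload to a small summary suitable for a web application. It assumes the report's top-level `summary` field, which pytest-json-report emits with per-outcome counts; the output field names here are assumptions, not a settled schema:

```python
def summarize(report: dict) -> dict:
    """Reduce a pytest-json-report payload to a compliance summary."""
    summary = report.get("summary", {})
    total = summary.get("total", 0)
    passed = summary.get("passed", 0)
    return {
        "passed": passed,
        "failed": summary.get("failed", 0),
        "total": total,
        # Fraction of the suite passing, as a headline compliance number:
        "compliance": round(passed / total, 3) if total else 0.0,
    }

# Illustrative report fragment (numbers are made up):
report = {"summary": {"passed": 950, "failed": 50, "total": 1000}}
print(summarize(report))  # {'passed': 950, 'failed': 50, 'total': 1000, 'compliance': 0.95}
```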

Example Workflow

An example GitHub workflow which generates a test suite report for NumPy may be found below.

name: test_suite_reporting

on:
  release:
    types: [published]

jobs:
  test:

    runs-on: ubuntu-latest

    env:
      NUMPY_VERSION: '1.22.1'

    strategy:
      matrix:
        python-version: [3.8, 3.9]

    steps:

      - name: 'Checkout test suite'
        uses: actions/checkout@v1
        with:
          submodules: 'true'

      - name: 'Setup Python ${{ matrix.python-version }}'
        uses: actions/setup-python@v1
        with:
          python-version: ${{ matrix.python-version }}

      - name: 'Install dependencies'
        run: |
          python -m pip install --upgrade pip
          python -m pip install numpy==${{ env.NUMPY_VERSION }}
          python -m pip install pytest-json-report
          python -m pip install -r requirements.txt

      - name: 'Run test suite'
        env:
          ARRAY_API_TESTS_MODULE: numpy.array_api
        run: |
          pytest -v -rxXfE --ci --json-report

      - name: 'Load results'
        id: load-results
        run: |
          results=`cat ./.report.json`
          echo "::set-output name=results::$results"

      - name: 'Post data'
        uses: distributhor/workflow-webhook@v2
        env:
          webhook_url: ${{ secrets.DATA_APIS_WEBHOOK_URL }}
          webhook_secret: ${{ secrets.DATA_APIS_WEBHOOK_SECRET }}
          data: '{ "schema": "v1", "name": "numpy", "version": "v${{ env.NUMPY_VERSION }}", "timestamp": "${{ github.event.release.created_at }}", "platform": "macOS-12.4-x86_64-i386-64bit", "python": "v${{ matrix.python-version }}", "data": ${{ steps.load-results.outputs.results }}, "test_suite": "..." }'

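The RFC models the signature on GitHub webhooks. Assuming a GitHub-style HMAC-SHA1 scheme (the exact algorithm is not fixed by this proposal), the client-side signing performed by the Workflow Webhook Action and the server-side validation in step 4 might look like the following sketch (the secret value is hypothetical):

```python
import hashlib
import hmac

def sign(secret: bytes, body: bytes) -> str:
    """Compute an X-Hub-Signature value over the raw request body."""
    return "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()

def verify(secret: bytes, signature: str, body: bytes) -> bool:
    """Server side: recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign(secret, body), signature)

secret = b"example-per-library-secret"  # hypothetical secret issued to a library
body = b'{"schema": "v1", "name": "numpy"}'

sig = sign(secret, body)
print(verify(secret, sig, body))              # True
print(verify(secret, "sha1=deadbeef", body))  # False
```

Using `hmac.compare_digest` rather than `==` avoids leaking signature prefixes through timing differences on the server.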
Considerations

  • Why do the individual array libraries need to be involved at all? Can't the consortium set up, maintain, and run CI jobs for tracking test suite compliance?

    • Individual array libraries should be responsible for generating and reporting test suite results because (1) the consortium cannot reasonably be expected to maintain and host CI infrastructure for all current and future array libraries and (2) many individual array libraries are likely to require specialized infrastructure, which they, and they alone, know how to configure, manage, and maintain. Accordingly, individual array libraries are best equipped to generate test results.

      As can be observed in the example workflow above, generating and sending a JSON report imposes minimal requirements on individual array libraries which should already be testing against the array API test suite. This RFC proposes that array libraries (1) invoke pytest with a --json-report flag and (2) POST results to an external endpoint.

Questions

  • Is the workflow reasonable?
  • Should the request payload include additional data? E.g., for libraries capable of running on both CPUs and GPUs, the processor type?
@kgryte kgryte added RFC Request for comments. Feature requests and proposed changes. Deployment Specification deployment (e.g., to a website). labels Jul 7, 2022
@kgryte kgryte added this to the v2022 milestone Jul 7, 2022
leofang commented Jul 7, 2022

cc: @kmaehashi @emcastillo for vis
