Test results data structure design #10
Yes, currently the larger set can be recreated from the sets contained in the individual tests.
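As a rough illustration of that point, the plan-level log could be derived on demand from the per-test logs. This is only a sketch: `collectPlanLog` is a hypothetical helper name, the sample file paths and messages are invented, and log entries are treated as opaque values because the `AriaATCIData.Log` type is not shown in this thread. It also assumes no messages are emitted outside of an individual test.

```javascript
/**
 * Hypothetical helper (not part of the project): rebuild a plan-level log
 * by concatenating the per-test logs in test order.
 * @param {{tests: {log: unknown[]}[]}} planResult
 * @returns {unknown[]}
 */
function collectPlanLog(planResult) {
  return planResult.tests.flatMap((test) => test.log);
}

// Illustrative data only; real log entries are AriaATCIData.Log values.
const examplePlan = {
  name: 'unknown',
  tests: [
    { filepath: 'tests/test-01.html', log: ['opened page', 'sent command'] },
    { filepath: 'tests/test-02.html', log: ['opened page'] },
  ],
};

console.log(collectPlanLog(examplePlan));
```

If a consumer still wants the combined view, it can call a helper like this rather than relying on the report to carry both copies.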
Agreed. This reminds me of an omission in #6. The
I imagine this will initially be a configured property. Inference will be nice to reach, but it'll be less work to just instruct the host what is being used. Maybe it should be a field reported on test plans that is partly or wholly configurable, and inference could be done by another tool as part of the image.
/**
* Result from a single test in a test plan.
* @typedef AriaATCIData.TestResult
* @property {number} testId numeric id of a test in a test plan
+ * @property {object[]} commands input commands and the speech emitted
+ * @property {string} commands[].command id of input command sent to system
+ * @property {string} [commands[].output] speech emitted
+ * @property {string[]} [commands[].errors] errors that occurred during command
* @property {object[]} results permutation of input commands and assertions passing or not passing
* @property {string} results[].command id of input command sent to system
* @property {string} results[].expectation description of expected assertion
* @property {boolean} results[].pass did command pass or not pass expectation
*/
What do you think about removing the larger set?
Is it accurate to say that this information is ancillary to the test results? Is there any record of the design process which motivated this feature?
Additionally, the design would be more intuitive if it had stronger differentiation in terminology. The word "result" is currently used to describe three distinct concepts:
In this domain, a "test plan" is a collection of "tests," so maybe it works for "test plan results" to contain multiple "test results." The third use of the term is harder to understand, though. @mzgoddard why is the test ID a property of the TestResult (i.e.
/**
* Result from a test plan.
* @typedef AriaATCIData.TestPlanResult
* @property {string} name name of the test plan, defaults to 'unknown'
* @property {AriaATCIData.Log[]} log debug messages emitted during execution of test plan
* @property {object[]} tests
* @property {string} tests[].filepath filepath of file describing the test in the test plan
+ * @property {number} tests[].id numeric id of a test in a test plan
* @property {AriaATCIData.Log[]} tests[].log subset of log emitted during this single test
* @property {AriaATCIData.TestResult[]} tests[].results
*/
/**
* Result from a single test in a test plan.
* @typedef AriaATCIData.TestResult
- * @property {number} testId numeric id of a test in a test plan
* @property {object[]} results permutation of input commands and assertions passing or not passing
* @property {string} results[].command id of input command sent to system
* @property {string} results[].expectation description of expected assertion
* @property {boolean} results[].pass did command pass or not pass expectation
*/
Then the "permutation[s] of input commands and assertions passing or not passing" could be consolidated into a single array, like this:
/**
* Result from a test plan.
* @typedef AriaATCIData.TestPlanResult
* @property {string} name name of the test plan, defaults to 'unknown'
* @property {AriaATCIData.Log[]} log debug messages emitted during execution of test plan
* @property {object[]} tests
* @property {string} tests[].filepath filepath of file describing the test in the test plan
* @property {number} tests[].id numeric id of a test in a test plan
* @property {AriaATCIData.Log[]} tests[].log subset of log emitted during this single test
* @property {AriaATCIData.TestResult[]} tests[].results
*/
/**
* Result from a single test in a test plan.
* @typedef AriaATCIData.TestResult
- * @property {object[]} results permutation of input commands and assertions passing or not passing
- * @property {string} results[].command id of input command sent to system
+ * @property {string} command id of input command sent to system
- * @property {string} results[].expectation description of expected assertion
+ * @property {string} expectation description of expected assertion
- * @property {boolean} results[].pass did command pass or not pass expectation
+ * @property {boolean} pass did command pass or not pass expectation
*/
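To make the consolidation concrete, here is a minimal sketch of how a test entry in the nested shape might be flattened into the single-array shape proposed above. `consolidateResults` is a hypothetical name, and the command ids and expectation strings in the sample data are invented for illustration.

```javascript
/**
 * Hypothetical helper: flatten tests[].results[].results[] into a single
 * tests[].results[] array of {command, expectation, pass} objects.
 * @param {{results: {results: object[]}[]}} test entry in the nested shape
 * @returns {object} copy of the entry with a flat `results` array
 */
function consolidateResults(test) {
  return {
    ...test,
    results: test.results.flatMap((testResult) => testResult.results),
  };
}

// Illustrative data only; ids and descriptions are made up.
const nestedTest = {
  filepath: 'tests/test-01.html',
  id: 1,
  log: [],
  results: [
    {
      results: [
        { command: 'UP_ARROW', expectation: 'role is conveyed', pass: true },
        { command: 'DOWN_ARROW', expectation: 'role is conveyed', pass: false },
      ],
    },
  ],
};

console.log(consolidateResults(nestedTest).results);
```

The flat form removes one of the three meanings of "result" because each array element is directly a command/expectation/pass record.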
@jugglinmike since the time of this issue's posting, the TestResult type has changed somewhat significantly. Here's the latest:
One thing I'm noticing from the current version of the output is that the data stored in the TestResult.results array is not incredibly valuable, since we're not actually using the "pass" property. I'm wondering what you think of this option for rewriting this data structure, so that the property names are a little bit more meaningful?
/**
* Result from a single test in a test plan.
* @typedef AriaATCIData.TestResult
* @property {number} testId numeric id of a test in a test plan
* @property {object[]} commands input commands and the speech emitted
* @property {string} commands[].command id of input command sent to system
* @property {string} [commands[].output] speech emitted
* @property {string[]} [commands[].errors] errors that occurred during command
* @property {Record<string, string>} capabilities Information about the system under test
- * @property {object[]} results permutation of input commands and assertions passing or not passing
- * @property {string} results[].command id of input command sent to system
- * @property {string} results[].expectation description of expected assertion
- * @property {boolean} results[].pass did command pass or not pass expectation
+ * @property {object[]} assertions[] permutation of input commands and assertions
+ * @property {string} [assertions[].command] id of input command sent to system
+ * @property {string} [assertions[].expectation] description of expected assertion
*/
Or maybe the assertion expectations can just be nested under each
/**
* Result from a single test in a test plan.
* @typedef AriaATCIData.TestResult
* @property {number} testId numeric id of a test in a test plan
* @property {object[]} commands input commands and the speech emitted
* @property {string} commands[].command id of input command sent to system
* @property {string} [commands[].output] speech emitted
* @property {string[]} [commands[].errors] errors that occurred during command
+ * @property {string[]} commands[].expectations description of expected assertion
* @property {Record<string, string>} capabilities Information about the system under test
- * @property {object[]} results permutation of input commands and assertions passing or not passing
- * @property {string} results[].command id of input command sent to system
- * @property {string} results[].expectation description of expected assertion
- * @property {boolean} results[].pass did command pass or not pass expectation
*/
Here's some sample output with the last proposal for bundling everything under
Thanks for bringing this up, @ChrisC. The design definitely warrants more thought. Two things come to mind: how we describe the observed behavior and how we structure the assertions.
Observed behavior
We moved away from describing AT behavior as "output" a while back. We started using the more generic term "response" in the interest of making the system amenable to supporting assistive tech beyond screen readers. That's why we should probably change the property name.
Honestly, that modification alone doesn't sufficiently honor the spirit of the change in terminology. If we only change the property name, we'll still be presuming that every AT response can be encoded as a single string. I can imagine designs which might address this (e.g.
Assertion structure
While it's true that we don't use the "pass" property now, I've recently proposed an extension which would make it meaningful. Or something like it, anyway. Today, the Community Group and the app refer to this as a "verdict." We don't necessarily have to name it in the current design, but I would like to reserve space for it in order to limit the churn from a future extension. Nesting under "commands" seems good to me; maybe we can just use an object.
New proposal
Building on your final design:
/**
* Result from a single test in a test plan.
* @typedef AriaATCIData.TestResult
* @property {number} testId numeric id of a test in a test plan
* @property {object[]} commands input commands and the speech emitted
* @property {string} commands[].command id of input command sent to system
- * @property {string} [commands[].output] speech emitted
+ * @property {string} [commands[].response] speech emitted
* @property {string[]} [commands[].errors] errors that occurred during command
- * @property {string[]} commands[].expectations description of expected assertion
+ * @property {object[]} commands[].assertions
+ * @property {string} commands[].assertions[].expectation
+ * @property {"pass"|"fail"|null} commands[].assertions[].verdict
* @property {Record<string, string>} capabilities Information about the system under test
- * @property {object[]} results permutation of input commands and assertions passing or not passing
*/
I can take or leave the "verdict" property for now; we'd just always set it to
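For reference, a sketch of how a result in the earlier `commands[].expectations` shape might be mapped onto this proposal. Everything here is illustrative: `toProposedShape` is a hypothetical name, the sample command id, speech string, and capability key are invented, and filling `verdict` with `null` is an assumption based on `null` being the only non-verdict value the proposed type allows.

```javascript
/**
 * Hypothetical converter from the `expectations: string[]` design to the
 * proposed `assertions: {expectation, verdict}[]` design. It also renames
 * `output` to `response`, following the change in terminology.
 * @param {object} testResult result in the earlier shape
 * @returns {object} result in the proposed shape
 */
function toProposedShape(testResult) {
  return {
    testId: testResult.testId,
    capabilities: testResult.capabilities,
    commands: testResult.commands.map(({ command, output, errors, expectations }) => {
      const converted = { command };
      // Optional properties are carried over only when present.
      if (output !== undefined) converted.response = output;
      if (errors !== undefined) converted.errors = errors;
      converted.assertions = expectations.map((expectation) => ({
        expectation,
        verdict: null, // assumption: no verdict has been recorded yet
      }));
      return converted;
    }),
  };
}

// Illustrative data only.
const exampleResult = {
  testId: 1,
  capabilities: { browserName: 'firefox' },
  commands: [
    {
      command: 'UP_ARROW',
      output: 'checkbox, not checked',
      expectations: ['role is conveyed', 'state is conveyed'],
    },
  ],
};

console.log(JSON.stringify(toProposedShape(exampleResult), null, 2));
```

Keeping each assertion as an object means a later extension can populate `verdict` with "pass" or "fail" without changing the surrounding structure.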
All this makes sense to me, @jugglinmike! I'll go with this new proposal and will go ahead and include the 'verdict' property and set it to
This project currently emits data collected from executing tests as JSON-formatted text. I'd like to discuss improvements to the structure and content of those reports.
Here's what the data structure currently looks like (expressed using JSDoc-style JavaScript code comments):
@mzgoddard As described, the two properties named "log" sound like they may be redundant. Could the "debug messages emitted during execution of test plan" be recreated from the "subset of log emitted during [each] single test"?
Some suggestions: