
Scoping automated testing steps #95

Closed · trallard opened this issue Apr 9, 2022 · 11 comments
Labels: status: in progress 🏗 · type: deliverable 📦 · type: scoping 🔎

@trallard commented Apr 9, 2022

Summary

For the first iteration of what gets tested through CI, we need a clear path of what needs testing.
For example: menu bars, contrast.

From @gabalafou

So I think there’s a big place for you in helping with the design and scoping for whatever the thing is that we are going to build for automated testing.

Tasks to complete

Roughly:

  1. @isabela-pf works on scoping the first iteration of testing, keeping in mind that in an ideal world we would test everything; since that is impossible, we will instead aim for what definitely needs testing
  2. Discuss with the rest of the @Quansight-Labs/czi-a11y-grant team
  3. Add to the plan derived from #94 (Create an RFD for the testing approach), #51 (Decide on framework; proposed: jest-axe and Galata), and #67 ([Testing] Write up approach for automated testing)

Format: I would suggest a lightweight version of our RFD template

@isabela-pf commented Apr 14, 2022

I've started a draft spreadsheet to help me keep track of everything we can do as we scope it down. I've already marked some WCAG guidelines as not being relevant to JupyterLab (like having captions for videos).

@isabela-pf commented

We had a longer discussion around a sub-task for this issue at our April 20 meeting (#99): which JupyterLab “pages” (axe-core’s understanding of a single state in JupyterLab) do we want to start testing? The proposal in #97 listed this as 2–5 “pages” of JupyterLab to begin testing with. I asked for feedback on my initial thoughts, and we discussed a few different approaches to this decision:

  • The most common states and/or the first states encountered (i.e. the default launcher).
  • A maximalist approach (covering the most JupyterLab areas in the fewest “pages”; this may also help avoid repeated violations, since there is less repeated information).
  • An isolated UI element approach (where we load parts of JupyterLab, not the whole application).
  • Prioritizing states of JupyterLab that have their own URLs (because these are the most stable and reproducible).
  • Using the default for each built-in UI mode (i.e. JupyterLab, presentation mode, etc.).

After discussion, we agreed to begin this first six weeks of testing with a focus on:

  • Default JupyterLab with the launcher
  • Default JupyterLab with an open notebook (probably the Lorenz notebook), not yet run
  • Default JupyterLab with an open notebook (probably the Lorenz notebook), all cells run

since these are the states that precede all others in a user’s interaction. We acknowledge that this approach does not yet include major areas of the interface that will be critical to JupyterLab’s accessibility (such as the top menus or the settings editor) and that it will need to in the future. I particularly want to make sure those areas are covered, but I also agree with the counterargument that covering them involves more states, sooner, than we have the structure to handle.
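For illustration only, here is a minimal sketch of what scanning the first of these states could look like with Playwright and @axe-core/playwright. The server URL and launcher selector are assumptions about a local setup, not part of the plan above:

```ts
// A hedged sketch (not the project's actual test suite): run an axe-core scan
// against the first agreed-upon state, default JupyterLab with the launcher.
// The URL and selector are assumptions about a local JupyterLab instance.
import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test('default JupyterLab with the launcher: axe scan', async ({ page }) => {
  await page.goto('http://localhost:8888/lab');
  await page.waitForSelector('.jp-Launcher'); // wait for the launcher to render

  const results = await new AxeBuilder({ page }).analyze();
  expect(results.violations).toEqual([]);
});
```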

@isabela-pf
Copy link
Contributor

I have a first pass at ideas for what @gabalafou called the “three to five handwritten machine-as-a-user tests” (in #97). This truly is my first attempt, so I expect to rework it a lot based on feedback. But now we have something to critique!

Also, feedback on format is as welcome as content; I'm not sure this is the best way to communicate this to y'all.

How I chose these options

For a little background, I chose the following based on:

  • Focusing on WCAG areas not covered by axe-core
  • Filtering out WCAG areas that I think aren't relevant to JupyterLab (i.e. live captioning videos)
  • Prioritizing things that would block navigating or reading JupyterLab (since those are major blockers)
  • Prioritizing options for which I could come up with more concrete success criteria (making them, in my opinion, more automatically testable)

If you find any issue with this approach, it'd be good to know. (This was all done in the aforementioned draft spreadsheet.)

Test proposals

These are broken up by the WCAG success criterion they reference, how I think we could interpret it as success criteria specifically for JupyterLab (rather than the criteria WCAG defines for all web content), and a list of steps I think would help us test for those criteria (written from a manual testing perspective).

1.3.4 Orientation

Proposed JLab success criteria

JupyterLab is responsive. When switched to portrait orientation or viewed on mobile, no UI content is lost.

Proposed step-by-step

  1. Open default JupyterLab
  2. Set viewport orientation to portrait (And/or mobile viewport? I could see this as a good way to test both, but perhaps they should be different tests.)
  3. Take screenshot
  4. Compare to predefined success screenshot
  5. Success if screenshots match.

Note: I think the current portrait and/or mobile mode could be improved, but we can definitely start testing with what we have now.
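As a rough sketch of the screenshot technique in these steps, assuming Playwright Test, a portrait-ish 768×1024 viewport, and a hypothetical baseline image name:

```ts
// A hedged sketch of the screenshot-comparison approach above.
// Viewport size, URL, selector, and baseline name are assumptions.
import { test, expect } from '@playwright/test';

test('1.3.4 Orientation: no UI content lost in portrait', async ({ page }) => {
  await page.setViewportSize({ width: 768, height: 1024 }); // portrait orientation
  await page.goto('http://localhost:8888/lab');
  await page.waitForSelector('.jp-Launcher'); // selector is an assumption

  // Fails if the rendered page drifts from the stored baseline screenshot;
  // the first run records the baseline.
  await expect(page).toHaveScreenshot('jupyterlab-portrait.png');
});
```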

2.1.2 No keyboard trap

Proposed JLab success criteria

Focusable areas in JupyterLab can all be unfocused. This will need to test multiple regions long term.

For now, I think our success criteria should be that JupyterLab's menu bar can be focused and unfocused.

Proposed step-by-step

  1. Open default JupyterLab
  2. Start focus at top of tree (may hit skip link?)
  3. Tab into menu bar
  4. Open file menu
  5. Close file menu
  6. Tab out of menu bar
  7. Success if focus switches to side bar/file browser

2.4.3 Focus Order

Proposed JLab success criteria

In JupyterLab, areas can be focused in the following order:

  1. Skip link
  2. Menu bar
  3. Left side bar
  4. Inside left side bar (selected section)
  5. Top of document area (document toolbar first if it has one)
  6. Document (if there is no toolbar for the document type, users go immediately into the document)
  7. Right side bar
  8. Inside right side bar (selected section)
  9. Status bar

(Giving credit: this order was informed by a discussion in a past JupyterLab accessibility meeting.)

Proposed step-by-step

  1. Open default JupyterLab
  2. Tab to focus menu bar
  3. Tab through major regions as needed (see above section)
  4. Success if tab brings focus to status bar

2.5.6 Concurrent input mechanisms

Proposed JLab success criteria

In JupyterLab, tasks can be completed using mouse, keyboard, and touch screen inputs, and users can switch between them even within a single task. This will need to test multiple regions long term.

For now, I think our success criteria should be that JupyterLab can open a new notebook from the launcher with mouse, keyboard, and touch screen inputs.

Proposed step-by-step

  1. Open default JupyterLab
  2. Open the file menu with a mouse click
  3. Navigate to the New Launcher menu item with arrow keys
  4. Use touch screen input to create new Notebook from Launcher
  5. Success if new notebook opens

I'm not totally sure how this works in terms of testing, but it is my understanding that the type of input can be simulated. Please let me know if I'm wrong about this.

@isabela-pf commented

I had a few other thoughts while working on this that I wanted to write down. Most of these are about testing patterns that may help us long term.

Categories of automated tests

Based on the content we are testing for accessibility, I think I'm seeing a pattern of certain approaches suiting certain types of content. This isn't too well thought out yet, but I could see it helping us know what kind of test we need for what content.

  • Visual content (i.e. layout changes, responsiveness, color) seems suited to tests that take and compare UI screenshots.
  • Interactive content/user interactions (i.e. focus, navigation, editing) seem suited to tests that mirror that kind of task completion. These are tests where you script actions to get to a designated “finish line.”
  • Transforming content (i.e. editing UI element sizes, color, font, or any settings configuration) is the one I'm least sure about, and it may be the hardest or simply not automatically testable. I could imagine cases where you could also compare screenshots, but that doesn't seem like a true test of the functionality.
  • Audio content isn't something in JupyterLab by default, so I don't have thoughts on tests for this at the moment. (Same for haptics.)

What needs to be tested

In my above comment and in last week's team update meeting, we talked a lot about what needs to be tested. Scoping tests is important both for our small team and for respecting contributor time (as has been mentioned in other issues). Through this work, I've also come to think certain accessibility needs fall into categories based on how they can best be tested.

I broke this up into tests that need to run on an entire “page,” tests that do not, and tests that would benefit from being run on a section of a page or a single UI component.

  • Some tests do need to run on the entire “page” (as in the JupyterLab layout in full, no matter the mode).
    • For example, Page Titled (WCAG 2.4.2) will fail in any test not utilizing the whole “page.”
    • Focus order similarly doesn't make as much sense in a less-than-“page” test.
    • Orientation is another example.
  • Some tests do not need to run on the whole page.
    • For example, non-text content can be, and often is, checked over a whole page, but you could just as well test individual parts if you found it more helpful or less taxing on the testing suite.
    • In fact, this may give us more specific and actionable feedback on these items, because we would know exactly where a test fails, not just that it does. This could also help avoid duplicated failures.
  • Some tests would benefit from being run on a section of a page or a single UI component.
    • For example, I think testing a whole page for keyboard traps is possible but complicated. Because a keyboard trap seems most likely to happen on a region-by-region basis (i.e. a trap in JupyterLab's menu bar but not in the left side bar), it is easier to test that way (see the sketch after this list). Plus, you then know which regions trap and which do not.
    • I like this most if we run the same or equivalent tests over different, isolated UI regions to closely identify where they fail and succeed.
    • Using the keyboard trap example again, I would find it helpful to get results saying that all keyboard trap tests passed except the menu bar one (rather than a general whole-page test telling me there is a keyboard trap somewhere, with no region past that trap able to be tested).
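To make the per-region idea concrete, here is a hedged sketch of one check parameterized over regions. The region names and selectors are assumptions about JupyterLab's DOM, not a confirmed API:

```ts
// A hedged sketch: the same "focus can leave this region" check run per UI region.
// Selectors are assumptions; the real JupyterLab DOM may differ.
import { test, expect } from '@playwright/test';

const regions = [
  { name: 'menu bar', selector: '#jp-MainMenu' },            // assumed selector
  { name: 'left side bar', selector: '.jp-SideBar.jp-mod-left' }, // assumed selector
];

for (const region of regions) {
  test(`no keyboard trap in ${region.name}`, async ({ page }) => {
    await page.goto('http://localhost:8888/lab');
    await page.locator(region.selector).focus();

    // Tab a bounded number of times; focus must eventually leave the region.
    let escaped = false;
    for (let i = 0; i < 25 && !escaped; i++) {
      await page.keyboard.press('Tab');
      escaped = await page.evaluate(
        (sel) => !document.querySelector(sel)?.contains(document.activeElement),
        region.selector
      );
    }
    expect(escaped, `focus stayed trapped in ${region.name}`).toBe(true);
  });
}
```

The payoff of this shape is exactly what the last bullet describes: the report names the trapping region (e.g. only the menu bar test fails) instead of one whole-page failure.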

@gabalafou commented

I love this!

I'm going to jot down a few quick thoughts before our team meeting:

  • For the machine-as-a-user tests, are they in priority order? Ideally, we will implement all of them during this project cycle, but if we can't, it will be good to know which ones we should start with.

  • Eventually, we will likely want to write several Playwright tests for a given criterion, and some tests might relate to more than one WCAG guideline. For example, the test plan that you wrote under “no keyboard traps” might more properly be thought of as: no keyboard trap in the menu bar. And we'll probably want another test like “no keyboard trap in the notebook,” etc. In database-speak, instead of a one-to-one relationship between tests and WCAG guidelines, we'll want a one-to-many or a many-to-many relationship, I think: several tests might map to the same guideline, plus (although I'm not sure how often this will happen) a single test might map to several guidelines.

  • I'm starting to sense that we're really going to want to be able to test chunks of the UI in isolation. This will require some work and exploration that I didn't anticipate for this cycle, but it might be important enough to figure out how to do this now and work into this cycle somehow.

@isabela-pf commented

> For the machine-as-a-user tests, are they in priority order?

Nope, just their order of appearance in WCAG. I think we have a number of obstacles to overcome before we can get any handwritten tests like this started, and at this phase I'd rather focus on that infrastructure. Whichever test works best with the rest of the work is what I'd prioritize for now.

> we'll want a one-to-many or a many-to-many relationship, I think, as in several tests might map to the same guideline, plus (although I'm not sure how often this one will happen) a single test might map to several guidelines

Agreed! Since I don't see it in any other notes: this may also end up fitting nicely with the work already done by ACT Rules.

> require some work and exploration that I didn't anticipate for this cycle

I personally do not see the isolated UI work as high priority. I do think it is ideal long term, but we're early enough in this process that I don't consider it a blocker or worth changing your current goals for. I pointed it out only because I wanted to gauge how you all felt about it, and to generally share my thoughts so this work is done as openly as possible.


Other notes:

  • If/when testing JupyterLab with an open document (probably a notebook), be sure we communicate clearly that these tests are not focused on testing document content. Don't take the reports they produce as a gauge of the notebook's accessibility.
  • @trallard linked a quick guide to automated test scripts.

@isabela-pf commented

What's changed

There is repeated content from above just so all the info stays together. Because I know this can feel like a lot at once, I want to point out what changed.

  • I updated the script for 1.3.4 Orientation to be more specific and avoid the screenshot technique.
  • All scripts now have an "expected behavior" per step.
  • Added new questions at the bottom.

@gabalafou @tonyfast @trallard I'm @/ing you for review.

Test proposals

1.3.4 Orientation

Proposed JLab success criteria

JupyterLab is responsive. When switched to portrait orientation or viewed on mobile, no UI content is lost.

Proposed testing script

Step | Expected behavior
1. Open default JupyterLab | JupyterLab opens with an unmodified workspace.
2. Set viewport orientation to portrait (and/or mobile viewport?) | JupyterLab accepts the orientation change and doesn't error out.
3. Check the menu bar is in the expected location | The menu bar is at the top of the page and has all menu items visible (currently it has a scroll bar).
4. Check the left side bar is in the expected location | The left side bar is the leftmost part of the viewport. It stretches from the menu bar to the status bar. All icons are visible.
5. Check the document area is in the expected location | The document area is the center and majority of the viewport.
6. Check the document area toolbar is in the expected location | The document area toolbar is at the top of the document area. All items are visible (currently it has a scroll bar).
7. Check the right side bar is in the expected location | The right side bar is the rightmost part of the viewport. It stretches from the menu bar to the status bar. All icons are visible. (Right now, I believe this side bar cannot be accessed in this mode.)
8. Check the status bar is in the expected location | The status bar is at the bottom of the page. All information is visible.
9. Success if all main regions are in their expected locations.
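A hedged sketch of checking region positions with bounding boxes rather than screenshots; the element IDs, URL, and viewport size are assumptions:

```ts
// A hedged sketch: check that major regions keep their expected positions
// in a portrait viewport. Selectors are assumptions about JupyterLab's DOM.
import { test, expect } from '@playwright/test';

test('main regions stay in place in portrait orientation', async ({ page }) => {
  await page.setViewportSize({ width: 768, height: 1024 });
  await page.goto('http://localhost:8888/lab');

  const menuBar = await page.locator('#jp-MainMenu').boundingBox();
  const statusBar = await page.locator('#jp-main-statusbar').boundingBox();
  expect(menuBar).not.toBeNull();   // both regions must be rendered at all
  expect(statusBar).not.toBeNull();

  // Menu bar hugs the top edge; status bar hugs the bottom edge.
  expect(menuBar?.y).toBe(0);
  expect((statusBar?.y ?? 0) + (statusBar?.height ?? 0)).toBeCloseTo(1024, 0);
});
```

The same boundingBox comparison pattern would extend to the side bars and document area in steps 4 through 7.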

2.1.2 No keyboard trap

Proposed JLab success criteria

Focusable areas in JupyterLab can all be unfocused. This will need to test multiple regions long term.

For now, I think our success criteria should be that JupyterLab's menu bar can be focused and unfocused.

Proposed testing script

Step | Expected behavior
1. Open default JupyterLab | JupyterLab opens with an unmodified workspace.
2. Start focus at top of tree | Focus goes to the JupyterLab tab; it may hit the skip link.
3. Tab into menu bar | Focus goes to the menu bar as a whole.
4. Open file menu | Focus goes to the File menu (within the menu bar), which opens its full list of menu items.
5. Close file menu | Focus stays on the File menu, but the menu is closed.
6. Tab out of menu bar | Focus moves from the File menu through the other menu items until it leaves the region, landing on the left side bar/file browser.
7. Success if focus switches to the side bar/file browser.
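For illustration, a hedged Playwright sketch of this script; the selector, the skip-link behavior, and the number of Tab presses needed are all assumptions:

```ts
// A hedged sketch: tab into and back out of the menu bar (2.1.2 No Keyboard Trap).
// Selectors and key sequence are assumptions; the real DOM and focus order may differ.
import { test, expect } from '@playwright/test';

test('menu bar is not a keyboard trap', async ({ page }) => {
  await page.goto('http://localhost:8888/lab');

  await page.keyboard.press('Tab');    // start focus at the top of the tree
                                       // (assumes this reaches the menu bar; a skip link may come first)
  await page.keyboard.press('Enter');  // open the File menu
  await page.keyboard.press('Escape'); // close it again
  await page.keyboard.press('Tab');    // tab out of the menu bar

  // Success if focus has left the menu bar for the side bar/file browser.
  const inMenuBar = await page.evaluate(() =>
    document.querySelector('#jp-MainMenu')?.contains(document.activeElement)
  );
  expect(inMenuBar).toBeFalsy();
});
```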

2.4.3 Focus Order

Proposed JLab success criteria

In JupyterLab, areas can be focused in the following order:

  1. Skip link
  2. Menu bar
  3. Left side bar
  4. Inside left side bar (selected section)
  5. Top of document area (document toolbar first if it has one)
  6. Document (if there is no toolbar for the document type, users go immediately into the document)
  7. Right side bar
  8. Inside right side bar (selected section)
  9. Status bar

Proposed testing script

Step | Expected behavior
1. Open default JupyterLab | JupyterLab opens with an unmodified workspace.
2. Tab to focus menu bar | Tab until focus is on the menu bar. (Will this run into the skip link?)
3. Tab through major regions as needed (see above section) | Tab moves focus through the left side bar, inside the left side bar, the top of the document area, the document area, the right side bar, and inside the right side bar.
4. Tab to focus status bar | Focus moves to the status bar.
5. Success if tab brings focus to the status bar.
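A hedged sketch of asserting this order by tabbing until focus enters each region in turn; the selector list is an assumption standing in for the regions above:

```ts
// A hedged sketch: assert that major regions receive focus in the expected order.
// The selectors are assumptions; a fuller list would cover every region above.
import { test, expect } from '@playwright/test';

const expectedOrder = [
  '#jp-MainMenu',              // menu bar (assumed selector)
  '.jp-SideBar.jp-mod-left',   // left side bar (assumed selector)
  '#jp-main-dock-panel',       // document area (assumed selector)
  '#jp-main-statusbar',        // status bar (assumed selector)
];

test('major regions follow the expected focus order', async ({ page }) => {
  await page.goto('http://localhost:8888/lab');

  for (const selector of expectedOrder) {
    // Tab until focus enters the next expected region (bounded to avoid looping forever).
    let reached = false;
    for (let i = 0; i < 30 && !reached; i++) {
      await page.keyboard.press('Tab');
      reached = await page.evaluate(
        (sel) => !!document.querySelector(sel)?.contains(document.activeElement),
        selector
      );
    }
    expect(reached, `focus never reached ${selector}`).toBe(true);
  }
});
```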

2.5.6 Concurrent input mechanisms

Proposed JLab success criteria

In JupyterLab, tasks can be completed using mouse, keyboard, and touch screen inputs, and users can switch between them even within a single task. This will need to test multiple regions long term.

For now, I think our success criteria should be that JupyterLab can open a new notebook from the launcher with mouse, keyboard, and touch screen inputs.

Proposed testing script

Step | Expected behavior
1. Open default JupyterLab | JupyterLab opens with an unmodified workspace.
2. Open the file menu with a mouse click | The File menu opens and the full list of menu items appears.
3. Navigate to the New Launcher menu item with arrow keys | Focus moves through the File menu items until it reaches New Launcher.
4. Use touch screen input to create a new Notebook from the Launcher | The Notebook card in the Launcher is selected and the command is initiated.
5. Success if a new notebook opens.
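On the earlier question about simulating input types: Playwright can mix all three in one test (touch requires a browser context created with hasTouch enabled). A hedged sketch, with selectors and the arrow-key count as assumptions:

```ts
// A hedged sketch: mix mouse, keyboard, and touch input within a single task.
// Selectors, menu labels, and key presses are assumptions about JupyterLab's UI.
import { test, expect } from '@playwright/test';

test.use({ hasTouch: true }); // touch events need a context with hasTouch enabled

test('a new notebook can be opened with mixed input types', async ({ page }) => {
  await page.goto('http://localhost:8888/lab');

  await page.click('text=File');          // mouse: open the File menu
  await page.keyboard.press('ArrowDown'); // keyboard: arrow through menu items
  await page.keyboard.press('Enter');     //   (exact position of New Launcher is an assumption)

  // Touch: pick the Notebook card in the new launcher (selector is an assumption).
  await page.tap('.jp-LauncherCard[title*="Notebook"]');

  // Success if a new notebook panel opens.
  await expect(page.locator('.jp-NotebookPanel')).toBeVisible();
});
```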

Questions!

  • 1.3.4 Orientation is about not only being able to change the orientation, but also not losing information (UI, in this case) when switched. Should something about checking each area's content be added as additional steps, or is there a better way to do this?

@trallard commented

@isabela-pf commented May 12, 2022

Based on synchronous feedback, this is close to complete (for this first round of test development).

  • I need to document a template for these testing scripts so that other people may contribute testing scripts in the future.

Responding to @trallard's comment, I think any of these scenarios can be tested first based on what's easiest for development to start with. If it's helpful for me to choose, I think 2.1.2 No keyboard trap is the easiest to start with because

  • it can actually pass in JupyterLab now (compared to focus order, which will 100% fail right now)
  • it can run well with a full "page" of JupyterLab and doesn't need a subset of the UI
  • it's not blocked by any other tests

@trallard commented

@isabela-pf to add the scripts as a PR to the accessibility repo

@isabela-pf commented

Re: myself

Current testing script proposals and template are in Quansight-Labs/accessibility #6
