
Scoping automated testing steps #95

Closed · trallard opened this issue Apr 9, 2022 · 11 comments
Labels: status: in progress 🏗 · type: deliverable 📦 · type: scoping 🔎

@trallard commented Apr 9, 2022

Summary

For the first iteration of what gets tested through CI, we need a clear path of what needs testing.
For example: menu bars, contrast.

From @gabalafou

So I think there’s a big place for you in helping with the design and scoping for whatever the thing is that we are going to build for automated testing.

Tasks to complete

Roughly:

  1. @isabela-pf works on scoping the first iteration of testing, keeping in mind that in an ideal world we would test everything; since that is impossible, we will instead aim for what definitely needs testing
  2. Discuss with the rest of the @Quansight-Labs/czi-a11y-grant team
  3. Add to the plan derived from #94 (Create an RFD for the testing approach), #51 (Decide on framework; proposed: jest-axe and Galata), and #67 ([Testing] Write up approach for automated testing)

Format: I would suggest a lightweight version of our RFD template

@isabela-pf commented Apr 14, 2022

I've started a draft spreadsheet to help me keep track of everything we can do as we scope it down. I've already marked some WCAG guidelines as not being relevant to JupyterLab (like having captions for videos).

@isabela-pf commented

We had a longer discussion around a sub-task for this issue at our April 20 meeting (#99): which JupyterLab “pages” (axe-core’s understanding of a single state in JupyterLab) do we want to start testing? The proposal in #97 listed this as 2–5 “pages” of JupyterLab to begin testing with. I asked for feedback on my initial thoughts, and we discussed a few different approaches to this decision:

  • The most common states and/or the first states encountered (i.e. the default launcher).
  • A maximalist approach (covering the most JupyterLab areas in the fewest “pages”; this may also help avoid repeated violations, since there is less repeated information).
  • An isolated UI element approach (where we load parts of JupyterLab, not the whole application).
  • Prioritizing states of JupyterLab that have their own URLs (because these are the most stable and reproducible).
  • Using the default for each built-in UI mode (i.e. JupyterLab, presentation mode, etc.).

After discussion, we agreed to begin this first six weeks of testing with a focus on:

  • Default JupyterLab with the launcher
  • Default JupyterLab with an open notebook (probably the Lorenz notebook), not yet run
  • Default JupyterLab with an open notebook (probably the Lorenz notebook), all cells run

since these are the states that precede all others in a user’s interaction. We acknowledge that this approach does not yet include major areas of the interface that will be critical to JupyterLab’s accessibility (such as the top menus or the settings editor) and that it will need to in the future. I particularly want to make sure those areas are covered, but I also agree with the counterargument that covering them involves more states, sooner, than we have the structure to handle.
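For illustration only, here is a minimal sketch of what scanning the first of these states could look like with Playwright and @axe-core/playwright. The server URL and launcher selector are assumptions about a local setup, not part of the plan above:

```ts
// A hedged sketch (not the project's actual test suite): run an axe-core scan
// against the first agreed-upon state, default JupyterLab with the launcher.
// The URL and selector are assumptions about a local JupyterLab instance.
import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test('default JupyterLab with the launcher: axe scan', async ({ page }) => {
  await page.goto('http://localhost:8888/lab');
  await page.waitForSelector('.jp-Launcher'); // wait for the launcher to render

  const results = await new AxeBuilder({ page }).analyze();
  expect(results.violations).toEqual([]);
});
```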

@isabela-pf
Copy link
Contributor

I have a first pass at ideas for what @gabalafou called the “three to five handwritten machine-as-a-user tests” (in #97). This truly is my first attempt, so I expect to rework it a lot based on feedback. But now we have something to critique!

Also, feedback on format is as welcome as content; I'm not sure this is the best way to communicate this to y'all.

How I chose these options

For a little background, I chose the following based on:

  • Focusing on WCAG areas not covered by axe-core
  • Filtering out WCAG areas that I think aren't relevant to JupyterLab (i.e. live captioning videos)
  • Prioritizing things that would block navigating or reading JupyterLab (since those are major blockers)
  • Prioritizing options for which I could come up with more concrete success criteria (making them, in my opinion, more automatically testable)

If you find any issue with this approach, it'd be good to know. (This was all done in the aforementioned draft spreadsheet.)

Test proposals

These are broken up by the WCAG success criterion they reference, how I think we could interpret it as success criteria specifically for JupyterLab (rather than the criteria WCAG defines for all web content), and a list of steps I think would help us test for those criteria (written from a manual testing perspective).

1.3.4 Orientation

Proposed JLab success criteria

JupyterLab is responsive. When switched to portrait orientation or viewed on mobile, no UI content is lost.

Proposed step-by-step

  1. Open default JupyterLab
  2. Set viewport orientation to portrait (And/or mobile viewport? I could see this as a good way to test both, but perhaps they should be different tests.)
  3. Take screenshot
  4. Compare to predefined success screenshot
  5. Success if screenshots match.

Note: I think the current portrait and/or mobile mode could be improved, but we can definitely start testing with what we have now.
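As a rough sketch of the screenshot technique in these steps, assuming Playwright Test, a portrait-ish 768×1024 viewport, and a hypothetical baseline image name:

```ts
// A hedged sketch of the screenshot-comparison approach above.
// Viewport size, URL, selector, and baseline name are assumptions.
import { test, expect } from '@playwright/test';

test('1.3.4 Orientation: no UI content lost in portrait', async ({ page }) => {
  await page.setViewportSize({ width: 768, height: 1024 }); // portrait orientation
  await page.goto('http://localhost:8888/lab');
  await page.waitForSelector('.jp-Launcher'); // selector is an assumption

  // Fails if the rendered page drifts from the stored baseline screenshot;
  // the first run records the baseline.
  await expect(page).toHaveScreenshot('jupyterlab-portrait.png');
});
```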

2.1.2 No keyboard trap

Proposed JLab success criteria

Focusable areas in JupyterLab can all be unfocused. This will need to test multiple regions long term.

For now, I think our success criteria should be that JupyterLab's menu bar can be focused and unfocused.

Proposed step-by-step

  1. Open default JupyterLab
  2. Start focus at top of tree (may hit skip link?)
  3. Tab into menu bar
  4. Open file menu
  5. Close file menu
  6. Tab out of menu bar
  7. Success if focus switches to side bar/file browser

2.4.3 Focus Order

Proposed JLab success criteria

In JupyterLab, areas can be focused in the following order:

  1. Skip link
  2. Menu bar
  3. Left side bar
  4. Inside left side bar (selected section)
  5. Top of document area (document toolbar first if it has one)
  6. Document (if there is no toolbar for the document type, users go immediately into the document)
  7. Right side bar
  8. Inside right side bar (selected section)
  9. Status bar

(Giving credit: this order was informed by a discussion in a past JupyterLab accessibility meeting.)

Proposed step-by-step

  1. Open default JupyterLab
  2. Tab to focus menu bar
  3. Tab through major regions as needed (see above section)
  4. Success if tab brings focus to status bar

2.5.6 Concurrent input mechanisms

Proposed JLab success criteria

In JupyterLab, tasks can be completed using mouse, keyboard, and touch screen inputs, and users can switch between them even within a single task. This will need to test multiple regions long term.

For now, I think our success criteria should be that JupyterLab can open a new notebook from the launcher with mouse, keyboard, and touch screen inputs.

Proposed step-by-step

  1. Open default JupyterLab
  2. Open the file menu with a mouse click
  3. Navigate to the New Launcher menu item with arrow keys
  4. Use touch screen input to create new Notebook from Launcher
  5. Success if new notebook opens

I'm not totally sure how this works in terms of testing, but it is my understanding that the type of input can be simulated. Please let me know if I'm wrong about this.

@isabela-pf commented

I had a few other thoughts while working on this that I wanted to write down. Most of these are about testing patterns that may help us long term.

Categories of automated tests

Based on the content we are testing for accessibility, I think I'm seeing a pattern of certain approaches suiting certain types of content. This isn't too well thought out yet, but I could see it helping us know what kind of test we need for what content.

  • Visual content (i.e. layout changes, responsiveness, color) seems suited to tests that take and compare UI screenshots.
  • Interactive content/user interactions (i.e. focus, navigation, editing) seem suited to tests that mirror that kind of task completion. These are tests where you script actions to get to a designated “finish line.”
  • Transforming content (i.e. editing UI element sizes, color, font, or any settings configuration) is the one I'm least sure about, and it may be the hardest or simply not automatically testable. I could imagine cases where you could also compare screenshots, but that doesn't seem like a true test of the functionality.
  • Audio content isn't something in JupyterLab by default, so I don't have thoughts on tests for this at the moment. (Same for haptics.)

What needs to be tested

In my above comment and in last week's team update meeting, we talked a lot about what needs to be tested. Scoping tests is important both for our small team and for respecting contributor time (as has been mentioned in other issues). Through this work, I've also come to think certain accessibility needs fall into categories based on how they can best be tested.

I broke this up into tests that need to run on an entire “page,” tests that do not, and tests that would benefit from being run on a section of a page or a single UI component.

  • Some tests do need to run on the entire “page” (as in the JupyterLab layout in full, no matter the mode).
    • For example, Page Titled (WCAG 2.4.2) will fail in any test not utilizing the whole “page.”
    • Focus order similarly doesn't make as much sense in a less-than-“page” test.
    • Orientation is another example.
  • Some tests do not need to run on the whole page.
    • For example, non-text content can be, and often is, checked over a whole page, but you could just as well test individual parts if you found it more helpful or less taxing on the testing suite.
    • In fact, this may give us more specific and actionable feedback on these items, because we would know exactly where a test fails, not just that it does. This could also help avoid duplicated failures.
  • Some tests would benefit from being run on a section of a page or a single UI component.
    • For example, I think testing a whole page for keyboard traps is possible but complicated. Because a keyboard trap seems most likely to happen on a region-by-region basis (i.e. a trap in JupyterLab's menu bar but not in the left side bar), it is easier to test that way (see the sketch after this list). Plus, you then know which regions trap and which do not.
    • I like this most if we run the same or equivalent tests over different, isolated UI regions to closely identify where they fail and succeed.
    • Using the keyboard trap example again, I would find it helpful to get results saying that all keyboard trap tests passed except the menu bar one (rather than a general whole-page test telling me there is a keyboard trap somewhere, with no region past that trap able to be tested).
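To make the per-region idea concrete, here is a hedged sketch of one check parameterized over regions. The region names and selectors are assumptions about JupyterLab's DOM, not a confirmed API:

```ts
// A hedged sketch: the same "focus can leave this region" check run per UI region.
// Selectors are assumptions; the real JupyterLab DOM may differ.
import { test, expect } from '@playwright/test';

const regions = [
  { name: 'menu bar', selector: '#jp-MainMenu' },            // assumed selector
  { name: 'left side bar', selector: '.jp-SideBar.jp-mod-left' }, // assumed selector
];

for (const region of regions) {
  test(`no keyboard trap in ${region.name}`, async ({ page }) => {
    await page.goto('http://localhost:8888/lab');
    await page.locator(region.selector).focus();

    // Tab a bounded number of times; focus must eventually leave the region.
    let escaped = false;
    for (let i = 0; i < 25 && !escaped; i++) {
      await page.keyboard.press('Tab');
      escaped = await page.evaluate(
        (sel) => !document.querySelector(sel)?.contains(document.activeElement),
        region.selector
      );
    }
    expect(escaped, `focus stayed trapped in ${region.name}`).toBe(true);
  });
}
```

The payoff of this shape is exactly what the last bullet describes: the report names the trapping region (e.g. only the menu bar test fails) instead of one whole-page failure.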

@gabalafou commented

I love this!

I'm going to jot down a few quick thoughts before our team meeting:

  • For the machine-as-a-user tests, are they in priority order? Ideally, we will implement all of them during this project cycle, but if we can't, it will be good to know which ones we should start with.

  • Eventually, we will likely want to write several Playwright tests for a given criterion, and some tests might relate to more than one WCAG guideline. For example, the test plan that you wrote under “no keyboard traps” might more properly be thought of as: no keyboard trap in the menu bar. And we'll probably want another test like “no keyboard trap in the notebook,” etc. In database-speak, instead of a one-to-one relationship between tests and WCAG guidelines, we'll want a one-to-many or a many-to-many relationship, I think: several tests might map to the same guideline, plus (although I'm not sure how often this will happen) a single test might map to several guidelines.

  • I'm starting to sense that we're really going to want to be able to test chunks of the UI in isolation. This will require some work and exploration that I didn't anticipate for this cycle, but it might be important enough to figure out how to do this now and work into this cycle somehow.

@isabela-pf commented

> For the machine-as-a-user tests, are they in priority order?

Nope, just their order of appearance in WCAG. I think we have a number of obstacles to overcome before we can get any handwritten tests like this started, and at this phase I'd rather focus on that infrastructure. Whichever test works best with the rest of the work is what I'd prioritize for now.

> we'll want a one-to-many or a many-to-many relationship, I think, as in several tests might map to the same guideline, plus (although I'm not sure how often this one will happen) a single test might map to several guidelines

Agreed! Since I don't see it in any other notes: this may also end up fitting nicely with the work already done by ACT Rules.

> require some work and exploration that I didn't anticipate for this cycle

I personally do not see the isolated UI work as high priority. I do think it is ideal long term, but we're early enough in this process that I don't consider it a blocker or worth changing your current goals for. I pointed it out only because I wanted to gauge how you all felt about it, and to generally share my thoughts so this work is done as openly as possible.


Other notes:

  • If/when testing JupyterLab with an open document (probably a notebook), be sure we communicate clearly that these tests are not focused on testing document content. Don't take the reports they produce as a gauge of the notebook's accessibility.
  • @trallard linked a quick guide to automated test scripts.

@isabela-pf commented

What's changed

There is repeated content from above just so all the info stays together. Because I know this can feel like a lot at once, I want to point out what changed.

  • I updated the script for 1.3.4 Orientation to be more specific and avoid the screenshot technique.
  • All scripts now have an "expected behavior" per step.
  • Added new questions at the bottom.

@gabalafou @tonyfast @trallard I'm @/ing you for review.

Test proposals

1.3.4 Orientation

Proposed JLab success criteria

JupyterLab is responsive. When switched to portrait orientation or viewed on mobile, no UI content is lost.

Proposed testing script

Step | Expected behavior
1. Open default JupyterLab | JupyterLab opens with an unmodified workspace.
2. Set viewport orientation to portrait (and/or mobile viewport?) | JupyterLab accepts the orientation change and doesn't error out.
3. Check the menu bar is in the expected location | The menu bar is at the top of the page and has all menu items visible (currently it has a scroll bar).
4. Check the left side bar is in the expected location | The left side bar is the leftmost part of the viewport. It stretches from the menu bar to the status bar. All icons are visible.
5. Check the document area is in the expected location | The document area is the center and majority of the viewport.
6. Check the document area toolbar is in the expected location | The document area toolbar is at the top of the document area. All items are visible (currently it has a scroll bar).
7. Check the right side bar is in the expected location | The right side bar is the rightmost part of the viewport. It stretches from the menu bar to the status bar. All icons are visible. (Right now, I believe this side bar cannot be accessed in this mode.)
8. Check the status bar is in the expected location | The status bar is at the bottom of the page. All information is visible.
9. Success if all main regions are in their expected locations.
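A hedged sketch of checking region positions with bounding boxes rather than screenshots; the element IDs, URL, and viewport size are assumptions:

```ts
// A hedged sketch: check that major regions keep their expected positions
// in a portrait viewport. Selectors are assumptions about JupyterLab's DOM.
import { test, expect } from '@playwright/test';

test('main regions stay in place in portrait orientation', async ({ page }) => {
  await page.setViewportSize({ width: 768, height: 1024 });
  await page.goto('http://localhost:8888/lab');

  const menuBar = await page.locator('#jp-MainMenu').boundingBox();
  const statusBar = await page.locator('#jp-main-statusbar').boundingBox();
  expect(menuBar).not.toBeNull();   // both regions must be rendered at all
  expect(statusBar).not.toBeNull();

  // Menu bar hugs the top edge; status bar hugs the bottom edge.
  expect(menuBar?.y).toBe(0);
  expect((statusBar?.y ?? 0) + (statusBar?.height ?? 0)).toBeCloseTo(1024, 0);
});
```

The same boundingBox comparison pattern would extend to the side bars and document area in steps 4 through 7.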

2.1.2 No keyboard trap

Proposed JLab success criteria

Focusable areas in JupyterLab can all be unfocused. This will need to test multiple regions long term.

For now, I think our success criteria should be that JupyterLab's menu bar can be focused and unfocused.

Proposed testing script

Step | Expected behavior
1. Open default JupyterLab | JupyterLab opens with an unmodified workspace.
2. Start focus at top of tree | Focus goes to the JupyterLab tab; it may hit the skip link.
3. Tab into menu bar | Focus goes to the menu bar as a whole.
4. Open file menu | Focus goes to the File menu (within the menu bar), which opens its full list of menu items.
5. Close file menu | Focus stays on the File menu, but the menu is closed.
6. Tab out of menu bar | Focus moves from the File menu through the other menu items until it leaves the region, landing on the left side bar/file browser.
7. Success if focus switches to the side bar/file browser.
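For illustration, a hedged Playwright sketch of this script; the selector, the skip-link behavior, and the number of Tab presses needed are all assumptions:

```ts
// A hedged sketch: tab into and back out of the menu bar (2.1.2 No Keyboard Trap).
// Selectors and key sequence are assumptions; the real DOM and focus order may differ.
import { test, expect } from '@playwright/test';

test('menu bar is not a keyboard trap', async ({ page }) => {
  await page.goto('http://localhost:8888/lab');

  await page.keyboard.press('Tab');    // start focus at the top of the tree
                                       // (assumes this reaches the menu bar; a skip link may come first)
  await page.keyboard.press('Enter');  // open the File menu
  await page.keyboard.press('Escape'); // close it again
  await page.keyboard.press('Tab');    // tab out of the menu bar

  // Success if focus has left the menu bar for the side bar/file browser.
  const inMenuBar = await page.evaluate(() =>
    document.querySelector('#jp-MainMenu')?.contains(document.activeElement)
  );
  expect(inMenuBar).toBeFalsy();
});
```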

2.4.3 Focus Order

Proposed JLab success criteria

In JupyterLab, areas can be focused in the following order:

  1. Skip link
  2. Menu bar
  3. Left side bar
  4. Inside left side bar (selected section)
  5. Top of document area (document toolbar first if it has one)
  6. Document (if there is no toolbar for the document type, users go immediately into the document)
  7. Right side bar
  8. Inside right side bar (selected section)
  9. Status bar

Proposed testing script

Step | Expected behavior
1. Open default JupyterLab | JupyterLab opens with an unmodified workspace.
2. Tab to focus menu bar | Tab until focus is on the menu bar. (Will this run into the skip link?)
3. Tab through major regions as needed (see above section) | Tab moves focus through the left side bar, inside the left side bar, the top of the document area, the document area, the right side bar, and inside the right side bar.
4. Tab to focus status bar | Focus moves to the status bar.
5. Success if tab brings focus to the status bar.
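A hedged sketch of asserting this order by tabbing until focus enters each region in turn; the selector list is an assumption standing in for the regions above:

```ts
// A hedged sketch: assert that major regions receive focus in the expected order.
// The selectors are assumptions; a fuller list would cover every region above.
import { test, expect } from '@playwright/test';

const expectedOrder = [
  '#jp-MainMenu',              // menu bar (assumed selector)
  '.jp-SideBar.jp-mod-left',   // left side bar (assumed selector)
  '#jp-main-dock-panel',       // document area (assumed selector)
  '#jp-main-statusbar',        // status bar (assumed selector)
];

test('major regions follow the expected focus order', async ({ page }) => {
  await page.goto('http://localhost:8888/lab');

  for (const selector of expectedOrder) {
    // Tab until focus enters the next expected region (bounded to avoid looping forever).
    let reached = false;
    for (let i = 0; i < 30 && !reached; i++) {
      await page.keyboard.press('Tab');
      reached = await page.evaluate(
        (sel) => !!document.querySelector(sel)?.contains(document.activeElement),
        selector
      );
    }
    expect(reached, `focus never reached ${selector}`).toBe(true);
  }
});
```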

2.5.6 Concurrent input mechanisms

Proposed JLab success criteria

In JupyterLab, tasks can be completed using mouse, keyboard, and touch screen inputs, and users can switch between them even within a single task. This will need to test multiple regions long term.

For now, I think our success criteria should be that JupyterLab can open a new notebook from the launcher with mouse, keyboard, and touch screen inputs.

Proposed testing script

Step | Expected behavior
1. Open default JupyterLab | JupyterLab opens with an unmodified workspace.
2. Open the file menu with a mouse click | The File menu opens and the full list of menu items appears.
3. Navigate to the New Launcher menu item with arrow keys | Focus moves through the File menu items until it reaches New Launcher.
4. Use touch screen input to create a new Notebook from the Launcher | The Notebook card in the Launcher is selected and the command is initiated.
5. Success if a new notebook opens.
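On the earlier question about simulating input types: Playwright can mix all three in one test (touch requires a browser context created with hasTouch enabled). A hedged sketch, with selectors and the arrow-key count as assumptions:

```ts
// A hedged sketch: mix mouse, keyboard, and touch input within a single task.
// Selectors, menu labels, and key presses are assumptions about JupyterLab's UI.
import { test, expect } from '@playwright/test';

test.use({ hasTouch: true }); // touch events need a context with hasTouch enabled

test('a new notebook can be opened with mixed input types', async ({ page }) => {
  await page.goto('http://localhost:8888/lab');

  await page.click('text=File');          // mouse: open the File menu
  await page.keyboard.press('ArrowDown'); // keyboard: arrow through menu items
  await page.keyboard.press('Enter');     //   (exact position of New Launcher is an assumption)

  // Touch: pick the Notebook card in the new launcher (selector is an assumption).
  await page.tap('.jp-LauncherCard[title*="Notebook"]');

  // Success if a new notebook panel opens.
  await expect(page.locator('.jp-NotebookPanel')).toBeVisible();
});
```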

Questions!

  • 1.3.4 Orientation is about not only being able to change the orientation, but also not losing information (UI, in this case) when switched. Should something about checking each area's content be added as additional steps, or is there a better way to do this?

@trallard commented

@isabela-pf commented May 12, 2022

Based on synchronous feedback, this is close to complete (for this first round of test development).

  • I need to document a template for these testing scripts so that other people may contribute testing scripts in the future.

Responding to @trallard's comment, I think any of these scenarios can be tested first based on what's easiest for development to start with. If it's helpful for me to choose, I think 2.1.2 No keyboard trap is the easiest to start with because

  • it can actually pass in JupyterLab now (compared to focus order, which will 100% fail right now)
  • it can run well with a full "page" of JupyterLab and doesn't need a subset of the UI
  • it's not blocked by any other tests

@trallard commented

@isabela-pf to add the scripts as a PR to the accessibility repo

@isabela-pf commented

Re: myself

Current testing script proposals and template are in Quansight-Labs/accessibility #6
