Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: try to speed up Github workflows #12090

Merged
merged 6 commits into from
Jan 7, 2021

Conversation

ktmud
Copy link
Member

@ktmud ktmud commented Dec 17, 2020

SUMMARY

CI has been slow recently. This PR is the latest effort in trying to mitigate:

  1. Update the Cancel Previous Run job added in build: cancel previous github action runs #11940 to cancel future jobs as well.

  2. Skip dependabot branches when possible (e.g. don't run Python jobs for npm updates).

  3. Merge lighter actions (Prefer Typescript and License Check) into one workflow (there are overheads in starting a job).

  4. Add a script to allow committers to manually cancel CI jobs in queue. This is useful on a busy day when the CI pipeline got clogged by continuous new commits to open PRs. Users have to configure GITHUB_TOKEN in their env variables in order to use this.

    (superset) ➜  incubator-superset git:(manage-github-workflow) ✗ ./scripts/cancel_github_workflows.py --help 
    Usage: cancel_github_workflows.py [OPTIONS] BRANCH_OR_PULL
    
      Cancel running or queued GitHub workflows by branch or pull request ID.
    
      By default jobs that are already running or are the latest of the branch
      or PR will not be cancelled.
    
    Options:
      --repo TEXT                     Default is apache/incubator-superset
      --event [pull_request|push|issue]
      --include-last / --no-include-last
                                      Whether to also cancel the lastest run.
      --include-running / --no-include-running
                                      Whether to also cancel running workflows
      --help                          Show this message and exit.
    

    Snip20201204_53

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A

TEST PLAN

Test locally and CI

ADDITIONAL INFORMATION

  • Has associated issue: this is a rerun of build: try to speed up Github workflows #12089 because it was blocked by CI
  • Changes UI
  • Requires DB Migration.
  • Confirm DB Migration upgrade and downgrade tested.
  • Introduces new feature or API
  • Removes existing feature or API

@codecov-io
Copy link

codecov-io commented Dec 17, 2020

Codecov Report

Merging #12090 (85993fd) into master (eb3c2b2) will decrease coverage by 7.92%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #12090      +/-   ##
==========================================
- Coverage   67.14%   59.22%   -7.93%     
==========================================
  Files        1002      951      -51     
  Lines       49300    46658    -2642     
  Branches     5010     4312     -698     
==========================================
- Hits        33103    27633    -5470     
- Misses      16072    19025    +2953     
+ Partials      125        0     -125     
Flag Coverage Δ
cypress 51.44% <ø> (+<0.01%) ⬆️
javascript ?
python 63.59% <ø> (-0.57%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...uperset-frontend/src/dashboard/util/dnd-reorder.js 0.00% <0.00%> (-100.00%) ⬇️
...rset-frontend/src/dashboard/util/getEmptyLayout.js 0.00% <0.00%> (-100.00%) ⬇️
...dashboard/components/resizable/ResizableHandle.jsx 0.00% <0.00%> (-100.00%) ⬇️
.../src/dashboard/util/getFilterScopeFromNodesTree.js 0.00% <0.00%> (-93.48%) ⬇️
...set-frontend/src/views/CRUD/alert/ExecutionLog.tsx 11.76% <0.00%> (-88.24%) ⬇️
...src/dashboard/components/gridComponents/Header.jsx 10.52% <0.00%> (-86.85%) ⬇️
superset-frontend/src/components/IconTooltip.tsx 13.33% <0.00%> (-86.67%) ⬇️
...rc/dashboard/components/gridComponents/Divider.jsx 13.33% <0.00%> (-86.67%) ⬇️
superset/db_engines/hive.py 0.00% <0.00%> (-85.72%) ⬇️
...end/src/SqlLab/components/ExploreResultsButton.jsx 8.00% <0.00%> (-84.00%) ⬇️
... and 378 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update eb3c2b2...85993fd. Read the comment docs.

@ktmud ktmud force-pushed the manage-github-workflow branch 2 times, most recently from 03494b0 to 1c4aba0 Compare January 4, 2021 18:36
@pull-request-size pull-request-size bot added size/L and removed size/XL labels Jan 4, 2021
@pull-request-size pull-request-size bot added size/XL and removed size/L labels Jan 5, 2021
@ktmud ktmud closed this Jan 5, 2021
@ktmud ktmud reopened this Jan 5, 2021
@ktmud ktmud force-pushed the manage-github-workflow branch 3 times, most recently from 35c8d68 to 29df9f4 Compare January 6, 2021 00:06
Add a script to cancel previous GitHub workflows and optimize
the workflows in hope to speed up CI.

- Skip dependabot for python jobs
- Merge Prefer TypeScript and License Check
- Cancel all tests in one workflow
continue-on-error: true
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is essentially useless since the comments will never work for PRs coming from forks (due to the GITHUB_TOKEN access). This would be a good candidate to port over to the workflow_run approach mentioned in https://securitylab.github.com/research/github-actions-preventing-pwn-requests. Obviously, out of scope for this PR

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few non-blocking comments, LGTM.

from typing import Iterable, List, Optional, Union

import click
import requests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whenever possible, we should probably try to use urllib3 over requests to avoid licensing issues.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it using Apache 2.0 License as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow, they've apparently relicensed. There was a refactor effort some time ago to remove explicit dependency on requests as it wasn't Apache compatible back then (see #7643). Good to know.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I can't find any reference of it ever having been anything but Apache 2.0. Apparently I'm misremembering something or this has been a major misunderstanding.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't let this one go.. 😄 I read old mailing list archives and turned up this: https://lists.apache.org/thread.html/49c4d03d3aadab5c2d1c20ccf1cb7ca02edbc94961e6429dbf6a082b%40%3Cdev.superset.apache.org%3E Turns out requests was pulling in chardet which is LGPL, which was the problem. Turns out chardet was dropped as of 09/2020: encode/httpx#1269 🎉

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it means it's OK to use requests again, I guess?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhh, I went down a link in this issue without noticing that it was pointing to another project: psf/requests#4848 Apparently requests still uses chardet, and there are no plans to remove it. As long as this is the case we're not allowed to reference requests as a required dependency, but it's ok to be had as an optional dependency. As long as we restrict the usage to development scripts we're fine, but we shouldn't use it in the core code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Thanks for the clarification and investigation!

@@ -0,0 +1,141 @@
# Python MySQL unit tests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mostly indifferent to having these split up versus all in one file, but I'd like to avoid shuffling these around unnecessarily to avoid losing the git history.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a hypothesis that there are overheads involved in scheduling multiple workflows vs running them in one. Not sure how much it helps but I still think it makes sense to combine these three considering they are very similar in nature and not particularly expensive to run.

@ktmud ktmud merged commit f482849 into apache:master Jan 7, 2021
@ktmud ktmud deleted the manage-github-workflow branch January 7, 2021 08:47
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 1.0.0 labels Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/XL 🚢 1.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants