Increased rate of transient failures in recent weeks #15

Closed
jessebraham opened this issue Dec 12, 2022 · 8 comments · Fixed by #16 or #20
Labels
bug Something isn't working

Comments

@jessebraham
Member

Over the last couple of weeks I've noticed an increase in the number of transient failures of CI jobs using this workflow. See here for an example:
https://github.com/esp-rs/esp-hal/actions/runs/3676289852/jobs/6216757844

I'm not sure what has changed, but this needs investigating.

CC @SergioGasquez

@jessebraham added the bug label Dec 12, 2022
@bjoernQ

bjoernQ commented Dec 15, 2022

I have an idea of what is going on.

Apparently, the curl in

curl -s https://api.github.com/repos/esp-rs/rust-build/releases \
doesn't get a valid response (it might be a victim of rate limiting since we don't pass a GITHUB_TOKEN, or it is some other network failure), and then jq errors out.

I have no idea how to solve that in a Bash script. In general, it might be worth rewriting the action in TS/JS or, even better, in Rust.
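
As an illustration of the failure mode described above, here is a minimal sketch in Python (not the action's actual Bash code) of validating the response body before extracting anything from it. It relies on the fact that a successful call to the releases endpoint returns a JSON array of release objects, while an error such as a rate-limit rejection returns a JSON object with a `message` field. The names `latest_tag` and `ReleaseLookupError` are hypothetical, and taking `tag_name` from the first entry is an assumption about what the script wants.

```python
import json


class ReleaseLookupError(Exception):
    """Raised when the releases response cannot be used (hypothetical name)."""


def latest_tag(body: str) -> str:
    """Return the tag_name of the first entry in a releases-list response body."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError as exc:
        # Empty or truncated bodies (e.g. a network failure) end up here.
        raise ReleaseLookupError(f"response is not valid JSON: {exc}") from exc

    # API errors (such as rate limiting) are a JSON object with a "message"
    # field instead of the expected array of releases.
    if isinstance(data, dict):
        raise ReleaseLookupError(f"API error: {data.get('message', 'unknown error')}")
    if not isinstance(data, list) or not data:
        raise ReleaseLookupError("response contains no releases")

    first = data[0]
    tag = first.get("tag_name") if isinstance(first, dict) else None
    if not isinstance(tag, str) or not tag:
        raise ReleaseLookupError("first release has no tag_name")
    return tag
```

With a check like this, a rate-limited run fails with a readable error instead of an opaque parser error further down the pipeline.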

@jessebraham
Member Author

Ahh, good catch, thanks for the info! I've been wanting to rewrite this action for a while but just haven't gotten around to it. It's something I'll tackle in the new year, I suppose.

@SergioGasquez
Member

We are working on this issue at the moment. For reference, it's also seen in esp-idf-svc: esp-rs/esp-idf-svc#209

@bjoernQ

bjoernQ commented Jan 10, 2023

I did a couple of tests and found a few issues. You can view the log of my latest test here: https://github.com/bjoernQ/workflow-test/actions/runs/3881586716/jobs/6620756700 and inspect my code here: https://github.com/bjoernQ/workflow-test/

Issues I found:

With all of this you will get the expected 1000 invocations rate-limit.

However, it's still not good to query the latest version in the script. That should only be done by espup (e.g. by passing the literal latest as toolchain-version), since there we can have retries (either in a naive way, or a bit more advanced by checking the rate-limit and x-ratelimit-reset headers). We also don't waste an additional invocation that way.
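
The "more advanced" retry mentioned above could look roughly like the following Python sketch. It is not espup's implementation. It assumes the caller supplies a `fetch` callable returning `(status, headers, body)`, so no particular HTTP library is implied, and it reads GitHub's `x-ratelimit-remaining` and `x-ratelimit-reset` headers (the latter is a UTC epoch timestamp in seconds). The function names and the `fallback` delay are hypothetical.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def wait_seconds(headers: Mapping[str, str], now: float, fallback: float) -> float:
    """Seconds to wait before the next attempt, based on rate-limit headers."""
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        remaining = int(lowered["x-ratelimit-remaining"])
        reset_at = int(lowered["x-ratelimit-reset"])  # UTC epoch seconds
    except (KeyError, ValueError):
        return fallback  # headers missing or malformed: naive fixed delay
    if remaining > 0:
        return fallback  # not rate limited, so the failure has another cause
    return max(reset_at - now, 0.0)


def fetch_with_retry(
    fetch: Callable[[], Response],
    attempts: int = 3,
    fallback: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.time,
) -> str:
    """Call fetch() until it returns status 200 or the attempts are used up."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    status = 0
    for attempt in range(attempts):
        status, headers, body = fetch()
        if status == 200:
            return body
        if attempt + 1 < attempts:  # no point sleeping after the last attempt
            sleep(wait_seconds(headers, clock(), fallback))
    raise RuntimeError(f"request failed after {attempts} attempts (last status {status})")
```

Injecting `sleep` and `clock` keeps the sketch testable without real waiting. In practice the wait would probably also need an upper bound, since the reset time can be many minutes away.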

I think most of the things are already addressed in espup but probably not all (or they don't apply to the espup Rust code)

@SergioGasquez
Member

SergioGasquez commented Jan 10, 2023

Here is what I've done in espup:

Now, I'll investigate if there is a way to share the GITHUB_TOKEN environment variable with this action, as otherwise, we will need to "materialize" it in every CI using this action

@bjoernQ

bjoernQ commented Jan 10, 2023

> Now, I'll investigate if there is a way to share the GITHUB_TOKEN environment variable with this action, as otherwise, we will need to "materialize" it in every CI using this action

I guess the best way would be to have an additional input to this action and require it (like here: https://docs.github.com/en/actions/creating-actions/metadata-syntax-for-github-actions#inputsinput_idrequired)

In the README we can give an example of how to pass it (${{ secrets.GITHUB_TOKEN }}).

@SergioGasquez
Member

A few tests that I have done for the xtensa-toolchain action:

  • Having the GITHUB_TOKEN as an input of the action resulted in a CI failure (for some reason it was not able to fetch the latest version).
    • I'll try to investigate what's happening here. The Rust code detects a GITHUB_TOKEN environment variable, as we can see from the "Auth header added" message in the CI log, but we also see that GITHUB_TOKEN is empty:
    env:
        GITHUB_TOKEN: 
    [2023-01-10T15:23:50Z DEBUG] connecting to crates.io:443 at 108.138.64.108:443
    [2023-01-10T15:23:50Z DEBUG] No cached session for DnsName(DnsName(DnsName("crates.io")))
    [2023-01-10T15:23:50Z DEBUG] Not resuming any session
    [2023-01-10T15:23:50Z DEBUG] Using ciphersuite TLS13_AES_128_GCM_SHA256
    [2023-01-10T15:23:50Z DEBUG] Not resuming
    [2023-01-10T15:23:50Z DEBUG] TLS1.3 encrypted extensions: [ServerNameAck]
    [2023-01-10T15:23:50Z DEBUG] ALPN protocol is None
    [2023-01-10T15:23:50Z DEBUG] created stream: Stream(RustlsStream)
    [2023-01-10T15:23:50Z DEBUG] sending request GET https://crates.io/api/v1/crates/espup/versions
    [2023-01-10T15:23:50Z DEBUG] writing prelude: GET /api/v1/crates/espup/versions HTTP/1.1
        Host: crates.io
        User-Agent: ureq/2.5.0
        Accept: */*
        accept-encoding: gzip
    [2023-01-10T15:23:50Z DEBUG] Ticket saved
    [2023-01-10T15:23:50Z DEBUG] Chunked body in response
    [2023-01-10T15:23:50Z DEBUG] response 200 to GET https://crates.io/api/v1/crates/espup/versions
    [2023-01-10T15:23:50Z DEBUG] dropping stream: Stream(RustlsStream)
    [2023-01-10T15:23:50Z INFO ] 💽  Installing esp-rs
    [2023-01-10T15:23:50Z INFO ] 💡  Querying GitHub API: 'https://api.github.com/repos/esp-rs/rust-build/releases/latest'
    [2023-01-10T15:23:50Z DEBUG] 🐞  Auth header added.
    [2023-01-10T15:23:50Z DEBUG] starting new connection: https://api.github.com/
    [2023-01-10T15:23:50Z DEBUG] 🐞  Parsing Xtensa Rust version: null
    [2023-01-10T15:23:50Z INFO ] 💡  Querying GitHub API: 'https://api.github.com/repos/esp-rs/rust-build/releases'
    [2023-01-10T15:23:50Z DEBUG] 🐞  Auth header added.
    [2023-01-10T15:23:50Z DEBUG] starting new connection: https://api.github.com/
    Error: espup::toolchain::rust::invalid_version
    
      × ⛔  Invalid toolchain version 'null'. Verify that the format is correct:
      │ '<major>.<minor>.<patch>.<subpatch>' or '<major>.<minor>.<patch>', and
      │ that the release exists in https://github.com/esp-rs/rust-build/releases
    with:
      default: true
      ldproxy: false
      githubtoken: ***
      buildtargets: all
      version: latest
      override: true
  • I've also tested exporting the GITHUB_TOKEN environment variable in the esp-hal CI (I'm using esp-hal as a repo to test the action), and some of the jobs still fail with this. Rerunning the jobs makes them succeed (déjà vu).
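
The log above shows an empty GITHUB_TOKEN alongside "Auth header added". One possible reading, which is an assumption and not confirmed anywhere in this thread, is that a variable that is set but empty is still treated as a token. The following Python sketch (not espup's code; `github_headers` and the `example-client` user agent are hypothetical) shows the defensive variant, where an empty or whitespace-only token is treated the same as a missing one:

```python
from typing import Dict, Mapping


def github_headers(env: Mapping[str, str]) -> Dict[str, str]:
    """Build request headers, adding Authorization only for a non-empty token."""
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "example-client",  # placeholder value for this sketch
    }
    # A variable that is set but empty (as in the log) counts as "no token".
    token = env.get("GITHUB_TOKEN", "").strip()
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Called as `github_headers(os.environ)`, this sends an unauthenticated request when the token is empty instead of an Authorization header with no credential in it.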

@SergioGasquez
Member

SergioGasquez commented Jan 11, 2023

Hi! A few updates on the issue:
