http_archive reporting 404 Not Found instead of 302 Redirect #14866

Closed
tali opened this issue Feb 18, 2022 · 4 comments
Labels
P2 We'll consider working on this in future. (Assignee optional) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug

Comments

@tali
tali commented Feb 18, 2022

Description of the problem:

Bazel's http_archive() does not follow an HTTPS redirect.
It reports a 404 Not Found, while wget sees a 302 Found and then follows the redirect to the new URL.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Try to build Gerrit when not all resources are in the cache.

What operating system are you running Bazel on?

Debian Linux (testing), but seeing the same problem on MacOS.

What's the output of bazel info release?

INFO: Invocation ID: 0439b56f-c214-4841-8aa1-34bd32721cc8
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=1 --terminal_columns=169
INFO: Reading rc options for 'info' from /home/martin/src/gerrit/.bazelrc:
  Inherited 'build' options: --workspace_status_command=python3 ./tools/workspace_status.py --repository_cache=~/.gerritcodereview/bazel-cache/repository --action_env=PATH --disk_cache=~/.gerritcodereview/bazel-cache/cas --java_language_version=11 --java_runtime_version=remotejdk_11 --tool_java_language_version=11 --tool_java_runtime_version=remotejdk_11 --incompatible_strict_action_env --announce_rc
release 5.0.0

What's the output of git remote get-url origin ; git rev-parse master ; git rev-parse HEAD ?

https://gerrit.googlesource.com/gerrit
547c9e46b9c81fb0192e82dc73d085cfa182cd88
547c9e46b9c81fb0192e82dc73d085cfa182cd88

Have you found anything relevant by searching the web?

Similar bug, but http->https redirect:

Any other information, logs, or outputs that you want to share?

$ bazel build //:gerrit
INFO: Invocation ID: c07bac0d-a149-4de0-968c-65b179dc2364
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=1 --terminal_columns=169
INFO: Reading rc options for 'build' from /home/martin/src/gerrit/.bazelrc:
  'build' options: --workspace_status_command=python3 ./tools/workspace_status.py --repository_cache=~/.gerritcodereview/bazel-cache/repository --action_env=PATH --disk_cache=~/.gerritcodereview/bazel-cache/cas --java_language_version=11 --java_runtime_version=remotejdk_11 --tool_java_language_version=11 --tool_java_runtime_version=remotejdk_11 --incompatible_strict_action_env --announce_rc
INFO: Repository com_google_protobuf instantiated at:
  /home/martin/src/gerrit/WORKSPACE:46:13: in <toplevel>
Repository rule http_archive defined at:
  /home/martin/.cache/bazel/_bazel_martin/c71cc174adfe80e040fccc5ca27de62e/external/bazel_tools/tools/build_defs/repo/http.bzl:364:31: in <toplevel>
WARNING: Download from https://github.com/protocolbuffers/protobuf/archive/v3.19.4.tar.gz failed: class java.io.FileNotFoundException GET returned 404 Not Found
ERROR: An error occurred during the fetch of repository 'com_google_protobuf':
   Traceback (most recent call last):
        File "/home/martin/.cache/bazel/_bazel_martin/c71cc174adfe80e040fccc5ca27de62e/external/bazel_tools/tools/build_defs/repo/http.bzl", line 111, column 45, in _http_archive_impl
                download_info = ctx.download_and_extract(
Error in download_and_extract: java.io.IOException: Error downloading [https://github.com/protocolbuffers/protobuf/archive/v3.19.4.tar.gz] to /home/martin/.cache/bazel/_bazel_martin/c71cc174adfe80e040fccc5ca27de62e/external/com_google_protobuf/temp14078988293593579239/v3.19.4.tar.gz: GET returned 404 Not Found
ERROR: /home/martin/src/gerrit/WORKSPACE:46:13: fetching http_archive rule //external:com_google_protobuf: Traceback (most recent call last):
        File "/home/martin/.cache/bazel/_bazel_martin/c71cc174adfe80e040fccc5ca27de62e/external/bazel_tools/tools/build_defs/repo/http.bzl", line 111, column 45, in _http_archive_impl
                download_info = ctx.download_and_extract(
Error in download_and_extract: java.io.IOException: Error downloading [https://github.com/protocolbuffers/protobuf/archive/v3.19.4.tar.gz] to /home/martin/.cache/bazel/_bazel_martin/c71cc174adfe80e040fccc5ca27de62e/external/com_google_protobuf/temp14078988293593579239/v3.19.4.tar.gz: GET returned 404 Not Found
ERROR: no such package '@com_google_protobuf//': java.io.IOException: Error downloading [https://github.com/protocolbuffers/protobuf/archive/v3.19.4.tar.gz] to /home/martin/.cache/bazel/_bazel_martin/c71cc174adfe80e040fccc5ca27de62e/external/com_google_protobuf/temp14078988293593579239/v3.19.4.tar.gz: GET returned 404 Not Found
INFO: Elapsed time: 0.346s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)

$ wget https://github.com/protocolbuffers/protobuf/archive/v3.19.4.tar.gz
--2022-02-18 11:01:56--  https://github.com/protocolbuffers/protobuf/archive/v3.19.4.tar.gz
Resolving github.com (github.com)... 140.82.121.3
Connecting to github.com (github.com)|140.82.121.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/protocolbuffers/protobuf/tar.gz/refs/tags/v3.19.4 [following]
--2022-02-18 11:01:56--  https://codeload.github.com/protocolbuffers/protobuf/tar.gz/refs/tags/v3.19.4
Resolving codeload.github.com (codeload.github.com)... 140.82.121.10
Connecting to codeload.github.com (codeload.github.com)|140.82.121.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5293745 (5.0M) [application/x-gzip]
Saving to: 'v3.19.4.tar.gz'

v3.19.4.tar.gz                             100%[=====================================================================================>]   5.05M  2.91MB/s    in 1.7s    

2022-02-18 11:01:58 (2.91 MB/s) - 'v3.19.4.tar.gz' saved [5293745/5293745]

$ curl -v -o /dev/null https://github.com/protocolbuffers/protobuf/archive/v3.19.4.tar.gz
> GET /protocolbuffers/protobuf/archive/v3.19.4.tar.gz HTTP/2
> Host: github.com   
> user-agent: curl/7.81.0
> accept: */*
>
< HTTP/2 302
< server: GitHub.com 
< date: Fri, 18 Feb 2022 10:12:17 GMT
< content-type: text/html; charset=utf-8
< vary: X-PJAX, X-PJAX-Container, Accept-Encoding, Accept, X-Requested-With
< permissions-policy: interest-cohort=()
< location: https://codeload.github.com/protocolbuffers/protobuf/tar.gz/refs/tags/v3.19.4
< cache-control: max-age=0, private
< strict-transport-security: max-age=31536000; includeSubdomains; preload
< x-frame-options: deny
< x-content-type-options: nosniff
< x-xss-protection: 0
< referrer-policy: no-referrer-when-downgrade
< expect-ct: max-age=2592000, report-uri="https://api.github.com/_private/browser/errors"
< content-security-policy: default-src 'none'; base-uri 'self'; block-all-mixed-content; child-src github.com/assets-cdn/worker/ gist.github.com/assets-cdn/worker/; connect-src 'self' uploads.github.com objects-origin.githubusercontent.com www.githubstatus.com collector.githubapp.com collector.github.com api.github.com github-cloud.s3.amazonaws.com github-production-repository-file-5c1aeb.s3.amazonaws.com github-production-upload-manifest-file-7fdce7.s3.amazonaws.com github-production-user-asset-6210df.s3.amazonaws.com cdn.optimizely.com logx.optimizely.com/v1/events translator.github.com wss://alive.github.com *.actions.githubusercontent.com wss://*.actions.githubusercontent.com online.visualstudio.com/api/v1/locations raw.githubusercontent.com github-production-repository-image-32fea6.s3.amazonaws.com github-production-release-asset-2e65be.s3.amazonaws.com insights.github.com; font-src github.githubassets.com; form-action 'self' github.com gist.github.com objects-origin.githubusercontent.com; frame-ancestors 'none'; frame-src render.githubusercontent.com viewscreen.githubusercontent.com notebooks.githubusercontent.com; img-src 'self' data: github.githubassets.com identicons.github.com collector.githubapp.com collector.github.com github-cloud.s3.amazonaws.com secured-user-images.githubusercontent.com/ *.githubusercontent.com; manifest-src 'self'; media-src github.com user-images.githubusercontent.com/; script-src github.githubassets.com; style-src 'unsafe-inline' github.githubassets.com; worker-src github.com/assets-cdn/worker/ gist.github.com/assets-cdn/worker/
< content-length: 143
< x-github-request-id: C51C:0BA9:5D1019:64573C:620F7101
<
@tali
Author

tali commented Feb 18, 2022

This problem seems to be related to my ~/.netrc file. When I remove it, then Bazel happily fetches all resources.
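
For anyone trying to confirm the same correlation, here is a minimal sketch for checking whether a netrc file has an entry for the hosts that appear in the wget output above (github.com and codeload.github.com). It relies only on Python's standard `netrc` module. The helper name `netrc_has_entry` is hypothetical, and this only tells you whether an entry exists; it does not reproduce what Bazel does with that entry.

```python
import netrc
import os


def netrc_has_entry(host, path=os.path.expanduser("~/.netrc")):
    """Return True if the netrc file at `path` has an explicit or default entry for `host`."""
    try:
        entries = netrc.netrc(path)
    except FileNotFoundError:
        return False  # no netrc file at this path
    # authenticators() returns None when neither a matching nor a default entry exists
    return entries.authenticators(host) is not None


if __name__ == "__main__":
    # Hosts taken from the wget output in this issue
    for host in ("github.com", "codeload.github.com"):
        print(host, netrc_has_entry(host))
```

If the first host reports an entry, temporarily moving the file aside (as described above) is a quick way to see whether the fetch then succeeds.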

@Bencodes
Contributor

We just ran into this issue as well and can confirm that removing the .netrc file allows http_archive to correctly follow HTTP 302 redirects.

@aiuto aiuto added team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. untriaged labels Feb 26, 2022
@Bencodes
Contributor

Bencodes commented Mar 1, 2022

I was able to root-cause this and put up a fix in #14922. Bazel does follow 302s, but it was changing the encoding of the redirect URL in subtle ways, so the rewritten URL might not return a 200 when requested.
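
To illustrate the general failure mode (this is not Bazel's code; Bazel's `mergeUrls` is Java, and the URL and helper name below are made up for the example), here is a rough Python sketch of how taking a URL apart into decoded components and putting it back together can yield a different URL than the one the server sent:

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def rebuild_from_decoded_parts(url):
    """Decode path and query, then re-encode them: a lossy 'rebuild from scratch'."""
    parts = urlsplit(url)
    # Decoding turns %2F into '/', and re-encoding cannot tell it was ever escaped
    path = quote(unquote(parts.path), safe="/")
    # Decoding turns %3D into '=', which is then treated as a plain separator
    query = quote(unquote(parts.query), safe="=&")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


if __name__ == "__main__":
    # Hypothetical signed redirect target; not a URL from this issue
    redirect = "https://example.com/a%2Fb?sig=x%2By%3D"
    print(redirect)
    print(rebuild_from_decoded_parts(redirect))
```

The rebuilt URL points at a different path and carries a different query string, so a server that validates a signature over the exact URL can reject it. Passing the redirect URL through untouched avoids the problem.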

@meteorcloudy meteorcloudy added P2 We'll consider working on this in future. (Assignee optional) type: bug and removed untriaged labels Mar 1, 2022
brentleyjones pushed a commit to brentleyjones/bazel that referenced this issue Mar 4, 2022
`mergeUrls` does not need to rebuild the URL from scratch if user information exists on the original URL. This behavior can actually break the 302 redirect due to subtle changes in the URL/encoding and should be avoided when possible.

This fixes bazelbuild#14866 by correcting the implementation of `mergeUrls` to match the documentation that was added instead of rebuilding the URL from scratch which breaks the encoding of signed URLs.

Closes bazelbuild#14922.

PiperOrigin-RevId: 431935885
(cherry picked from commit 8cefb8b)
Wyverald pushed a commit that referenced this issue Mar 4, 2022
`mergeUrls` does not need to rebuild the URL from scratch if user information exists on the original URL. This behavior can actually break the 302 redirect due to subtle changes in the URL/encoding and should be avoided when possible.

This fixes #14866 by correcting the implementation of `mergeUrls` to match the documentation that was added instead of rebuilding the URL from scratch which breaks the encoding of signed URLs.

Closes #14922.

PiperOrigin-RevId: 431935885
(cherry picked from commit 8cefb8b)

Co-authored-by: Benjamin Lee <ben@ben.cm>
@zirain

zirain commented Aug 28, 2023

This problem seems to be related to my ~/.netrc file. When I remove it, then Bazel happily fetches all resources.

Thanks bro, this saved my life.
