Intermittent POST .../api/v0/available_products: EOF when uploading tile #240

Closed
ljfranklin opened this issue Aug 30, 2018 · 8 comments

Comments

@ljfranklin
Contributor

We've seen the following error 22 times in the last 45 days:

processing product
beginning product upload to Ops Manager
 10.64 GiB / 12.60 GiB [==================================>------]  84.44% 1m34s
could not execute "upload-product": failed to upload product: could not make api request to available_products endpoint: Post https://pcf-optional.abalone-scale.gcp.releng.cf-app.com/api/v0/available_products: EOF

We've seen this across multiple tiles (PAS and PASW), IaaSes (GCP, Azure, AWS), and OpsMgr versions (2.0-2.3). We initially suspected a recent PR that made file uploads more performant, but we saw the error a few times before that PR was merged.

One potential solution would be to retry on Temporary() networking errors. The stdlib networking packages often return a net.Error, which has a Temporary() method you can use to check whether an error instance might be retryable: https://golang.org/pkg/net/#Error. We could add retry logic for these errors in the http client in om: https://github.com/pivotal-cf/om/blob/4d5f262bb6a1006f1e2af2754ee4e24707b5e4f3/network/unauthenticated_client.go. We suspect EOF is a temporary error, but we're not positive.

@cf-gitbot
Member

We have created an issue in Pivotal Tracker to manage this. Unfortunately, the Pivotal Tracker project is private so you may be unable to view the contents of the story.

The labels on this github issue will be updated when the story is started.

@ljfranklin
Contributor Author

Merged in a PR to add retries on networking failures here; this feature is available in om v0.41.0. I'm going to optimistically close this out, but will re-open if we still see the error.

@heycait

heycait commented Sep 5, 2018

This is still an issue. Currently seeing it in IST2.0 and ERT next on GCP.

@heycait heycait reopened this Sep 5, 2018
@pivotal-cf pivotal-cf deleted a comment from cf-gitbot Sep 5, 2018
@fredwangwang
Member

This has been identified as a network issue, and we found that the retry client probably won't work for the upload case, since the payload would not be complete on the second try.

@ljfranklin are we able to close this issue?
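
To illustrate the payload problem described above: an HTTP request body is a stream, so once a failed attempt has read part of it, resending the same request sends only the remainder. One possible approach, sketched here in Go under assumptions, is to build a fresh body for every attempt. postWithFreshBody, newBody, and demo are hypothetical names, not om's actual implementation, and the local test server only simulates a dropped connection.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// postWithFreshBody is a hypothetical helper, not part of om. It calls
// newBody before every attempt so each retry sends the complete payload
// instead of the remainder of a half-read stream. attempts must be >= 1.
func postWithFreshBody(client *http.Client, url string, newBody func() (io.ReadCloser, error), attempts int) (*http.Response, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		body, err := newBody()
		if err != nil {
			return nil, err
		}
		req, err := http.NewRequest(http.MethodPost, url, body)
		if err != nil {
			body.Close()
			return nil, err
		}
		resp, err := client.Do(req)
		if err == nil {
			return resp, nil
		}
		lastErr = err // the transport closes the request body on failure
	}
	return nil, fmt.Errorf("upload failed after %d attempts: %w", attempts, lastErr)
}

// demo runs the helper against a local test server that drops the first
// connection and, on later requests, replies with the number of bytes read.
func demo() (string, error) {
	var calls int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(&calls, 1) == 1 {
			// Simulate the failure: close the connection without responding.
			if hj, ok := w.(http.Hijacker); ok {
				if conn, _, err := hj.Hijack(); err == nil {
					conn.Close()
				}
			}
			return
		}
		data, _ := io.ReadAll(r.Body)
		fmt.Fprint(w, len(data))
	}))
	defer srv.Close()

	payload := []byte("tile-bytes") // stand-in for a product file
	newBody := func() (io.ReadCloser, error) {
		return io.NopCloser(bytes.NewReader(payload)), nil
	}
	resp, err := postWithFreshBody(srv.Client(), srv.URL, newBody, 3)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	got, err := io.ReadAll(resp.Body)
	return string(got), err
}

func main() {
	got, err := demo()
	fmt.Println("bytes received on retry:", got, "err:", err)
}
```

For a real product file, newBody could re-open the file with os.Open on each call. Re-sending a multi-gigabyte payload from the start is expensive, which is one reason to cap the number of attempts.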

@ljfranklin
Contributor Author

@fredwangwang agreed, the initial retry client PR doesn't fix the issue. But we still see it frequently in our CI and would like to fix it. I moved our story up in our backlog to take another look: https://www.pivotaltracker.com/story/show/160181845

@ljfranklin
Contributor Author

Additional context in this slack convo: https://pivotal.slack.com/archives/C5V956L13/p1539014781000100

ljfranklin added a commit that referenced this issue Dec 4, 2018
[#160181845] pivotal-cf/om #240: Intermittent `POST .../api/v0/available_products: EOF` when uploading tile
ljfranklin added a commit that referenced this issue Dec 10, 2018
[#160181845] pivotal-cf/om #240: Intermittent `POST .../api/v0/available_products: EOF` when uploading tile
jtarchie pushed a commit that referenced this issue Dec 19, 2018
[#160181845] pivotal-cf/om #240: Intermittent `POST .../api/v0/available_products: EOF` when uploading tile
@eitansuez

Perhaps it's good that retry logic was added, but what if it fails three times in a row?

@jtarchie
Contributor

@eitansuez: You have to retry. At some point we have to call it on the number of things we can attempt to do.
