Too many UNKNOWN/ERROR type failures for Skaffold events #4692

Closed
nelango opened this issue Aug 18, 2020 · 7 comments
Assignees
Labels
area/errors, Epic, internal, kind/todo (implementation task/epic for the skaffold team), priority/p0 (Highest priority. We are actively looking at delivering it.)

Comments


nelango commented Aug 18, 2020

A large number of Skaffold event errors are being reported as UNKNOWN/ERROR. We need more descriptive error codes for Cloud Code metrics tracking and investigations.

Based on this dashboard https://dashboards.corp.google.com/edit/_dcb794d1_3a93_4405_8560_8230e2e5325d, we see:
image

On the VS Code side, we log the Skaffold version for the cloudcode.skaffold.session.end event. Filtering by version:

68% of error codes from v1.13.1 are UNKNOWN/ERROR
image

43% of error codes from v1.13.0 are UNKNOWN/ERROR
image
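
The per-version breakdown above can be sketched in a few lines of Go. This is only an illustration of the calculation, not the dashboard's actual query: the `sessionEnd` type, the `UNKNOWN_ERROR` label, and the sample events are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// sessionEnd is a hypothetical, simplified view of one session-end event.
type sessionEnd struct {
	SkaffoldVersion string
	ErrorCode       string // empty when the session ended without an error
}

// unknownShare returns, per Skaffold version, the fraction of failed
// sessions whose error code is in the given set of "unknown" codes.
func unknownShare(events []sessionEnd, unknown map[string]bool) map[string]float64 {
	failed := map[string]int{}
	unknownCount := map[string]int{}
	for _, e := range events {
		if e.ErrorCode == "" {
			continue // successful session, not part of the denominator
		}
		failed[e.SkaffoldVersion]++
		if unknown[e.ErrorCode] {
			unknownCount[e.SkaffoldVersion]++
		}
	}
	share := make(map[string]float64, len(failed))
	for version, n := range failed {
		share[version] = float64(unknownCount[version]) / float64(n)
	}
	return share
}

func main() {
	// Made-up sample events for illustration only; not data from the dashboard.
	events := []sessionEnd{
		{"v1.13.1", "UNKNOWN_ERROR"},
		{"v1.13.1", "BUILD_FAILED"},
		{"v1.13.1", ""},
		{"v1.13.0", "UNKNOWN_ERROR"},
		{"v1.13.0", "DEPLOY_FAILED"},
		{"v1.13.0", "DEPLOY_FAILED"},
	}
	unknown := map[string]bool{"UNKNOWN_ERROR": true}

	share := unknownShare(events, unknown)
	versions := make([]string, 0, len(share))
	for v := range share {
		versions = append(versions, v)
	}
	sort.Strings(versions) // stable output order
	for _, v := range versions {
		fmt.Printf("%s: %.0f%% of failed sessions are unknown\n", v, 100*share[v])
	}
}
```

Successful sessions are excluded from the denominator here, which matches the "% of error codes" framing; dividing by all sessions instead would give a different, smaller number.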

Author

nelango commented Aug 18, 2020

@briandealwis @tejal29 @sivakku as FYI

Member

tejal29 commented Aug 24, 2020

Thanks @go-nelango for opening this issue.

@tejal29 tejal29 added the area/errors, internal, priority/p0, and kind/todo labels Aug 24, 2020
@nkubala nkubala added this to the Backlog milestone Sep 1, 2020
@tejal29 tejal29 modified the milestones: Backlog [P0/P1], v1.16.0 Oct 8, 2020
Member

tejal29 commented Oct 13, 2020

We have issue #4645, which @PriyaModali is working on, to reduce DEPLOY_UNKNOWN.
@go-nelango, can you add some data on

  1. BUILD_UNKNOWN by builders?
  2. DEPLOY_UNKNOWN by deployers?

Member

tejal29 commented Oct 14, 2020

Handled BUILD_UNKNOWN -> Docker connectivity issues in #4914

Author

nelango commented Oct 15, 2020

Please note that the charts below are based on IntelliJ metrics from 9/1/2020 to 9/30/2020 with Skaffold version v1.13.2 and higher.
VS Code is currently not reporting the Skaffold Metaevent, and we are fixing it.

  1. BUILD_UNKNOWN by builders
     image

  2. DEPLOY_UNKNOWN by deployers
     image

@tejal29 tejal29 self-assigned this Oct 19, 2020
@tejal29 tejal29 added the Epic and kind/todo labels and removed the kind/todo label Oct 26, 2020
@nkubala nkubala modified the milestones: v1.16.0, v1.17.0 Nov 9, 2020
@briandealwis briandealwis modified the milestones: v1.17.0, v1.18.0 Nov 25, 2020
@tejal29 tejal29 removed this from the v1.18.0 milestone Jan 19, 2021
Member

tejal29 commented Jan 19, 2021

We have now reduced unknowns to 23% of all failures, or 3% of total sessions.
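
Read together, and assuming both percentages describe the same set of unknown-coded sessions, the two figures imply an overall failure rate of roughly 13% of sessions. Both inputs are rounded, so this is approximate:

```latex
\frac{\text{failures}}{\text{sessions}}
  = \frac{\text{unknowns}/\text{sessions}}{\text{unknowns}/\text{failures}}
  \approx \frac{0.03}{0.23}
  \approx 0.13
```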

Member

tejal29 commented Jan 19, 2021

Closing this in favor of #5246

@tejal29 tejal29 closed this as completed Jan 19, 2021