workflow-controller panics and crashes with empty workflow-controller-configmap #2066

Closed
4 tasks done
duboisf opened this issue Jan 27, 2020 · 3 comments

Comments

Contributor

duboisf commented Jan 27, 2020

Checklist:

  • I've included the version.
  • I've included reproduction steps.
  • I've included the workflow YAML.
  • I've included the logs.

What happened:

The workflow controller panics and crashes while running the hello-world workflow from the getting started instructions with argo 2.5.0-rc2. IMHO it should support an empty configmap and fall back to sensible defaults.

How to reproduce it (as minimally and precisely as possible):

$ minikube version
minikube version: v1.6.2
commit: 54f28ac5d3a815d1196cd5d57d707439ee4bb392
$ minikube start
(new kube 1.17 cluster gets created with kvm2 on linux)
$ kubectl create ns argo
$ kubectl apply -n argo -f https://raw.githubusercontent.com/argoproj/argo/v2.5.0-rc1/manifests/install.yaml
$ argo submit --watch https://raw.githubusercontent.com/argoproj/argo/master/examples/hello-world.yaml

Anything else we need to know?:

The panic happens in markWorkflowPhase (operator.go:1390 in the stack trace below) because the wfArchive of the WorkflowController is nil. The code seems to assume that the config key always exists in the workflow-controller-configmap. In this case it does not, so the WorkflowController.updateConfig method never instantiates wfArchive.
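
A minimal sketch of the fallback behaviour suggested above. It is not the actual Argo code: `archive`, `nullArchive`, `namedArchive`, `controller`, and this `updateConfig` are hypothetical stand-ins. The idea is only that a missing `config` key should leave a usable no-op default in place instead of a nil field.

```go
package main

import "fmt"

// archive is a hypothetical stand-in for the workflow archive dependency.
type archive interface {
	ArchiveWorkflow(name string) error
}

// nullArchive is a no-op default used when nothing is configured.
type nullArchive struct{}

func (nullArchive) ArchiveWorkflow(string) error { return nil }

// namedArchive is a hypothetical configured archive; it only records a backend name.
type namedArchive struct{ backend string }

func (a namedArchive) ArchiveWorkflow(name string) error {
	fmt.Printf("archiving %s to %s\n", name, a.backend)
	return nil
}

// controller is a simplified, hypothetical controller holding the archive.
type controller struct {
	wfArchive archive
}

// updateConfig always assigns wfArchive, so later calls never hit a nil interface.
func (c *controller) updateConfig(data map[string]string) {
	// Start from a safe default.
	c.wfArchive = nullArchive{}

	raw, ok := data["config"]
	if !ok || raw == "" {
		return // empty configmap: keep the default
	}
	// Real config parsing is out of scope for this sketch.
	c.wfArchive = namedArchive{backend: raw}
}

func main() {
	c := &controller{}
	c.updateConfig(map[string]string{}) // configmap without a "config" key
	if err := c.wfArchive.ArchiveWorkflow("hello-world"); err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println("archive call succeeded with an empty configmap")
}
```

An alternative would be a nil check at every call site, but assigning a no-op default in one place keeps the callers unchanged.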

Environment:

  • Argo version:
    This is argo v2.5.0-rc2; the version info of the downloaded binary seems to be empty:
$ argo version
argo: v0.0.0+unknown
  BuildDate: 1970-01-01T00:00:00Z
  GitCommit: 
  GitTreeState: 
  GoVersion: go1.13.4
  Compiler: gc
  Platform: linux/amd64
  • Kubernetes version :
$ kubectl version -o yaml
clientVersion:
  buildDate: "2019-12-07T21:20:10Z"
  compiler: gc
  gitCommit: 70132b0f130acc0bed193d9ba59dd186f0e634cf
  gitTreeState: clean
  gitVersion: v1.17.0
  goVersion: go1.13.4
  major: "1"
  minor: "17"
  platform: linux/amd64
serverVersion:
  buildDate: "2019-12-07T21:12:17Z"
  compiler: gc
  gitCommit: 70132b0f130acc0bed193d9ba59dd186f0e634cf
  gitTreeState: clean
  gitVersion: v1.17.0
  goVersion: go1.13.4
  major: "1"
  minor: "17"
  platform: linux/amd64

Other debugging information (if applicable):

  • workflow result:
argo get <workflowname>
Name:                hello-world-x6rcl
Namespace:           argo
ServiceAccount:      default
Status:              Error
Message:             runtime error: invalid memory address or nil pointer dereference
Created:             Mon Jan 27 08:05:37 -0500 (3 seconds ago)
Started:             Mon Jan 27 08:05:37 -0500 (3 seconds ago)
Finished:            Mon Jan 27 08:05:40 -0500 (now)
Duration:            3 seconds

STEP                             PODNAME            DURATION  MESSAGE
 ✔ hello-world-x6rcl (whalesay)  hello-world-x6rcl  2s
  • executor logs:
$ kubectl logs hello-world-x6rcl -c main
 _____________
< hello world >
 -------------
    \
     \
      \
                    ##        .
              ## ## ##       ==
           ## ## ## ##      ===
       /""""""""""""""""___/ ===
  ~~~ {~~ ~~~~ ~~~ ~~~~ ~~ ~ /  ===- ~~~
       \______ o          __/
        \    \        __/
          \____\______/

$ kubectl logs <failedpodname> -c wait
  • workflow-controller logs:
$ kubectl logs -p workflow-controller-794f4484df-7j7bk
time="2020-01-27T13:03:43Z" level=warning msg="ConfigMap 'workflow-controller-configmap' does not have key 'config'"
time="2020-01-27T13:03:43Z" level=info msg="Starting CronWorkflow controller"
time="2020-01-27T13:03:43Z" level=info msg="Workflow Controller (version: v0.0.0+unknown) starting"
time="2020-01-27T13:03:43Z" level=info msg="Workers: workflow: 8, pod: 8"
time="2020-01-27T13:03:43Z" level=info msg="Watch Workflow controller config map updates"
time="2020-01-27T13:03:43Z" level=info msg="Performing periodic GC every 5m0s"
time="2020-01-27T13:03:43Z" level=info msg="Starting workflow TTL controller (resync 20m0s)"
time="2020-01-27T13:03:43Z" level=info msg="Detected ConfigMap update. Updating the controller config."
time="2020-01-27T13:03:43Z" level=warning msg="ConfigMap 'workflow-controller-configmap' does not have key 'config'"
time="2020-01-27T13:03:43Z" level=info msg="Started workflow TTL worker"
time="2020-01-27T13:05:37Z" level=info msg="Processing workflow" namespace=argo workflow=hello-world-x6rcl
time="2020-01-27T13:05:37Z" level=info msg="Updated phase  -> Running" namespace=argo workflow=hello-world-x6rcl
time="2020-01-27T13:05:37Z" level=info msg="Pod node {hello-world-x6rcl hello-world-x6rcl hello-world-x6rcl Pod whalesay nil    Pending   2020-01-27 13:05:37.815442569 +0000 UTC 0001-01-01 00:00:00 +0000 UTC  <nil> nil nil [] []} initialized Pending" namespace=argo workflow=hello-world-x6rcl
time="2020-01-27T13:05:37Z" level=info msg="Created pod: hello-world-x6rcl (hello-world-x6rcl)" namespace=argo workflow=hello-world-x6rcl
time="2020-01-27T13:05:37Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=6461 workflow=hello-world-x6rcl
time="2020-01-27T13:05:38Z" level=info msg="Processing workflow" namespace=argo workflow=hello-world-x6rcl
time="2020-01-27T13:05:38Z" level=info msg="Updating node &NodeStatus{ID:hello-world-x6rcl,Name:hello-world-x6rcl,DisplayName:hello-world-x6rcl,Type:Pod,TemplateName:whalesay,TemplateRef:nil,Phase:Pending,BoundaryID:,Message:,StartedAt:2020-01-27 13:05:37 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:,} message: ContainerCreating"
time="2020-01-27T13:05:38Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=6471 workflow=hello-world-x6rcl
time="2020-01-27T13:05:39Z" level=info msg="Processing workflow" namespace=argo workflow=hello-world-x6rcl
time="2020-01-27T13:05:39Z" level=info msg="Updating node &NodeStatus{ID:hello-world-x6rcl,Name:hello-world-x6rcl,DisplayName:hello-world-x6rcl,Type:Pod,TemplateName:whalesay,TemplateRef:nil,Phase:Pending,BoundaryID:,Message:ContainerCreating,StartedAt:2020-01-27 13:05:37 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:,} status Pending -> Running"
time="2020-01-27T13:05:39Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=6476 workflow=hello-world-x6rcl
time="2020-01-27T13:05:40Z" level=info msg="Processing workflow" namespace=argo workflow=hello-world-x6rcl
time="2020-01-27T13:05:40Z" level=info msg="Updating node &NodeStatus{ID:hello-world-x6rcl,Name:hello-world-x6rcl,DisplayName:hello-world-x6rcl,Type:Pod,TemplateName:whalesay,TemplateRef:nil,Phase:Running,BoundaryID:,Message:,StartedAt:2020-01-27 13:05:37 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:,} status Running -> Succeeded"
time="2020-01-27T13:05:40Z" level=info msg="Updated phase Running -> Succeeded" namespace=argo workflow=hello-world-x6rcl
time="2020-01-27T13:05:40Z" level=info msg="Marking workflow completed" namespace=argo workflow=hello-world-x6rcl
time="2020-01-27T13:05:40Z" level=info msg="Updated phase Succeeded -> Error" namespace=argo workflow=hello-world-x6rcl
time="2020-01-27T13:05:40Z" level=info msg="Updated message  -> runtime error: invalid memory address or nil pointer dereference" namespace=argo workflow=hello-world-x6rcl
time="2020-01-27T13:05:40Z" level=info msg="Marking workflow completed" namespace=argo workflow=hello-world-x6rcl
time="2020-01-27T13:05:40Z" level=info msg="Checking daemoned children of " namespace=argo workflow=hello-world-x6rcl
time="2020-01-27T13:05:40Z" level=info msg="Workflow update successful" namespace=argo phase=Error resourceVersion=6482 workflow=hello-world-x6rcl
E0127 13:05:41.929048       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 146 [running]:
github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/runtime.logPanic(0x16997e0, 0x2a05180)
	/go/src/github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0xaa
github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/src/github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x82
panic(0x16997e0, 0x2a05180)
	/usr/local/go/src/runtime/panic.go:513 +0x1b9
github.com/argoproj/argo/workflow/controller.(*wfOperationCtx).markWorkflowPhase(0xc000463540, 0x18a05f8, 0x5, 0xc000616a01, 0xc000616a30, 0x1, 0x1)
	/go/src/github.com/argoproj/argo/workflow/controller/operator.go:1390 +0x56a
github.com/argoproj/argo/workflow/controller.(*wfOperationCtx).markWorkflowError(0xc000463540, 0x1a9a980, 0x2a05180, 0x1a9a901)
	/go/src/github.com/argoproj/argo/workflow/controller/operator.go:1421 +0x99
github.com/argoproj/argo/workflow/controller.(*wfOperationCtx).operate.func2(0xc000463540)
	/go/src/github.com/argoproj/argo/workflow/controller/operator.go:151 +0xb1
panic(0x16997e0, 0x2a05180)
	/usr/local/go/src/runtime/panic.go:513 +0x1b9
github.com/argoproj/argo/workflow/controller.(*wfOperationCtx).markWorkflowPhase(0xc000463540, 0x18a5f5c, 0x9, 0x201, 0x0, 0x0, 0x0)
	/go/src/github.com/argoproj/argo/workflow/controller/operator.go:1390 +0x56a
github.com/argoproj/argo/workflow/controller.(*wfOperationCtx).markWorkflowSuccess(0xc000463540)
	/go/src/github.com/argoproj/argo/workflow/controller/operator.go:1413 +0x56
github.com/argoproj/argo/workflow/controller.(*wfOperationCtx).operate(0xc000463540)
	/go/src/github.com/argoproj/argo/workflow/controller/operator.go:303 +0x13dd
github.com/argoproj/argo/workflow/controller.(*WorkflowController).processNextItem(0xc0004f4fc0, 0xc0005cc800)
	/go/src/github.com/argoproj/argo/workflow/controller/controller.go:366 +0x54d
github.com/argoproj/argo/workflow/controller.(*WorkflowController).runWorker(0xc0004f4fc0)
	/go/src/github.com/argoproj/argo/workflow/controller/controller.go:290 +0x2b
github.com/argoproj/argo/workflow/controller.(*WorkflowController).runWorker-fm()
	/go/src/github.com/argoproj/argo/workflow/controller/controller.go:184 +0x2a
github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc0005b1950)
	/go/src/github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x54
github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0005b1950, 0x3b9aca00, 0x0, 0x195c201, 0xc0003845a0)
	/go/src/github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xbe
github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc0005b1950, 0x3b9aca00, 0xc0003845a0)
	/go/src/github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by github.com/argoproj/argo/workflow/controller.(*WorkflowController).Run
	/go/src/github.com/argoproj/argo/workflow/controller/controller.go:184 +0x7d3
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x14eacda]

goroutine 146 [running]:
github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/src/github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0x108
panic(0x16997e0, 0x2a05180)
	/usr/local/go/src/runtime/panic.go:513 +0x1b9
github.com/argoproj/argo/workflow/controller.(*wfOperationCtx).markWorkflowPhase(0xc000463540, 0x18a05f8, 0x5, 0xc000616a01, 0xc000616a30, 0x1, 0x1)
	/go/src/github.com/argoproj/argo/workflow/controller/operator.go:1390 +0x56a
github.com/argoproj/argo/workflow/controller.(*wfOperationCtx).markWorkflowError(0xc000463540, 0x1a9a980, 0x2a05180, 0x1a9a901)
	/go/src/github.com/argoproj/argo/workflow/controller/operator.go:1421 +0x99
github.com/argoproj/argo/workflow/controller.(*wfOperationCtx).operate.func2(0xc000463540)
	/go/src/github.com/argoproj/argo/workflow/controller/operator.go:151 +0xb1
panic(0x16997e0, 0x2a05180)
	/usr/local/go/src/runtime/panic.go:513 +0x1b9
github.com/argoproj/argo/workflow/controller.(*wfOperationCtx).markWorkflowPhase(0xc000463540, 0x18a5f5c, 0x9, 0x201, 0x0, 0x0, 0x0)
	/go/src/github.com/argoproj/argo/workflow/controller/operator.go:1390 +0x56a
github.com/argoproj/argo/workflow/controller.(*wfOperationCtx).markWorkflowSuccess(0xc000463540)
	/go/src/github.com/argoproj/argo/workflow/controller/operator.go:1413 +0x56
github.com/argoproj/argo/workflow/controller.(*wfOperationCtx).operate(0xc000463540)
	/go/src/github.com/argoproj/argo/workflow/controller/operator.go:303 +0x13dd
github.com/argoproj/argo/workflow/controller.(*WorkflowController).processNextItem(0xc0004f4fc0, 0xc0005cc800)
	/go/src/github.com/argoproj/argo/workflow/controller/controller.go:366 +0x54d
github.com/argoproj/argo/workflow/controller.(*WorkflowController).runWorker(0xc0004f4fc0)
	/go/src/github.com/argoproj/argo/workflow/controller/controller.go:290 +0x2b
github.com/argoproj/argo/workflow/controller.(*WorkflowController).runWorker-fm()
	/go/src/github.com/argoproj/argo/workflow/controller/controller.go:184 +0x2a
github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc0005b1950)
	/go/src/github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x54
github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0005b1950, 0x3b9aca00, 0x0, 0x195c201, 0xc0003845a0)
	/go/src/github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xbe
github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc0005b1950, 0x3b9aca00, 0xc0003845a0)
	/go/src/github.com/argoproj/argo/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by github.com/argoproj/argo/workflow/controller.(*WorkflowController).Run
	/go/src/github.com/argoproj/argo/workflow/controller/controller.go:184 +0x7d3
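
The trace shows markWorkflowPhase panicking twice: first via markWorkflowSuccess, then again via markWorkflowError, which is called from the deferred function in operate. A minimal sketch of that pattern, using hypothetical names (`store`, `markPhase`, `operate`, `panics`) rather than the Argo code: a deferred recovery handler that calls back into the same nil-dependent function panics again, so the recovery itself fails.

```go
package main

import "fmt"

// store is a hypothetical dependency that may be left nil.
type store struct{ calls int }

// save dereferences the receiver, so it panics when s is nil.
func (s *store) save() { s.calls++ }

// markPhase depends on s and panics if s is nil.
func markPhase(s *store, phase string) {
	s.save()
	fmt.Println("phase:", phase)
}

// operate tries to record an error phase after recovering from a panic.
func operate(s *store) {
	defer func() {
		if r := recover(); r != nil {
			// This goes through the same nil dependency and panics again.
			markPhase(s, "Error")
		}
	}()
	markPhase(s, "Succeeded")
}

// panics reports whether f panicked.
func panics(f func()) (panicked bool) {
	defer func() { panicked = recover() != nil }()
	f()
	return false
}

func main() {
	fmt.Println("nil store panics:", panics(func() { operate(nil) }))
	fmt.Println("non-nil store panics:", panics(func() { operate(&store{}) }))
}
```

With a nil store the second panic escapes operate even though the first one was recovered, which matches the nested panic frames in the log above.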


Message from the maintainers:

If you are impacted by this bug please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

duboisf added a commit to duboisf/argo that referenced this issue Jan 27, 2020
@simster7
Member

This seems like a bug, @alexec?

@alexec alexec self-assigned this Jan 27, 2020
Contributor

alexec commented Jan 27, 2020

I'll take this one.

@simster7
Member

Closed with #2070?
