Che-server fails to bootstrap on Kubernetes (kubeadm setup) #16767

Closed
4 of 23 tasks
Roshani30 opened this issue Apr 27, 2020 · 13 comments
Labels
area/install Issues related to installation, including offline/air gap and initial setup kind/bug Outline of a bug - must adhere to the bug report template. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. severity/P1 Has a major impact to usage or development of the system.

Comments

@Roshani30

Roshani30 commented Apr 27, 2020

I am trying to install Eclipse Che on my Kubernetes cluster, but it is failing at "Che Pod Bootstrap".
Kubernetes: v1.18.1
helm: v3
chectl/7.12.0 linux-x64 node-v10.20.1

Describe the bug

Che version

  • latest
  • nightly
  • other: please specify

Steps to reproduce

Expected behavior

Runtime

  • kubernetes (include output of kubectl version)
  • Openshift (include output of oc version)
  • minikube (include output of minikube version and kubectl version)
  • minishift (include output of minishift version and oc version)
  • docker-desktop + K8S (include output of docker version and kubectl version)
  • other: (please specify)

Screenshots

Installation method

  • chectl
  • che-operator
  • minishift-addon
  • I don't know

Environment

  • my computer
    • Windows
    • Linux
    • macOS
  • Cloud
    • Amazon
    • Azure
    • GCE
    • other (please specify)
  • other: please specify

Eclipse Che Logs

Additional context

chectl server:start --platform=k8s --installer=operator --domain=10.101.52.51.nip.io --self-signed-cert
Set current context to 'kubernetes-admin@kubernetes'
  ✔ Verify Kubernetes API...OK
  ✔ 👀  Looking for an already existing Eclipse Che instance
    ✔ Verify if Eclipse Che is deployed into namespace "che"...it is not
  ✔ ✈️  Kubernetes preflight checklist
    ✔ Verify if kubectl is installed
    ✔ Verify remote kubernetes status...done.
    ✔ Check Kubernetes version: Found v1.18.2.
    ✔ Verify domain is set...set to 10.101.52.51.nip.io.
    ↓ Check if cluster accessible [skipped]
Eclipse Che logs will be available in '/tmp/chectl-logs/1587977910823'
  ✔ Start following logs
    ✔ Start following Operator logs...done
    ✔ Start following Eclipse Che logs...done
    ✔ Start following Postgres logs...done
    ✔ Start following Keycloak logs...done
    ✔ Start following Plugin registry logs...done
    ✔ Start following Devfile registry logs...done
  ✔ Start following events
    ✔ Start following namespace events...done
  ✔ 🏃‍  Running the Eclipse Che operator
    ✔ Copying operator resources...done.
    ✔ Create Namespace (che)...It already exists.
    ✔ Checking for pre-created TLS secret... "che-tls" secret found
    ↓ Checking certificate [skipped]
    ✔ Create ServiceAccount che-operator in namespace che...done.
    ✔ Create Role che-operator in namespace che...done.
    ✔ Create ClusterRole che-operator...It already exists.
    ✔ Create RoleBinding che-operator in namespace che...done.
    ✔ Create ClusterRoleBinding che-operator...It already exists.
    ✔ Create CRD checlusters.org.eclipse.che...It already exists.
    ✔ Waiting 5 seconds for the new Kubernetes resources to get flushed...done.
    ✔ Create deployment che-operator in namespace che...done.
    ✔ Create Eclipse Che cluster eclipse-che in namespace che...done.
  ❯ ✅  Post installation checklist
    ✔ PostgreSQL pod bootstrap
      ✔ scheduling...done.
      ✔ downloading images...done.
      ✔ starting...done.
    ❯ Keycloak pod bootstrap
      ✖ scheduling
        → ERR_TIMEOUT: Timeout set to pod wait timeout 300000. podExist: false, currentPhase: undefined
        downloading images
        starting
      Devfile registry pod bootstrap
      Plugin registry pod bootstrap
      Eclipse Che pod bootstrap
      Retrieving Eclipse Che server URL
      Eclipse Che status check
 ›   Error: Error: ERR_TIMEOUT: Timeout set to pod wait timeout 300000. podExist: false, currentPhase: undefined
 ›   Installation failed, check logs in '/tmp/chectl-logs/1587977910823'
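The key detail in the output above is the ERR_TIMEOUT message itself, which reports `podExist: false` and `currentPhase: undefined`. A minimal Python sketch, assuming the message keeps exactly the format printed here (the name `parse_timeout` is made up for illustration), pulls those fields out so they can be checked in a script:

```python
import re

# Pattern for the chectl message quoted above; the field names come from that message.
TIMEOUT_RE = re.compile(
    r"ERR_TIMEOUT: Timeout set to pod wait timeout (?P<ms>\d+)\. "
    r"podExist: (?P<exists>true|false), currentPhase: (?P<phase>\w+)"
)


def parse_timeout(line):
    """Return the timeout in seconds, whether a pod was seen, and its phase."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    return {
        "timeout_s": int(match["ms"]) / 1000,
        "pod_exists": match["exists"] == "true",
        # chectl prints "undefined" when it has no phase to report.
        "phase": None if match["phase"] == "undefined" else match["phase"],
    }


line = (
    "ERR_TIMEOUT: Timeout set to pod wait timeout 300000. "
    "podExist: false, currentPhase: undefined"
)
info = parse_timeout(line)
print(info)
```

If `pod_exists` is false, one plausible reading (an assumption, not something chectl states) is that no Keycloak pod had been created at all when the wait ended, in which case a longer wait alone may not change the outcome.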

========================================================
Deployment created under Che namespace

root@osboxes:/usr/local/bin/csi-driver-host-path# kubectl get all -n che
NAME                                READY   STATUS    RESTARTS   AGE
pod/che-operator-59c9b8cb9b-qvq8m   1/1     Running   0          40m
pod/postgres-666949c4d4-vbrnz       1/1     Running   0          40m

NAME               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/che-host   ClusterIP   10.103.69.126   <none>        8080/TCP   39m
service/postgres   ClusterIP   10.107.114.14   <none>        5432/TCP   40m

NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/che-operator   1/1     1            1           40m
deployment.apps/postgres       1/1     1            1           40m

NAME                                      DESIRED   CURRENT   READY   AGE
replicaset.apps/che-operator-59c9b8cb9b   1         1         1       40m
replicaset.apps/postgres-666949c4d4       1         1         1       40m

I am new to this kind of deployment work, so please help me resolve this.

@Roshani30 Roshani30 added the kind/question Questions that haven't been identified as being feature requests or bugs. label Apr 27, 2020
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Apr 27, 2020
@tolusha
Contributor

tolusha commented Apr 27, 2020

It seems to be a connectivity problem.
Try to deploy with --k8spodwaittimeout=600000
The che-operator logs would also be useful to check: chectl server:logs

@Roshani30
Author

Still no luck

chectl server:start --platform=k8s --installer=operator --domain=10.101.52.51.nip.io --k8spodwaittimeout=600000
Set current context to 'kubernetes-admin@kubernetes'
  ✔ Verify Kubernetes API...OK
  ✔ 👀 Looking for an already existing Eclipse Che instance
    ✔ Verify if Eclipse Che is deployed into namespace "che"...it is not
  ✈️ Kubernetes preflight checklist
    ✔ Verify if kubectl is installed
    ✔ Verify remote kubernetes status...done.
    ✔ Check Kubernetes version: Found v1.18.2.
    ✔ Verify domain is set...set to 10.101.52.51.nip.io.
    ↓ Check if cluster accessible [skipped]
Eclipse Che logs will be available in '/tmp/chectl-logs/1587991270828'
  ✔ Start following logs
    ✔ Start following Operator logs...done
    ✔ Start following Eclipse Che logs...done
    ✔ Start following Postgres logs...done
    ✔ Start following Keycloak logs...done
    ✔ Start following Plugin registry logs...done
    ✔ Start following Devfile registry logs...done
  ✔ Start following events
    ✔ Start following namespace events...done
› Warning: Self-signed certificate is used, so "--self-signed-cert" option is required. Added automatically.
  ✔ 🏃‍ Running the Eclipse Che operator
    ✔ Copying operator resources...done.
    ✔ Create Namespace (che)...It already exists.
    ✔ Checking for pre-created TLS secret... "che-tls" secret found
    ✔ Checking certificate... self-signed
    ✔ Create ServiceAccount che-operator in namespace che...done.
    ✔ Create Role che-operator in namespace che...done.
    ✔ Create ClusterRole che-operator...It already exists.
    ✔ Create RoleBinding che-operator in namespace che...done.
    ✔ Create ClusterRoleBinding che-operator...It already exists.
    ✔ Create CRD checlusters.org.eclipse.che...It already exists.
    ✔ Waiting 5 seconds for the new Kubernetes resources to get flushed...done.
    ✔ Create deployment che-operator in namespace che...done.
    ✔ Create Eclipse Che cluster eclipse-che in namespace che...done.
  ❯ ✅ Post installation checklist
    ✔ PostgreSQL pod bootstrap
      ✔ scheduling...done.
      ✔ downloading images...done.
      ✔ starting...done.
    ❯ Keycloak pod bootstrap
      ✖ scheduling
        → ERR_TIMEOUT: Timeout set to pod wait timeout 600000. podExist: false, currentPhase: undefined
        downloading images
        starting
      Devfile registry pod bootstrap
      Plugin registry pod bootstrap
      Eclipse Che pod bootstrap
      Retrieving Eclipse Che server URL
      Eclipse Che status check
› Error: Error: ERR_TIMEOUT: Timeout set to pod wait timeout 600000. podExist: false, currentPhase: undefined
› Installation failed, check logs in '/tmp/chectl-logs/1587991270828'


chectl server:logs
Set current context to 'kubernetes-admin@kubernetes'
Eclipse Che logs will be available in '/tmp/chectl-logs/1587992139479'
✔ Verify Kubernetes API...OK
✔ Verify if namespace 'che' exists
✔ Read Operator logs...done
✔ Read Eclipse Che logs...done
✔ Read Postgres logs...done
✔ Read Keycloak logs...done
✔ Read Plugin registry logs...done
✔ Read Devfile registry logs...done
✔ Read namespace events...done
Command server:logs has completed successfully.

@tolusha
Contributor

tolusha commented Apr 27, 2020

Please attach the logs from
'/tmp/chectl-logs/1587992139479'

@Roshani30
Author

event logs:
/tmp/chectl-logs/1587992139479/che# cat events.txt
LAST SEEN TYPE REASON OBJECT MESSAGE
Normal Scheduled pod/che-operator-59c9b8cb9b-n8jdf Successfully assigned che/che-operator-59c9b8cb9b-n8jdf to osboxes
14m Normal Pulling pod/che-operator-59c9b8cb9b-n8jdf Pulling image "quay.io/eclipse/che-operator:7.12.0"
13m Normal Pulled pod/che-operator-59c9b8cb9b-n8jdf Successfully pulled image "quay.io/eclipse/che-operator:7.12.0"
13m Normal Created pod/che-operator-59c9b8cb9b-n8jdf Created container che-operator
13m Normal Started pod/che-operator-59c9b8cb9b-n8jdf Started container che-operator
14m Normal SuccessfulCreate replicaset/che-operator-59c9b8cb9b Created pod: che-operator-59c9b8cb9b-n8jdf
14m Normal ScalingReplicaSet deployment/che-operator Scaled up replica set che-operator-59c9b8cb9b to 1
Warning FailedScheduling pod/postgres-5b96747d54-2n2jq running "VolumeBinding" filter plugin for pod "postgres-5b96747d54-2n2jq": pod has unbound immediate PersistentVolumeClaims
Warning FailedScheduling pod/postgres-5b96747d54-2n2jq running "VolumeBinding" filter plugin for pod "postgres-5b96747d54-2n2jq": pod has unbound immediate PersistentVolumeClaims
Normal Scheduled pod/postgres-5b96747d54-2n2jq Successfully assigned che/postgres-5b96747d54-2n2jq to osboxes
13m Normal SuccessfulAttachVolume pod/postgres-5b96747d54-2n2jq AttachVolume.Attach succeeded for volume "pvc-6c1bcaf2-7491-4a79-8edc-279dcd165cd7"
13m Normal Pulled pod/postgres-5b96747d54-2n2jq Container image "centos/postgresql-96-centos7:9.6" already present on machine
13m Normal Created pod/postgres-5b96747d54-2n2jq Created container postgres
13m Normal Started pod/postgres-5b96747d54-2n2jq Started container postgres
13m Normal SuccessfulCreate replicaset/postgres-5b96747d54 Created pod: postgres-5b96747d54-2n2jq
13m Normal ExternalProvisioning persistentvolumeclaim/postgres-data waiting for a volume to be created, either by external provisioner "hostpath.csi.k8s.io" or manually created by system administrator
13m Normal Provisioning persistentvolumeclaim/postgres-data External provisioner is provisioning volume for claim "che/postgres-data"
13m Normal ProvisioningSucceeded persistentvolumeclaim/postgres-data Successfully provisioned volume pvc-6c1bcaf2-7491-4a79-8edc-279dcd165cd7
13m Normal ScalingReplicaSet deployment/postgres Scaled up replica set postgres-5b96747d54 to 1
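In the event list above, the only Warning rows are the postgres volume-binding ones, and no row refers to a Keycloak object. A rough Python sketch for filtering such a dump, assuming the flattened, whitespace-separated layout shown here (the helper names `parse_event` and `warning_events` are illustrative, not part of chectl or kubectl):

```python
EVENT_TYPES = ("Normal", "Warning")


def parse_event(line):
    """Split one flattened events.txt row into its columns."""
    tokens = line.split()
    # Some rows above carry no LAST SEEN value, so the type comes first.
    last_seen = None
    if tokens[0] not in EVENT_TYPES:
        last_seen = tokens.pop(0)
    event_type, reason, obj = tokens[:3]
    return {
        "last_seen": last_seen,
        "type": event_type,
        "reason": reason,
        "object": obj,
        "message": " ".join(tokens[3:]),
    }


def warning_events(lines):
    """Keep only the rows whose TYPE column is Warning, skipping blank lines."""
    events = (parse_event(line) for line in lines if line.strip())
    return [event for event in events if event["type"] == "Warning"]


# Rows copied from the events.txt output above.
sample = [
    "13m Normal Started pod/che-operator-59c9b8cb9b-n8jdf Started container che-operator",
    'Warning FailedScheduling pod/postgres-5b96747d54-2n2jq running "VolumeBinding" '
    'filter plugin for pod "postgres-5b96747d54-2n2jq": pod has unbound immediate '
    "PersistentVolumeClaims",
    "13m Normal ProvisioningSucceeded persistentvolumeclaim/postgres-data "
    "Successfully provisioned volume pvc-6c1bcaf2-7491-4a79-8edc-279dcd165cd7",
]

for event in warning_events(sample):
    print(event["object"], event["reason"])
```

The same parsed rows can be searched for an object name (for example, anything containing "keycloak") to confirm whether a component ever produced an event.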

@Roshani30
Author

Che-operator logs:
time="2020-04-27T12:53:54Z" level=info msg="Creating a new object: Ingress, name che"
time="2020-04-27T12:53:55Z" level=info msg="Waiting on ingress 'che-host' to be ready"
time="2020-04-27T12:53:55Z" level=error msg="admission webhook "validate.nginx.ingress.kubernetes.io" denied the request: \n-------------------------------------------------------------------------------\nError: exit status 1\n2020/04/27 12:53:55 [emerg] 533#533: duplicate location "/" in /tmp/nginx-cfg646708237:552\nnginx: [emerg] duplicate location "/" in /tmp/nginx-cfg646708237:552\nnginx: configuration file /tmp/nginx-cfg646708237 test failed\n\n-------------------------------------------------------------------------------\n"
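The error line above is hard to read because the webhook response is packed into one `msg` field with literal `\n` sequences and unescaped inner quotes. A small Python sketch, assuming the `time=... level=... msg="..."` layout shown in these lines (function names are illustrative, and the sample error line is abridged by dropping the dashed separator rules), extracts the error entries and the distinct nginx `[emerg]` reasons:

```python
import re

# One operator log line: time="..." level=... msg="...". The msg group is greedy
# on purpose, because the message itself contains unescaped double quotes.
LOG_RE = re.compile(r'^time="(?P<time>[^"]+)" level=(?P<level>\w+) msg="(?P<msg>.*)"$')


def parse_operator_line(line):
    match = LOG_RE.match(line.strip())
    return match.groupdict() if match else None


def error_entries(lines):
    parsed = (parse_operator_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] == "error"]


def nginx_emergencies(msg):
    """Return the distinct nginx [emerg] reasons embedded in an operator message."""
    reasons = []
    # The log stores newlines as the two characters backslash and n.
    for part in msg.split(r"\n"):
        if "[emerg]" not in part:
            continue
        reason = part.split("[emerg]", 1)[1].strip()
        # Drop the "pid#tid:" prefix that nginx adds on its own log line.
        reason = re.sub(r"^\d+#\d+:\s*", "", reason)
        if reason not in reasons:
            reasons.append(reason)
    return reasons


sample = [
    r'time="2020-04-27T12:53:54Z" level=info msg="Creating a new object: Ingress, name che"',
    # Abridged copy of the error line above (dashed separator rules omitted).
    r'time="2020-04-27T12:53:55Z" level=error msg="admission webhook '
    r'"validate.nginx.ingress.kubernetes.io" denied the request: \nError: exit status 1'
    r'\n2020/04/27 12:53:55 [emerg] 533#533: duplicate location "/" in /tmp/nginx-cfg646708237:552'
    r'\nnginx: [emerg] duplicate location "/" in /tmp/nginx-cfg646708237:552'
    r'\nnginx: configuration file /tmp/nginx-cfg646708237 test failed\n"',
]

for entry in error_entries(sample):
    print(entry["time"], nginx_emergencies(entry["msg"]))
```

Read this way, the operator's Ingress was rejected by the nginx ingress admission webhook because the generated nginx configuration failed its test with a duplicate `location "/"`.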

@tolusha tolusha added kind/bug Outline of a bug - must adhere to the bug report template. and removed kind/question Questions that haven't been identified as being feature requests or bugs. labels Apr 27, 2020
@tolusha
Contributor

tolusha commented Apr 27, 2020

@Roshani30
This seems to be a bug.
Please update the description.

@tolusha tolusha added the area/install Issues related to installation, including offline/air gap and initial setup label Apr 27, 2020
@ibuziuk ibuziuk added severity/P1 Has a major impact to usage or development of the system. and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. labels Apr 27, 2020
@Roshani30
Author

Can you explain exactly which description?

@tolusha
Contributor

tolusha commented Apr 28, 2020

@Roshani30
Please have a look at the very first comment. I've updated it. Please select the check boxes.

@Roshani30
Author

okay Done

@tolusha
Contributor

tolusha commented Apr 28, 2020

@Roshani30
Thank you. I would also like to know which manual you followed to prepare the Kubernetes infrastructure. Thank you in advance.

@tolusha
Contributor

tolusha commented Apr 28, 2020

From the error, it seems that nginx is not properly configured.
Could you check this file? /tmp/nginx-cfg646708237
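One thing that may be worth ruling out (an assumption, not something this thread confirms as the cause) is two Ingress rules claiming the same host and path, since the nginx error is a duplicate `location "/"`. The sketch below takes the JSON shape produced by `kubectl get ingress --all-namespaces -o json` and lists every (host, path) pair that appears more than once. The function name and the sample objects, including the host `che.example.test` and the Ingress `other`, are hypothetical.

```python
from collections import defaultdict


def duplicate_locations(ingress_list):
    """Map each (host, path) pair used more than once to the Ingresses that use it."""
    seen = defaultdict(list)
    for item in ingress_list.get("items", []):
        meta = item.get("metadata", {})
        name = f'{meta.get("namespace", "default")}/{meta.get("name", "?")}'
        for rule in item.get("spec", {}).get("rules", []):
            host = rule.get("host", "*")  # a rule without a host applies to all hosts
            for path in rule.get("http", {}).get("paths", []):
                # Treat a missing path as "/" for the purpose of this comparison.
                seen[(host, path.get("path", "/"))].append(name)
    return {key: names for key, names in seen.items() if len(names) > 1}


# Hypothetical data in the shape of `kubectl get ingress -A -o json`.
sample = {
    "items": [
        {
            "metadata": {"namespace": "che", "name": "che-host"},
            "spec": {"rules": [{"host": "che.example.test",
                                "http": {"paths": [{"path": "/"}]}}]},
        },
        {
            "metadata": {"namespace": "che", "name": "other"},
            "spec": {"rules": [{"host": "che.example.test",
                                "http": {"paths": [{"path": "/"}, {"path": "/api"}]}}]},
        },
    ]
}

duplicates = duplicate_locations(sample)
print(duplicates)
```

With real data, the input would come from `json.load(sys.stdin)` after piping the kubectl output in. An empty result would point away from overlapping rules and toward the controller's own configuration or annotations.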

@Roshani30
Author

@tolusha tolusha added this to the Backlog - Deploy milestone May 6, 2020
@tolusha tolusha mentioned this issue May 8, 2020
56 tasks
@tolusha tolusha removed this from the Backlog - Deploy milestone Jun 1, 2020
@che-bot
Contributor

che-bot commented Jan 4, 2021

Issues go stale after 180 days of inactivity. lifecycle/stale issues rot after an additional 7 days of inactivity and eventually close.

Mark the issue as fresh with /remove-lifecycle stale in a new comment.

If this issue is safe to close now please do so.

Moderators: Add lifecycle/frozen label to avoid stale mode.

@che-bot che-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 4, 2021
@che-bot che-bot closed this as completed Jan 20, 2021

4 participants