Deploying on GKE using the helm chart not working #5029

Closed
AhmedBytesBits opened this issue Oct 3, 2022 · 8 comments · May be fixed by #6476

@AhmedBytesBits

AhmedBytesBits commented Oct 3, 2022

My actions before raising this issue

  • Followed the official Helm chart documentation
  • Deployment of CVAT works for all deployments except the workers, both low and default
  • Checked the deployment logs in the GKE console, which show:
Multi-Attach error for volume "pvc-856156a9-8f5e-4ac1-9b5e-e39a490f26c1" Volume is already used by pod(s) mc-backend-worker-default-75bdfc7d9f-4l2w2, mcit-backend-utils-845b68fb99-8wls9, mc-backend-server-74fb85b6c7-pk6kq

Expected Behaviour

Deployment of the chart should work on GKE

Current Behaviour

The utils, default worker, and low worker deployments get stuck. Which one gets stuck depends on which pods win the race for the shared volume.

Possible Solution

Avoid using a shared folder,
or use affinity to ensure all pods are deployed on a single node

Steps to Reproduce (for bugs)

  1. Change the configuration following the instructions
  2. Run helm upgrade -n cvat mcit -i --create-namespace ./helm-chart -f ./helm-chart/values.yaml -f ./helm-chart/values.override.yaml

Context

GKE supports read/write shared volumes only for pods running on the same node. Attaching a volume read/write across nodes is not allowed.
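
For reference, the access mode is declared on the PersistentVolumeClaim. The fragment below is only a sketch of a claim with the single-node mode described above; the claim name and size are hypothetical and not taken from the chart.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-shared-data    # hypothetical name, not from the CVAT chart
spec:
  accessModes:
    - ReadWriteOnce            # read/write from one node at a time
  resources:
    requests:
      storage: 10Gi            # illustrative size
```

With a claim like this, pods that mount it and are scheduled onto different nodes would be expected to hit the Multi-Attach error shown in the logs.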

Your Environment

  • GKE cluster version: 1.22.12-gke.2300
  • cvat release: 2.2
@nmanovic
Contributor

nmanovic commented Jan 4, 2023

@ahmedrshdy, do you still have the issue in the latest release?

@grzleadams

grzleadams commented Mar 30, 2023

I'm still seeing this on chart v0.7.2 (on-prem RKE, not GKE).

@stykm

stykm commented Apr 17, 2023

I'm stuck with the same issue as described in the original post. I'm trying to deploy the Helm chart that is currently available (for CVAT v2.4.1).

@work-temp-dl-stuff

I still have this exact same issue when trying to deploy in the cloud. We don't have ReadWriteMany volumes, and I cannot set affinity to make the pods run on a single node (permissions).

@AhmedBytesBits
Author

Unfortunately, I gave up on the k8s deployment.

@Keramblock
Contributor

Sadly, CVAT is currently designed to use an RWX (ReadWriteMany) volume, so AFAIK you cannot proceed without one.

Also, I believe GKE supports them:
https://cloud.google.com/filestore/docs/optimize-multishares
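
A claim requesting an RWX volume might look like the sketch below. The claim name and size are hypothetical, and the storage class name is an assumption: use whichever RWX-capable class your cluster actually provides (on GKE this is typically one backed by the Filestore CSI driver).

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-shared-data        # hypothetical name, not from the CVAT chart
spec:
  accessModes:
    - ReadWriteMany                # read/write from pods on multiple nodes
  storageClassName: standard-rwx   # assumption: an RWX-capable class in your cluster
  resources:
    requests:
      storage: 1Ti                 # illustrative; RWX backends often have minimum sizes
```

How the chart is pointed at such a claim or storage class depends on the chart version, so check its values.yaml rather than relying on this fragment.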

@Zanz2

Zanz2 commented Jul 25, 2023

I have a deployment that uses ReadWriteOnce via Helm and it works fine. You do have to use podAffinity rules to schedule all the pods on the same node, though. Example for every backend pod:

affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: tier
              operator: In
              values:
                - backend
        topologyKey: "kubernetes.io/hostname"

That's because multiple pods can actually access the same ReadWriteOnce volume, as long as they are all on the same node. This is sort of against Kubernetes principles, though.
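
To show where such a rule sits, here is a sketch of a Deployment whose pods carry the tier: backend label that the rule selects on. All names, the image, the mount path, and the claim name are hypothetical placeholders, not values from the CVAT chart.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-backend-worker          # hypothetical
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-backend-worker
  template:
    metadata:
      labels:
        app: example-backend-worker
        tier: backend                   # label matched by the affinity rule
    spec:
      affinity:
        podAffinity:
          # Require a node that already runs a pod labelled tier=backend
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: tier
                    operator: In
                    values:
                      - backend
              topologyKey: "kubernetes.io/hostname"
      containers:
        - name: worker
          image: example/worker:latest  # placeholder image
          volumeMounts:
            - name: data
              mountPath: /data          # hypothetical mount path
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: example-shared-data   # hypothetical RWO claim shared by backend pods
```

The trade-off is that every backend pod is pinned to one node, so that node has to have enough capacity for all of them.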

@azhavoro
Contributor

Fixed in #6137.
