Helm Chart Volume Configuration "ReadWriteOnce" Doesn't Work with Multiple Pods #6096

Closed
2 tasks done
moghtader opened this issue May 2, 2023 · 3 comments · Fixed by #6137

Comments

@moghtader

moghtader commented May 2, 2023

My actions before raising this issue

When deploying with the Helm chart, the ReadWriteOnce access mode set in storage.yml under the backend folder doesn't work in my K8s environment. If the deployments are spread across multiple nodes, the pods race to mount the volume and only one backend pod starts successfully. By definition, ReadWriteOnce allows only one node to mount the volume, so how is this supposed to work?

Steps to Reproduce (for bugs)

  1. Attempt to deploy the helm chart to Kubernetes with the default storage

Expected Behaviour

  1. The defaults for the helm chart should be able to launch CVAT in K8s with the minimal amount of modification.

Current Behaviour

Possible Solution

  1. Change the access mode in storage.yml to ReadWriteMany
  2. Schedule all backend deployments onto the same node (maybe there's a reason it isn't designed this way, though)
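
As a rough illustration of option 1, a claim requesting shared access might look like the sketch below. The claim name, size, and storage class are hypothetical placeholders, not values taken from the CVAT chart. Note that a `ReadWriteMany` claim only binds if the cluster's storage backend supports that mode; file-share or NFS-style provisioners typically do, while many block-storage classes do not.

```yaml
# Hypothetical PVC sketch; names and sizes are placeholders, not the chart's actual values.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cvat-backend-data          # assumed name, for illustration only
spec:
  accessModes:
    - ReadWriteMany                # allows pods on different nodes to mount the volume
  storageClassName: my-rwx-class   # assumed; must be a class that supports RWX
  resources:
    requests:
      storage: 20Gi                # assumed size
```

With `ReadWriteOnce` in place of `ReadWriteMany`, the same claim could be mounted from a single node only, which matches the behaviour described above.
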
@westernspion

Do we actually need these pods to all share storage? Or can each instance of the server/utils/worker pods be granted its own PVC? Sharing a cache through volumes seems like a poor design choice, so I wonder whether this is just an assumption made in the docker-compose configuration that found its way into the Helm chart.

@Zanz2

Zanz2 commented May 24, 2023

Is there a reason that ReadWriteOnce is hardcoded? I'm not a cloud developer, so I'm just curious whether switching it to ReadWriteMany has negative side effects.

@moghtader
Author

I'm unsure why all the pods need to share the same claim. Maybe someone more familiar with the chart knows? Removing it might have some unintended consequences.

bsekachev pushed a commit that referenced this issue Jul 25, 2023

### Motivation and context
Right now the Helm chart is broken and unusable, at least in my
environment. This PR tries to fix it so that it works.
Content:

1. Moved test-related values to a separate values file to keep them
apart from the default config
2. Fixed an issue with multiple caches in the same RWX volume, which
prevented the DB migration from starting
3. Removed the hardcoded, mandatory Traefik ingress usage
4. Added a configurable default storage option to the chart

### How has this been tested?
We tested it on our AKS cluster with an RWX volume.

### Checklist
- [x] I submit my changes into the `develop` branch
- [x] I have added a description of my changes into the [CHANGELOG](https://github.com/opencv/cvat/blob/develop/CHANGELOG.md) file
- [x] I have updated the documentation accordingly
- [x] I have added tests to cover my changes
- [x] I have linked related issues (see [GitHub docs](https://help.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword))
- [x] I have increased versions of npm packages if it is necessary ([cvat-canvas](https://github.com/opencv/cvat/tree/develop/cvat-canvas#versioning), [cvat-core](https://github.com/opencv/cvat/tree/develop/cvat-core#versioning), [cvat-data](https://github.com/opencv/cvat/tree/develop/cvat-data#versioning) and [cvat-ui](https://github.com/opencv/cvat/tree/develop/cvat-ui#versioning))

### License

- [x] I submit _my code changes_ under the same [MIT License](https://github.com/opencv/cvat/blob/develop/LICENSE) that covers the project.
  Feel free to contact the maintainers if that's a concern.
  
closes #6043 
closes #6096 
closes #5733

---------

Co-authored-by: Michael Kirpichev <m.kirpichev@haut.ai>
Co-authored-by: Nikita Manovich <nikita@cvat.ai>
Co-authored-by: Andrey Zhavoronkov <andrey@cvat.ai>
mikhail-treskin pushed a commit to retailnext/cvat that referenced this issue Oct 25, 2023