Helm Chart Volume Configuration "ReadWriteOnce" Doesn't Work with Multiple Pods #6096
Comments
Do we actually need these pods to all share storage? Or can each instance of the server/utils/worker pods be granted its own PVC? Sharing a cache through volumes seems like a poor design choice, so I wonder whether this is just an assumption made in the docker-compose configuration that found its way into the Helm chart.
Is there a reason that ReadWriteOnce is hardcoded? I'm not a cloud developer, so I'm just curious whether switching it to ReadWriteMany has negative side effects.
I'm unsure why all of them need to share the same claim. Maybe someone more familiar with the chart knows? Removing it might have unintended consequences.
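To make the "individual PVCs" idea concrete: in plain Kubernetes, one common way to give every replica its own claim is a StatefulSet with `volumeClaimTemplates`. The following is only a rough sketch under that assumption, not how the chart is actually structured; the workload name, image, mount path, and size are all hypothetical.

```yaml
# Hypothetical sketch: each replica gets its own ReadWriteOnce PVC,
# created from the template below (data-worker-example-0, -1, -2).
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: worker-example            # hypothetical name
spec:
  serviceName: worker-example     # headless Service name (hypothetical)
  replicas: 3
  selector:
    matchLabels:
      app: worker-example
  template:
    metadata:
      labels:
        app: worker-example
    spec:
      containers:
        - name: worker
          image: example/worker:latest   # placeholder image
          volumeMounts:
            - name: data
              mountPath: /data           # placeholder mount path
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce         # fine here: one claim per pod
        resources:
          requests:
            storage: 10Gi         # placeholder size
```

This only works if the pods do not actually need to see each other's files; if they do, a shared RWX volume or external storage is still required.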
### Motivation and context
Right now the Helm chart is broken and unusable, at least in my environment. I am trying to fix it so that it works:
1. Moved test-related values to a separate values file to keep them out of the default config
2. Fixed an issue with multiple caches in the same RWX volume, which prevented the DB migration from starting
3. Removed the hardcoded, mandatory Traefik ingress usage
4. Added a configurable default storage option to the chart

### How has this been tested?
We tested it on our AKS cluster with an RWX volume.

### Checklist
- [x] I submit my changes into the `develop` branch
- [x] I have added a description of my changes into the [CHANGELOG](https://github.com/opencv/cvat/blob/develop/CHANGELOG.md) file
- [x] I have updated the documentation accordingly
- [x] I have added tests to cover my changes
- [x] I have linked related issues (see [GitHub docs](https://help.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword))
- [x] I have increased versions of npm packages if it is necessary ([cvat-canvas](https://github.com/opencv/cvat/tree/develop/cvat-canvas#versioning), [cvat-core](https://github.com/opencv/cvat/tree/develop/cvat-core#versioning), [cvat-data](https://github.com/opencv/cvat/tree/develop/cvat-data#versioning) and [cvat-ui](https://github.com/opencv/cvat/tree/develop/cvat-ui#versioning))

### License
- [x] I submit _my code changes_ under the same [MIT License](https://github.com/opencv/cvat/blob/develop/LICENSE) that covers the project. Feel free to contact the maintainers if that's a concern.

closes #6043
closes #6096
closes #5733

---------

Co-authored-by: Michael Kirpichev <m.kirpichev@haut.ai>
Co-authored-by: Nikita Manovich <nikita@cvat.ai>
Co-authored-by: Andrey Zhavoronkov <andrey@cvat.ai>
My actions before raising this issue
When using the Helm chart, the ReadWriteOnce option in storage.yml under the backend folder doesn't work in my K8s environment. If the deployments are spread across multiple nodes, ReadWriteOnce causes a race in which only one backend pod starts successfully. By definition, ReadWriteOnce allows the volume to be mounted by only one node, so how is this supposed to work?
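For illustration, a claim that pods on several nodes could mount at the same time would request the ReadWriteMany access mode instead. This is only a minimal sketch of a plain Kubernetes manifest, not the chart's actual template: the claim name, size, and storage class are hypothetical, and it assumes the cluster offers a storage class that supports RWX (for example one backed by NFS or a file share).

```yaml
# Hypothetical sketch: a PVC that can be mounted read-write from many nodes.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backend-data-example      # hypothetical name, not the chart's
spec:
  accessModes:
    - ReadWriteMany               # instead of ReadWriteOnce
  storageClassName: example-rwx   # assumption: a class that supports RWX
  resources:
    requests:
      storage: 20Gi               # placeholder size
```

Whether this is safe depends on the application: several pods writing to the same cache directory can still conflict even when the mount itself succeeds.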