
Respect volume name when reusing PVCs #1122

Merged: 18 commits into elastic:master on Jul 1, 2019

Conversation

@pebrc (Collaborator) commented Jun 20, 2019

Part of #877

  • puts the volume name into the PVC's labels so it is included in the comparison (a sketch of the idea follows below)
  • enables the previously commented-out e2e test that covers the default reuse case
  • adds another e2e test to make sure PVCs are reused while respecting their name
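A minimal Go sketch of the labeling idea referenced above (the label key and helper names are illustrative, not the operator's exact code):

```go
package pvc

import corev1 "k8s.io/api/core/v1"

// volumeNameLabel is a hypothetical label key recording which volume
// (e.g. "data") a PVC was originally created for.
const volumeNameLabel = "elasticsearch.k8s.elastic.co/volume-name"

// labelWithVolumeName records the volume name on the PVC so it can be taken
// into account when comparing existing PVCs against the desired ones.
func labelWithVolumeName(pvc *corev1.PersistentVolumeClaim, volumeName string) {
	if pvc.Labels == nil {
		pvc.Labels = map[string]string{}
	}
	pvc.Labels[volumeNameLabel] = volumeName
}

// matchesVolumeName reports whether the PVC was created for the given volume
// name and can therefore be reused under that name.
func matchesVolumeName(pvc corev1.PersistentVolumeClaim, volumeName string) bool {
	return pvc.Labels[volumeNameLabel] == volumeName
}
```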

I ran into an issue with scheduling pods that have more than one PVC: because we use volumeBindingMode: Immediate, more often than not the volumes would be allocated in two different zones, which made the pod unschedulable as it cannot be in two zones at once ...

The solution I found was to switch to volumeBindingMode: WaitForFirstConsumer, which creates the volumes only once the pod has been scheduled to a node. The only problem is that this does not work on k8s 1.11, where volumes need to be pre-provisioned. In 1.12 DynamicProvisioningScheduling is no longer feature gated and volume provisioning with late binding works as expected. So this PR also

  • adds a storage class template with WaitForFirstConsumer (sketched below)
  • bumps the default k8s version to 1.12
  • skips the new e2e test on 1.11
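For illustration, here is what a late-binding storage class amounts to, expressed with the Kubernetes Go API types; the class name and the GCE provisioner are assumptions for a GKE-like setup, not necessarily what this PR's template uses:

```go
package e2e

import (
	corev1 "k8s.io/api/core/v1"
	storagev1 "k8s.io/api/storage/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// lateBindingStorageClass builds a storage class that delays volume
// provisioning until a consuming pod has been scheduled, so all PVCs of a
// pod end up in the zone the pod was scheduled to.
func lateBindingStorageClass() *storagev1.StorageClass {
	bindingMode := storagev1.VolumeBindingWaitForFirstConsumer
	reclaimPolicy := corev1.PersistentVolumeReclaimDelete
	return &storagev1.StorageClass{
		ObjectMeta: metav1.ObjectMeta{
			Name: "e2e-default", // assumed name
		},
		Provisioner:       "kubernetes.io/gce-pd", // assumed GKE provisioner
		ReclaimPolicy:     &reclaimPolicy,
		VolumeBindingMode: &bindingMode,
	}
}
```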

pebrc requested review from sebgl and barkbay on June 20, 2019 at 17:10
@sebgl (Contributor) left a comment


I'm OK with the label fix, maybe not that much with the e2e test approach.

At first I thought the storageClass patch was not the right way to solve the problem: patching our e2e tests to change the default GKE storageClass does not help ECK users avoid this problem themselves.
But looking more at the docs around PVCs, it seems that using PVs without volumeBindingMode: waitForFirstConsumer is kind of broken by design; there is no other way to deal with zonal volumes and affinity rules.

I think maybe we should stick to using GKE 1.11 by default (it's still the GCP default?), skip the multi-PV test on k8s < 1.12, and create a new storageClass with a custom name and volumeBindingMode: waitForFirstConsumer only in the test that requires it, and use it only there.
This way the other tests still rely on GKE defaults and help us notice anything wrong with a "default" usage?

Review threads: operators/config/dev/default-storage.yaml, operators/test/e2e/failure_test.go, operators/Makefile
Inline comment on the following hunk:

}

func TestKillCorrectPVReuse(t *testing.T) {
	s := stack.NewStackBuilder("test-failure-pvc").

A reviewer (Contributor) commented:

I'm wondering if we should maybe patch the storageClass for this test only (like: in the test).
Also skip this test if the k8s version we are testing against is <1.12.
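A sketch of the version-skip idea, assuming the test can obtain a rest.Config for the target cluster (the helper name and wiring are illustrative, not the actual e2e framework code):

```go
package e2e

import (
	"testing"

	"k8s.io/apimachinery/pkg/util/version"
	"k8s.io/client-go/discovery"
	"k8s.io/client-go/rest"
)

// skipIfPre112 skips tests relying on WaitForFirstConsumer volume binding,
// which only works with dynamic provisioning from Kubernetes 1.12 onwards.
func skipIfPre112(t *testing.T, cfg *rest.Config) {
	dc, err := discovery.NewDiscoveryClientForConfig(cfg)
	if err != nil {
		t.Fatal(err)
	}
	serverVersion, err := dc.ServerVersion()
	if err != nil {
		t.Fatal(err)
	}
	v, err := version.ParseGeneric(serverVersion.GitVersion)
	if err != nil {
		t.Fatal(err)
	}
	if v.LessThan(version.MustParseGeneric("1.12.0")) {
		t.Skipf("requires Kubernetes >= 1.12, got %s", serverVersion.GitVersion)
	}
}
```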

@sebgl (Contributor) commented Jun 21, 2019

I think we should take into consideration the pod name label when retrieving several PVCs for a single pod.
Use case: I have 2 PVCs for my pod, one for data, another one for logs. When these PVCs get reused for another pod, I want to keep logs and data tied together. It would be confusing to have the logs volume of one pod mounted along with the data volume of another pod.
We don't care about the actual pod name, but we do care that all volumes we mount for a pod carry the same pod name?
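A sketch of that idea (hypothetical label key, not the operator's exact code): group reusable PVCs by the pod they originally belonged to, so a replacement pod picks up a consistent set.

```go
package pvc

import corev1 "k8s.io/api/core/v1"

// podNameLabel is a hypothetical label key recording which pod a PVC was
// originally created for.
const podNameLabel = "elasticsearch.k8s.elastic.co/pod-name"

// groupByOriginalPod groups candidate PVCs by their original pod, so that
// e.g. the data and logs volumes of one former pod are reused together
// instead of being mixed with volumes from another pod.
func groupByOriginalPod(pvcs []corev1.PersistentVolumeClaim) map[string][]corev1.PersistentVolumeClaim {
	byPod := map[string][]corev1.PersistentVolumeClaim{}
	for _, pvc := range pvcs {
		podName := pvc.Labels[podNameLabel]
		byPod[podName] = append(byPod[podName], pvc)
	}
	return byPod
}
```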

@pebrc (Collaborator, Author) commented Jun 25, 2019

> I think we should take into consideration the pod name label when retrieving several PVCs for a single pod.

Do you think we should try to address this in this PR?

@sebgl (Contributor) commented Jun 25, 2019

> I think we should take into consideration the pod name label when retrieving several PVCs for a single pod.
>
> Do you think we should try to address this in this PR?

I'm fine with doing it in a follow-up PR, but I think #877 cannot be considered fixed until we do it (or another issue is needed).

@pebrc (Collaborator, Author) commented Jun 25, 2019

@sebgl I think I have a slight preference for doing another PR. I changed this PR's description so that it does not auto-close the issue.

@pebrc (Collaborator, Author) commented Jun 25, 2019

@sebgl can you take another look at this PR? I tried to address your feedback.

@pebrc (Collaborator, Author) commented Jun 25, 2019

@sebgl I removed the external definition of a provider-specific storage class and the corresponding flag. Instead I am using the existing default storage class as a template to create a derivative with late volume binding, as discussed. 👍 for the idea.
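Roughly, the approach looks like this; a sketch using a current client-go signature, where the default-class annotation lookup and the function name are assumptions rather than the PR's exact code:

```go
package e2e

import (
	"context"

	storagev1 "k8s.io/api/storage/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// createLateBindingCopyOfDefault copies the cluster's default storage class
// into a new class that binds volumes only once the consuming pod has been
// scheduled, keeping the e2e tests independent of any cloud provider.
func createLateBindingCopyOfDefault(ctx context.Context, c kubernetes.Interface, name string) error {
	classes, err := c.StorageV1().StorageClasses().List(ctx, metav1.ListOptions{})
	if err != nil {
		return err
	}
	for _, sc := range classes.Items {
		if sc.Annotations["storageclass.kubernetes.io/is-default-class"] != "true" {
			continue
		}
		derived := sc.DeepCopy()
		derived.ObjectMeta = metav1.ObjectMeta{Name: name} // drop UID, resourceVersion, etc.
		mode := storagev1.VolumeBindingWaitForFirstConsumer
		derived.VolumeBindingMode = &mode
		_, err = c.StorageV1().StorageClasses().Create(ctx, derived, metav1.CreateOptions{})
		if apierrors.IsAlreadyExists(err) {
			return nil
		}
		return err
	}
	return nil // no default storage class found; nothing to derive
}
```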

@sebgl (Contributor) left a comment

Thanks for making the changes 👍 it's nice to keep these tests independent of any cloud provider.
I left 2 minor comments, and one I think is very important (hence "changes requested"). Otherwise LGTM.

Review threads: operators/test/e2e/failure_test.go, operators/test/e2e/params/params.go
Inline comment on the following hunk:

}
for _, pod := range pods {
	if stringsutil.StringInSlice(pod.Name, survivingPodNames) {
		continue

A reviewer (Contributor) commented:

At this point, chances are the deleted pod is not back in the cluster yet, so we only iterate over pods we don't care about, continue on each one, and then return nil? Which skips the test entirely?
Should we run WithSteps(stack.CheckStackSteps(s, k)...) first, and then this test? It is also probably simpler to get the expected pod through its name directly instead of filtering out the pods we don't care about.
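A sketch of the second suggestion (illustrative helper, not the actual e2e code): fetch the recreated pod directly by name once the stack is healthy again, and return the PVC names it mounts so the test can assert they match the claims left behind by the deleted pod.

```go
package e2e

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// getRecreatedPodPVCs fetches the expected pod by name, failing (instead of
// silently succeeding) if it is not back in the cluster yet, and returns the
// names of the PVCs it mounts.
func getRecreatedPodPVCs(ctx context.Context, c kubernetes.Interface, namespace, podName string) ([]string, error) {
	pod, err := c.CoreV1().Pods(namespace).Get(ctx, podName, metav1.GetOptions{})
	if err != nil {
		return nil, err
	}
	var claims []string
	for _, vol := range pod.Spec.Volumes {
		if vol.PersistentVolumeClaim != nil {
			claims = append(claims, vol.PersistentVolumeClaim.ClaimName)
		}
	}
	return claims, nil
}
```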

pebrc (Collaborator, Author) replied:

This comment was pure gold. You should get a 🥇 for that. It surfaced that the actual fix was no longer working since I merged master into this branch 😞, which was hidden by this flaw in the test ...

@sebgl (Contributor) left a comment

LGTM

pebrc merged commit 83a0c81 into elastic:master on Jul 1, 2019
Labels: >bug (Something isn't working), v0.9.0
2 participants