[BUG]: volume-group-snapshot test observes a panic when using "--namespace" parameter #931

Closed
vincent1chen opened this issue Aug 4, 2023 · 8 comments
Assignees
Labels
area/cert-csi Issues pertaining to Cert CSI type/bug Something isn't working. This is the default label associated with a bug issue.
Milestone

Comments

@vincent1chen

Bug Description

When running the volume-group-snapshot test with the "--namespace" parameter, a panic occurs. This parameter is listed as optional, but it is necessary for metrics service usage. I don't see any other test case panic with "--ns".

./cert-csi test volume-group-snapshot -h

[2023-08-04 04:38:12] INFO Starting cert-csi; ver. 0.8.1
NAME:
csi-cert test volume-group-snapshot - test volume group snapshot

USAGE:
csi-cert test volume-group-snapshot [command options] [arguments...]

... ...
--driver value driver name to be used
--reclaimPolicy value set the member reclaim policy, Delete, Retain
--accessMode value set the volume access mode
--volumeNumber value number of volume to create for the group (default: 0)
--config value, --conf value, -c value config for connecting to kubernetes [$KUBECONFIG]
--sc value, --storage value, --storageclass value storage csi [$STORAGE_CLASS]
--namespace value, --ns value specify the driver namespace (used in driver resource usage)

Logs

[root@registry ~]# ./cert-csi test volume-group-snapshot --volumeSnapshotClass powerstore-snapshot --volumeGroupName vgsperf --volumeSize 10Gi --volumeLabel vgstest --volumeNumber 2 --sc powerstore-ext4 --ns csi-powerstore --driver-namespace csi-powerstore
[2023-08-04 05:09:44] INFO Starting cert-csi; ver. 0.8.1
[2023-08-04 05:09:44] INFO Using EVENT observer type
[2023-08-04 05:09:44] INFO Using default config
[2023-08-04 05:09:44] INFO Successfully loaded config. Host: https://192.168.30.2:6443
[2023-08-04 05:09:45] INFO Created new KubeClient
[2023-08-04 05:09:45] INFO Running 1 iteration(s)
[2023-08-04 05:09:45] INFO *** ITERATION NUMBER 1 ***
[2023-08-04 05:09:45] INFO Starting VolumeGroupSnapSuite with powerstore-ext4 storage class
[2023-08-04 05:09:45] INFO Successfully created namespace vgs-snap-test-e21c3fbd
E0804 05:09:45.155389 1611497 runtime.go:79] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 144 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x224eee0?, 0x3bd2b30})
/root/go/pkg/mod/k8s.io/apimachinery@v0.26.0/pkg/util/runtime/runtime.go:75 +0x99
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x6b2589?})
/root/go/pkg/mod/k8s.io/apimachinery@v0.26.0/pkg/util/runtime/runtime.go:49 +0x75
panic({0x224eee0, 0x3bd2b30})
/usr/local/go/src/runtime/panic.go:884 +0x213
cert-csi/pkg/observer.(*ContainerMetricsObserver).StartWatching.func1()
/root/cert-csi/pkg/observer/metrics.go:88 +0x70
k8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1({0x7f596dcb05b8, 0x18})
/root/go/pkg/mod/k8s.io/apimachinery@v0.26.0/pkg/util/wait/wait.go:222 +0x1b
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext({0x28b3e50?, 0xc0000560b0?}, 0xc000674650?)
/root/go/pkg/mod/k8s.io/apimachinery@v0.26.0/pkg/util/wait/wait.go:235 +0x57
k8s.io/apimachinery/pkg/util/wait.poll({0x28b3e50, 0xc0000560b0}, 0x90?, 0xc02e45?, 0x10?)
/root/go/pkg/mod/k8s.io/apimachinery@v0.26.0/pkg/util/wait/wait.go:582 +0x38
k8s.io/apimachinery/pkg/util/wait.PollImmediateWithContext({0x28b3e50, 0xc0000560b0}, 0x0?, 0xc0006746e0?, 0x40fa47?)
/root/go/pkg/mod/k8s.io/apimachinery@v0.26.0/pkg/util/wait/wait.go:528 +0x4a
k8s.io/apimachinery/pkg/util/wait.PollImmediate(0x2565a59?, 0x18?, 0x255a004?)
/root/go/pkg/mod/k8s.io/apimachinery@v0.26.0/pkg/util/wait/wait.go:514 +0x50
cert-csi/pkg/observer.(*ContainerMetricsObserver).StartWatching(0xc00034fe60, {0x0?, 0x0?}, 0xc00032b380)
/root/cert-csi/pkg/observer/metrics.go:79 +0x1ec
created by cert-csi/pkg/observer.(*Runner).Start
/root/cert-csi/pkg/observer/runner.go:76 +0x45
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x19051b0]

goroutine 144 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x6b2589?})
/root/go/pkg/mod/k8s.io/apimachinery@v0.26.0/pkg/util/runtime/runtime.go:56 +0xd7
panic({0x224eee0, 0x3bd2b30})
/usr/local/go/src/runtime/panic.go:884 +0x213
cert-csi/pkg/observer.(*ContainerMetricsObserver).StartWatching.func1()
/root/cert-csi/pkg/observer/metrics.go:88 +0x70
k8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1({0x7f596dcb05b8, 0x18})
/root/go/pkg/mod/k8s.io/apimachinery@v0.26.0/pkg/util/wait/wait.go:222 +0x1b
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext({0x28b3e50?, 0xc0000560b0?}, 0xc000674650?)
/root/go/pkg/mod/k8s.io/apimachinery@v0.26.0/pkg/util/wait/wait.go:235 +0x57
k8s.io/apimachinery/pkg/util/wait.poll({0x28b3e50, 0xc0000560b0}, 0x90?, 0xc02e45?, 0x10?)
/root/go/pkg/mod/k8s.io/apimachinery@v0.26.0/pkg/util/wait/wait.go:582 +0x38
k8s.io/apimachinery/pkg/util/wait.PollImmediateWithContext({0x28b3e50, 0xc0000560b0}, 0x0?, 0xc0006746e0?, 0x40fa47?)
/root/go/pkg/mod/k8s.io/apimachinery@v0.26.0/pkg/util/wait/wait.go:528 +0x4a
k8s.io/apimachinery/pkg/util/wait.PollImmediate(0x2565a59?, 0x18?, 0x255a004?)
/root/go/pkg/mod/k8s.io/apimachinery@v0.26.0/pkg/util/wait/wait.go:514 +0x50
cert-csi/pkg/observer.(*ContainerMetricsObserver).StartWatching(0xc00034fe60, {0x0?, 0x0?}, 0xc00032b380)
/root/cert-csi/pkg/observer/metrics.go:79 +0x1ec
created by cert-csi/pkg/observer.(*Runner).Start
/root/cert-csi/pkg/observer/runner.go:76 +0x45
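
A note on reading the trace above: the panic is printed twice because apimachinery's `HandleCrash` first logs the recovered panic ("Observed a panic") and then raises it again, which is what finally terminates the process. The sketch below reproduces that log-then-re-panic pattern using only the standard library. The `metricsClient` type and the `handleCrash`, `runCondition`, and `pollOnce` helpers are hypothetical stand-ins; the real code at `metrics.go:88` is not shown in this issue, so which pointer is actually nil is an assumption.

```go
package main

import "fmt"

// metricsClient is a hypothetical stand-in for whatever pointer is nil
// at metrics.go:88; the real type is not shown in the issue.
type metricsClient struct{ namespace string }

// Namespace reads a field through the receiver, so calling it on a nil
// pointer triggers a nil pointer dereference panic.
func (c *metricsClient) Namespace() string { return c.namespace }

// handleCrash mimics the log-then-re-panic behaviour visible in the trace:
// it reports the panic and then raises it again.
func handleCrash() {
	if r := recover(); r != nil {
		fmt.Printf("Observed a panic: %v\n", r)
		panic(r)
	}
}

// runCondition runs one polled condition under the crash handler.
func runCondition(condition func() (bool, error)) (bool, error) {
	defer handleCrash()
	return condition()
}

// pollOnce evaluates the condition once and reports whether it panicked.
// The outer recover exists only so that this demo can exit cleanly.
func pollOnce(c *metricsClient) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	_, _ = runCondition(func() (bool, error) {
		return c.Namespace() != "", nil
	})
	return false
}

func main() {
	fmt.Println("nil client panicked:", pollOnce(nil))
	fmt.Println("valid client panicked:", pollOnce(&metricsClient{namespace: "csi-powerstore"}))
}
```

In the real run there is no outer recover, so the re-raised panic propagates out of goroutine 144 and crashes cert-csi.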

Screenshots

No response

Additional Environment Information

No response

Steps to Reproduce

Just execute the test with the command shown in the logs above.

Expected Behavior

The test case finishes successfully.

CSM Driver(s)

CSI powerstore

Installation Type

helm 3.0

Container Storage Modules Enabled

dellemc/csi-volumegroup-snapshotter:v1.2.0
registry.k8s.io/sig-storage/csi-snapshotter:v6.2.2
registry.k8s.io/sig-storage/csi-external-health-monitor-controller:v0.9.0

Container Orchestrator

K8s 1.24.4

Operating System

debian 11

@vincent1chen vincent1chen added needs-triage Issue requires triage. type/bug Something isn't working. This is the default label associated with a bug issue. labels Aug 4, 2023
@csmbot
Collaborator

csmbot commented Aug 4, 2023

@vincent1chen: Thank you for submitting this issue!

The issue is currently awaiting triage. Please make sure you have given us as much context as possible.

If the maintainers determine this is a relevant issue, they will remove the needs-triage label and assign an appropriate priority label.


We want your feedback! If you have any questions or suggestions regarding our contributing process/workflow, please reach out to us at container.storage.modules@dell.com.

@suryagupta4 suryagupta4 self-assigned this Aug 7, 2023
@suryagupta4

@vincent1chen
Please try running this test without the --ns / --namespace parameter, as it is not required.

We will update the documentation on how to run this test with all the relevant details.

Thanks.

@vincent1chen
Author

vincent1chen commented Aug 8, 2023

Correct, without --ns the test passes.

@adarsh-dell
Contributor

Hi @suryagupta4, I share your perspective; however, I believe that even when given invalid arguments, an application should avoid crashing or panicking.

Instead, it could return an informative error message, such as indicating that the arguments are unsupported or providing an appropriate response. Please feel free to correct me if I've misunderstood anything.
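
As a rough illustration of that suggestion, the sketch below validates its inputs before any polling starts and returns a descriptive error instead of dereferencing a nil pointer. Everything here is hypothetical: `observer`, `metricsClient`, `startWatching`, and `errNoMetricsClient` are made-up names, and the assumption that a nil metrics client is the root cause is not confirmed anywhere in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoMetricsClient is a hypothetical sentinel error for this sketch.
var errNoMetricsClient = errors.New("metrics client is not initialised")

// metricsClient and observer are illustrative stand-ins, not cert-csi types.
type metricsClient struct{}

type observer struct {
	client          *metricsClient
	driverNamespace string
}

// startWatching checks its preconditions up front and reports a problem
// as an error rather than panicking later inside the polling loop.
func (o *observer) startWatching() error {
	if o.driverNamespace == "" {
		// No driver namespace requested, so there is nothing to watch.
		return nil
	}
	if o.client == nil {
		return fmt.Errorf("cannot collect driver resource usage for namespace %q: %w",
			o.driverNamespace, errNoMetricsClient)
	}
	// The real polling of container metrics would start here.
	return nil
}

func main() {
	o := &observer{driverNamespace: "csi-powerstore"}
	if err := o.startWatching(); err != nil {
		fmt.Println("ERROR", err)
	}
}
```

With a guard like this, the suite could log the error and either skip metrics collection or exit with a non-zero status, rather than terminating with a stack trace.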

@adarsh-dell adarsh-dell added the area/cert-csi Issues pertaining to Cert CSI label Aug 8, 2023
@vincent1chen vincent1chen changed the title [BUG]: volume-group-snapshot test is observed panic when using "--namespace" parameter [BUG][cert-csi] : volume-group-snapshot test is observed panic when using "--namespace" parameter Aug 9, 2023
@adarsh-dell
Contributor

Hi @vincent1chen, I believe you are passing the --namespace flag for metrics. Is the metrics client working fine?
Have you tried to validate whether metrics are generated for other suites when the --ns flag is passed?

@vincent1chen
Author

Hi @adarsh-dell, I passed several test cases with the "--ns" flag, such as volumehealthmetrics, volumeIO, and snapshot, so I think the metrics client works well. I didn't check any further. How can I validate whether the metrics are being generated?

@adarsh-dell
Contributor

Thanks @vincent1chen, this information suffices for now.
It appears that the metrics client isn't the source of the problem, and I've managed to identify the issue. I'll provide you with an update shortly once it's resolved.

Thanks

@adarsh-dell
Contributor

Hi @vincent1chen, we have fixed this issue; the fix will be available from the CSM v1.8.0 release onwards.
Thanks for letting us know about the issue.

  • Adarsh

@shaynafinocchiaro shaynafinocchiaro added this to the v1.8.0 milestone Aug 11, 2023
@adarsh-dell adarsh-dell removed the needs-triage Issue requires triage. label Aug 16, 2023
@shaynafinocchiaro shaynafinocchiaro changed the title [BUG][cert-csi] : volume-group-snapshot test is observed panic when using "--namespace" parameter [BUG]: volume-group-snapshot test observes a panic when using "--namespace" parameter Aug 30, 2023
5 participants