fix(tests/e2e): ensure spawned cmds are closed #18664

Merged
ahrtr merged 1 commit into etcd-io:main from the issue-18412 branch on Oct 2, 2024

Conversation

ghouscht
Contributor

@ghouscht ghouscht commented Oct 1, 2024

This should fix #18412.

After the test case is done, Go's testing package tries to clean up the temporary directories that were created by calls to TempDir(). This cleanup sometimes failed with errors like:

testing.go:1231: TempDir RemoveAll cleanup: unlinkat /tmp/TestV2DeprecationWriteOnlySnapshot1900751190/002/member-0/member/snap: directory not empty

I believe the cleanup sometimes happened before the etcd process was stopped, which led to the above error. Adding the defer statement to make sure the process is done before the test case returns seems to fix the error for me.

Before my changes I was able to reliably reproduce the issue with

go test -v -count 1000 -run "^TestV2DeprecationWriteOnlySnapshot$"

and with the changes from this PR I'm no longer able to reproduce this.

Signed-off-by: Thomas Gosteli <thomas.gosteli@protonmail.ch>
@k8s-ci-robot

Hi @ghouscht. Thanks for your PR.

I'm waiting for an etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.


@codecov-commenter


Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.80%. Comparing base (68e7122) to head (7bedbe6).
Report is 2 commits behind head on main.

Current head 7bedbe6 differs from pull request most recent head ddf0ac2

Please upload reports for the commit ddf0ac2 to get more accurate results.


Additional details and impacted files

see 25 files with indirect coverage changes

@@            Coverage Diff             @@
##             main   #18664      +/-   ##
==========================================
+ Coverage   68.77%   68.80%   +0.02%     
==========================================
  Files         420      420              
  Lines       35535    35535              
==========================================
+ Hits        24440    24449       +9     
- Misses       9660     9662       +2     
+ Partials     1435     1424      -11     


@ahrtr
Member

ahrtr commented Oct 1, 2024

/ok-to-test

Member

@ahrtr ahrtr left a comment

LGTM

Thank you @ghouscht

@ahrtr
Member

ahrtr commented Oct 1, 2024

@ghouscht could you please also backport the fix to 3.5 and 3.4 if needed?

@ivanvc
Member

ivanvc commented Oct 1, 2024

Quick (and maybe naive) question. As I was checking if we needed to backport, I realized that the test doesn't exist for the release-3.5 branch. However, I also noticed that we're spawning etcd in other places in the tests, e.g.:

proc, err := e2e.SpawnCmd([]string{e2e.BinPath.Etcd, "--v2-deprecation=write-only", "--data-dir=" + memberDataDir}, nil)

Do we need to close the command in those instances, too? Or is it only relevant to this test because we're creating a snapshot?

@ghouscht
Contributor Author

ghouscht commented Oct 2, 2024

> Quick (and maybe naive) question. As I was checking if we needed to backport, I realized that the test doesn't exist for the release-3.5 branch. However, I also noticed that we're spawning etcd in other places in the tests, e.g.:
>
> proc, err := e2e.SpawnCmd([]string{e2e.BinPath.Etcd, "--v2-deprecation=write-only", "--data-dir=" + memberDataDir}, nil)
>
> Do we need to close the command in those instances, too? Or is it only relevant to this test because we're creating a snapshot?

It doesn't seem to matter for the other tests in this file, most likely because they test flag values that aren't supported, so nothing is written to the filesystem. But as a rule of thumb I'd say that whenever SpawnCmd is called, Close should be called too. If it doesn't help it also doesn't hurt, no? I can open another PR for that if you want me to.
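
One way to make the "every spawn gets a close" rule hard to forget is to register the close at spawn time. The sketch below is an assumption-laden illustration built on the standard library only: `spawnWithCleanup`, `cleanupRegistrar`, and `fakeT` are hypothetical names and are not part of etcd's e2e package. It relies on the fact that `testing.T` runs `Cleanup` functions in last-added, first-called order, so a cleanup registered after the temporary directory was created runs before that directory is removed. The `sleep` command (assumed to exist on the system) stands in for a long-running server.

```go
package main

import (
	"fmt"
	"os/exec"
)

// cleanupRegistrar is the small part of *testing.T that the helper needs.
// *testing.T and *testing.B both satisfy it.
type cleanupRegistrar interface {
	Cleanup(func())
}

// spawnWithCleanup (hypothetical helper) starts the command and registers a
// cleanup that kills it and waits for it to exit.
func spawnWithCleanup(t cleanupRegistrar, name string, args ...string) (*exec.Cmd, error) {
	cmd := exec.Command(name, args...)
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	t.Cleanup(func() {
		_ = cmd.Process.Kill() // ignored: the process may have exited already
		_ = cmd.Wait()         // blocks until the child is really gone
	})
	return cmd, nil
}

// fakeT mimics testing.T's cleanup ordering (last registered runs first)
// so the sketch can run outside of "go test".
type fakeT struct{ cleanups []func() }

func (f *fakeT) Cleanup(fn func()) { f.cleanups = append(f.cleanups, fn) }

func (f *fakeT) runCleanups() {
	for i := len(f.cleanups) - 1; i >= 0; i-- {
		f.cleanups[i]()
	}
}

func main() {
	t := &fakeT{}
	// Registered first, so it runs last, like the removal TempDir schedules.
	t.Cleanup(func() { fmt.Println("remove temp dir") })

	cmd, err := spawnWithCleanup(t, "sleep", "30")
	if err != nil {
		fmt.Println("spawn failed:", err)
		return
	}
	t.runCleanups()
	fmt.Println("child exited:", cmd.ProcessState != nil)
}
```

In a real test the helper would receive `t` directly, and the data directory would need to be created with `t.TempDir()` before the spawn for the ordering to work out.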

@ahrtr
Member

ahrtr commented Oct 2, 2024

> refer to:
>
> proc, err := e2e.SpawnCmd([]string{e2e.BinPath.Etcd, "--v2-deprecation=write-only", "--data-dir=" + memberDataDir}, nil)
>
> Do we need to close the command in those instances, too?

In that case, etcdserver will panic and exit automatically. But it will not do any harm to call proc.Close(). Please feel free to raise a separate PR to fix it if anyone wants, thx
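
As a rough illustration of why closing an already exited process is harmless, the following sketch wraps `os/exec` in a hypothetical `proc` type (this is not the type returned by `e2e.SpawnCmd`, and its `Close` is only an assumption about what a forgiving close could look like). The command `sh -c "exit 2"` stands in for a server that exits on its own right after startup.

```go
package main

import (
	"fmt"
	"os/exec"
	"sync"
)

// proc is a hypothetical wrapper around a spawned command.
type proc struct {
	cmd  *exec.Cmd
	once sync.Once
}

// start launches the command without waiting for it.
func start(name string, args ...string) (*proc, error) {
	cmd := exec.Command(name, args...)
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return &proc{cmd: cmd}, nil
}

// Close kills the child if it is still running and reaps it. It is safe to
// call when the child has already exited, and safe to call more than once.
func (p *proc) Close() {
	p.once.Do(func() {
		_ = p.cmd.Process.Kill() // ignored: fails harmlessly if already gone
		_ = p.cmd.Wait()         // reaps the child; a non-zero exit is expected
	})
}

func main() {
	// Stand-in for a server that exits by itself right after startup.
	p, err := start("sh", "-c", "exit 2")
	if err != nil {
		fmt.Println("start failed:", err)
		return
	}
	p.Close()
	p.Close() // second call is a no-op
	fmt.Println("reaped:", p.cmd.ProcessState != nil)
}
```

The `sync.Once` guard matters because calling `Wait` twice on the same `exec.Cmd` returns an error; with the guard, callers can close unconditionally.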

@ivanvc
Member

ivanvc commented Oct 2, 2024

> In that case, etcdserver will panic and exit automatically. But it will not do any harm to call proc.Close(). Please feel free to raise a separate PR to fix it if anyone wants, thx

Thanks for the explanation. I wasn't sure if that was required. I'm fine whether we address it in a follow-up or not (as it shouldn't cause an issue).

As for release-3.4 and release-3.5, I think we don't have that test case. But feel free to double-check, @ghouscht.

Member

@ivanvc ivanvc left a comment

LGTM. Thanks so much for your work toward addressing flaky tests, Thomas.

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahrtr, ghouscht, ivanvc


@ahrtr ahrtr merged commit 5a01649 into etcd-io:main Oct 2, 2024
41 checks passed
@ghouscht ghouscht deleted the issue-18412 branch October 3, 2024 06:05
Development

Successfully merging this pull request may close these issues.

Detected disallowed custom content in v2store
5 participants