Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stablize cargo test #277

Merged
merged 10 commits into from
Sep 7, 2021
Merged

Stablize cargo test #277

merged 10 commits into from
Sep 7, 2021

Conversation

yihuaf
Copy link
Collaborator

@yihuaf yihuaf commented Sep 6, 2021

The root cause of the flakiness observed in the hook, clean up file descriptors, and channel tests are the result of running the tests in parallel in the same process by cargo test. Many of the operations have side effects on the underlying OS and the process, such as closing down fds. These tests also depends on the underlying OS state such as file descriptor states. The best we can do is to mark these tests as serial and properly clean up any left over resources such as opened fds before the test ends. If the underlying OS decided to generate errors for some of the IO calls we are doing, then there is not much we can do. Not sure if retry is a valid solution here and retry generally will make the code much more complex.

Note: runc actually doesn't have unit tests on most of these operations. The other solution is to fully depend on the integration tests to test out these code path. It may be the reasonable thing to do if these unit tests prove to be hard to get right.

Note2: Currently, I ran cargo test for 100 iterations and no failed tests. There may still be flakiness, but it should be extremely rare.

@codecov-commenter
Copy link

codecov-commenter commented Sep 6, 2021

Codecov Report

Merging #277 (4c5d2d6) into main (ef9a92a) will increase coverage by 0.01%.
The diff coverage is 53.84%.

@@            Coverage Diff             @@
##             main     #277      +/-   ##
==========================================
+ Coverage   69.57%   69.59%   +0.01%     
==========================================
  Files          46       46              
  Lines        5710     5651      -59     
==========================================
- Hits         3973     3933      -40     
+ Misses       1737     1718      -19     

channel test parent process should call waitpid in the end. Parent should be
able to wait before child process completly exits.

For our purpose, rust test runner don't like multi processes. Serial is
enough to work around this issue.
@utam0k
Copy link
Member

utam0k commented Sep 7, 2021

Thank you for your investigation. Can I ask you to write a clear description of the results of this investigation in the section of the test that we are using serial other tests?

The root cause of the flakiness observed in the hook, clean up file descriptors, and channel tests are the result of running the tests in parallel in the same process by cargo test. Many of the operations have side effects on the underlying OS and the process, such as closing down fds. These tests also depends on the underlying OS state such as file descriptor states. The best we can do is to mark these tests as serial and properly clean up any left over resources such as opened fds before the test ends. If the underlying OS decided to generate errors for some of the IO calls we are doing, then there is not much we can do. Not sure if retry is a valid solution here and retry generally will make the code much more complex.

@yihuaf
Copy link
Collaborator Author

yihuaf commented Sep 7, 2021

Thank you for your investigation. Can I ask you to write a clear description of the results of this investigation in the section of the test that we are using serial other tests?

The root cause of the flakiness observed in the hook, clean up file descriptors, and channel tests are the result of running the tests in parallel in the same process by cargo test. Many of the operations have side effects on the underlying OS and the process, such as closing down fds. These tests also depends on the underlying OS state such as file descriptor states. The best we can do is to mark these tests as serial and properly clean up any left over resources such as opened fds before the test ends. If the underlying OS decided to generate errors for some of the IO calls we are doing, then there is not much we can do. Not sure if retry is a valid solution here and retry generally will make the code much more complex.

Done. I left the tty test alone because it was the original user of serial tests I think and is not part of this change.

Copy link
Member

@utam0k utam0k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

@utam0k utam0k merged commit f026d7e into youki-dev:main Sep 7, 2021
@yihuaf yihuaf deleted the yihuaf/test branch September 7, 2021 05:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants