Incompatible with pytest-xdist / thread safety #535
Comments
I'll need to dig into how pytest-xdist operates. Currently when in "write mode" (snapshot update flag passed) we write immediately when making the assertion. Buffering and deferring this to the session end hook where we already delete unused snapshots shouldn't be too difficult. |
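The buffering idea above could be sketched roughly as follows. This is a minimal illustration and not syrupy's actual implementation: `DeferredSnapshotWriter`, `queue` and `flush` are hypothetical names, and the on-disk format is a placeholder. In a real plugin, `flush()` would presumably be called from a session-end hook such as pytest's `pytest_sessionfinish`.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict


class DeferredSnapshotWriter:
    """Hypothetical buffer: collects snapshot writes and flushes them once."""

    def __init__(self) -> None:
        # Maps snapshot file path -> {snapshot name: serialized data}.
        self._pending: Dict[Path, Dict[str, str]] = defaultdict(dict)

    def queue(self, path: Path, name: str, data: str) -> None:
        # Called at assertion time instead of writing to disk immediately.
        self._pending[path][name] = data

    def flush(self) -> int:
        # Called once at the end of the session; returns the number of files written.
        written = 0
        for path, snapshots in self._pending.items():
            path.parent.mkdir(parents=True, exist_ok=True)
            # Placeholder format (an assumption), one entry per snapshot name.
            body = "\n".join(
                f"# name: {name}\n{data}" for name, data in sorted(snapshots.items())
            )
            path.write_text(body + "\n", encoding="utf-8")
            written += 1
        self._pending.clear()
        return written
```

Each snapshot file is then written once per session rather than once per assertion, which narrows the window for concurrent writes but does not by itself coordinate multiple xdist workers.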
I'm re-examining where we do the reads/writes in an effort to solve #539. I'm hoping that work will also solve the concurrency issue with pytest-xdist. |
I'm also using pytest-xdist and want to use syrupy in my test suite. I run into a different problem. Just having syrupy installed will break pytest when using the pytest-xdist option
If I uninstall syrupy the error goes away. I have not added any syrupy-specific code to any of my tests.
|
There's been some more attention around this issue. It's been prioritized as the next item to work on -- actually moved to In Progress. I'll post updates as they come up. |
@noahnu, I'm coming from pytest-dev/pytest#9600, where I experienced the same issues as @rassie. I didn't quite manage to isolate whether the issue was in syrupy, pytest or pytest-xdist at first, but it's good that I found this thread 👍 Do you have any updates on the progress of this issue, or a point where you left off that we might pick up from to resolve it? :) |
I haven't had a chance to work on this just yet. I previously refactored some code to boost performance. Buffering the writes should still be a feasible fix here. If you're interested in taking this on, I think we'd want to batch and then defer this operation https://github.com/tophat/syrupy/blob/ec4cbac3fa71eb7dad276005f9433ce26c2f1f9a/src/syrupy/assertion.py#L197 (the write_snapshot call), collect them for after the tests finish running, and flush them out to disk at the same time in some coordinated fashion (still need to look into pytest-xdist to see how that may be possible). The hooks are located in the Might actually take a stab at it now. |
I started digging into the xdist implementation, and this won't be as straightforward as I hoped. Definitely open to help/contributions here. It looks like each worker manages its own session. So where/when should we write snapshots? Can two workers get nodes from the same test file? In which case, we might have 2 workers trying to write to the same file? Even if we overcame that obstacle, we run into a problem with unused snapshot detection. We need to somehow defer our snapshot analysis until after all the workers have been cleaned up. |
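Since each worker runs its own session, one building block is telling a worker apart from the controller (or from a plain, non-xdist run). pytest-xdist documents the `PYTEST_XDIST_WORKER` environment variable, which is set to a name such as `gw0` inside worker processes. The helper names and the "only non-workers finalize" policy below are assumptions for illustration, not syrupy's design.

```python
import os
from typing import Mapping, Optional


def xdist_worker_id(environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the xdist worker name (e.g. "gw0"), or None outside a worker."""
    env = os.environ if environ is None else environ
    return env.get("PYTEST_XDIST_WORKER") or None


def should_finalize_snapshots(environ: Optional[Mapping[str, str]] = None) -> bool:
    # Hypothetical policy: only the controller, or a plain pytest run,
    # performs the final write and the unused-snapshot analysis.
    return xdist_worker_id(environ) is None
```

This only answers "who am I"; it does not settle whether two workers can receive tests from the same file, which is the open question raised above.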
Sounds great, thanks for the update. I haven't dived into pytest-xdist or syrupy yet, as this is my first dive into such details and into OSS contributions! I'll gladly explore this, but it might take a bit more time to get into. I'll join the Discord as well for discussions, which is great anyway! Is there a starting point I can work from, based on what you looked at yesterday? Writer/Reader/session handler? I might be able to contribute some constructive ideas once I dive a bit more into the pytest-xdist details |
Will try take a deeper look this weekend but what needs to be clarified:
If I can't solve it immediately, I'll do some refactoring in syrupy to hopefully make it easier to reason about. I've been meaning to do some serious refactoring to the syrupy internals for a while and have been planning it for a v2 release since it'll affect plugin development. |
From the looks of it, it looks like from what I can see in e.g. So with a RR schedule and an initial batch, the test items are distributed equally to each worker/node, with the Could we maybe use the return value for this to figure out which tests are done? The |
When running all the tests the performance degrades quickly due to the number of threads growing with every test. Use pytest-xdist plugin to run the tests concurrently in different processes. Splitting the execution across multiple workers significantly reduces the impact of performance issues when spawning multiple apps. Syrupy does not support xdist: syrupy-project/syrupy#535. Move to pytest-snapshot since that is the only snapshot library that supports xdist. Update snapshot reporter plugin to work with pytest-snapshot.
Use pytest-xdist plugin to run the tests concurrently in different processes. Splitting the execution across multiple workers significantly reduces the time to run the tests. Syrupy does not support xdist: syrupy-project/syrupy#535. Move to pytest-snapshot since that is the only snapshot library that supports xdist. Update snapshot reporter plugin to work with pytest-snapshot.
I've started some refactoring to defer the snapshot writes until after all tests finish running. Previously we were writing snapshots as part of the test case assertions. At least for writes, I think we can grab a file lock in the temp directory belonging to the xdist controller, essentially following the docs from https://pytest-xdist.readthedocs.io/en/latest/how-to.html#making-session-scoped-fixtures-execute-only-once. Ensuring the "unused" snapshot logic still works will be a bit trickier. One naive solution would be to have each worker write the unused snapshot names to some temp file and then have the controller somehow do the cleanup by merging the unused files (i.e. a test file is only considered unused if it appears in every worker's file, so we essentially take the intersection of all the unused lists). |
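The naive merge described above can be illustrated with a small sketch. It assumes each worker dumps the names it considers unused into its own JSON file in a shared directory; the file naming scheme and the function names are hypothetical and not part of syrupy or pytest-xdist.

```python
import json
from pathlib import Path
from typing import Iterable, Set


def write_worker_report(report_dir: Path, worker_id: str, unused: Iterable[str]) -> Path:
    """Each worker records the snapshot names it did not see used."""
    report_dir.mkdir(parents=True, exist_ok=True)
    path = report_dir / f"unused-{worker_id}.json"
    path.write_text(json.dumps(sorted(unused)), encoding="utf-8")
    return path


def merge_unused(report_dir: Path) -> Set[str]:
    """Controller side: a name is unused only if every worker reported it."""
    reports = [
        set(json.loads(p.read_text(encoding="utf-8")))
        for p in sorted(report_dir.glob("unused-*.json"))
    ]
    if not reports:
        return set()
    return set.intersection(*reports)
```

A snapshot used by any one worker drops out of the intersection, so it is never deleted just because the other workers did not run the test that owns it.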
Taking a stab at this over the weekend! ⚒️ |
@mcataford you'll want to work off of the next branch. I was running into performance issues which will need to be tackled first. We're running benchmarks against the main branch, so you can use those benchmarks as a reference. |
Circling back from post-weekend updates here -- I've experimented and done some diagramming to explore and understand the order-of-operation, hooks and where changes should fit in. Haven't reached a point where I have changes I can push up, but working up to it. |
Heya 👋 Just wanted to share some hacky ideas in case they can be of help: (1) I understand that this would be a breaking change; that being said, a migration path that reads existing snapshot files and splits them into the new file-per-test layout would probably be easier than exploring the innards of the pytest-xdist plugins. (2) -- Ngl, it seems to me that the Python testing tools ecosystem is fragmented to a point where it hinders the maintenance and stabilization of a rich & coherent suite of tools like in JavaScript. Treating other plugins as a black box seems to me like a way to reduce headaches. |
Working my way through this. #535 (comment) is addressed by #667 (going into the next branch / v4.0.0). It looks like pytest-xdist expects all pytest options to be serializable. This is not a requirement of pytest itself so I'd argue it's really a pytest-xdist limitation. That being said, it's easy enough to fix so I've added the change to the v4 release. It's a bit too niche to cover with a test but may loop back to figure out how to prevent regressions. I've added a comment in the code for now. |
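As a rough way to spot option values that might trip the serialization requirement, one could check whether every value is built only from basic built-in types. This is an assumption about what "serializable" means here; the exact set of types pytest-xdist accepts is not stated in this thread, and the function and option names below are hypothetical.

```python
from typing import Any, Dict, List

# Scalar types assumed to be safe to send between processes (an assumption).
_SCALARS = (type(None), bool, int, float, str, bytes)


def is_plain_data(value: Any) -> bool:
    """True if value consists only of basic built-in scalars and containers."""
    if isinstance(value, _SCALARS):
        return True
    if isinstance(value, (list, tuple, set, frozenset)):
        return all(is_plain_data(item) for item in value)
    if isinstance(value, dict):
        return all(is_plain_data(k) and is_plain_data(v) for k, v in value.items())
    return False


def non_plain_options(options: Dict[str, Any]) -> List[str]:
    """Return the names of options whose values are not plain data."""
    return sorted(name for name, value in options.items() if not is_plain_data(value))
```

For instance, an option that holds a class object rather than its dotted name as a string would be flagged by this check.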
The JSON extension works like this but I suspect it also doesn't work too well with pytest-xdist. There's a use case for 1 file per test case, but I can see good arguments against it, for example consider the use of parametrization. You probably don't want 1000s of snapshot files per test case, especially since these are expected to be committed to version control. I'm trying my best to see if syrupy can be refactored/improved to the point where minimal work is required for pytest-xdist compatibility. Using this as an opportunity to also clean up the architecture a bit.
pytest-xdist already has the concept of a controller process. I'm hoping to piggyback on this. We'll see. Whatever we do to ensure pytest-xdist support, it shouldn't negatively impact our non-xdist users.
Syrupy itself doesn't leverage any "hacks" to do its job. It 100% respects the pytest plugin API. It's pytest-xdist which is doing some odd things to add parallelization to pytest. This breaks some assumptions when working with the pytest API. This is why we can't just treat pytest-xdist as a black box, because it does in fact change the way tests are run. |
Describe the bug
I'm using pytest-xdist to run my tests in parallel (mostly a single test parameterized for a vast amount of different test data). Since I wanted to try snapshot-based testing, I've added syrupy to the mix. After initial creation of __snapshots__/mytest.ambr (using --snapshot-update), it disappears from disk at some point during the test run. Without pytest-xdist (i.e. without using the -n auto option) this doesn't happen.
To reproduce
I don't have a clean reproduction, only an observation. My setup is a single test for a bit of processing, parameterized with a couple of hundred different input files. While the .ambr file appears initially, after about 80% of the tests are done, the file disappears.
Expected behavior
Snapshot files get updated with correct data without disappearing, even when used with pytest-xdist or pytest-parallel.
Environment:
Additional context
I assume the problem happens because each test execution writes to the file individually and thus a race condition happens. Not sure what can be done about it, maybe buffer the snapshot file somewhere and write it only at the end of the test run? There are surely better ideas. I assume that e.g. the JUnit reporter works correctly with pytest-xdist, maybe check what it does?
Either way, I wanted to bring this to your attention, would be happy if there was a short- to middle-term solution. If not, I'll run the suite without parallelization, it's not that big of a deal at the moment, at least for me :)