Race conditions when running multiple Sphinx instances using the same doctree #2946
Comments
Well, I've tried to give it a shot… and I can't find a sane way of doing it with the current design. The big problem is, the use of pickles is done quite inconsistently. The pickled environment is first read in To do that, I would have to keep the environment file open during that period. That's quite doable; I was thinking of opening it directly in The problem with all that is that Python doesn't do RAII properly, so there's no clean way of ensuring that the file will be closed on exception without redesigning how
Agreed. As you said, Sphinx is not designed to be invoked in parallel. About locking, period from
The problem is that we are actually using two different builders (manpage and HTML), so
Well, the lack of parallelism is not as much of an issue as the risk of a race condition resulting in malformed documents or a crash.
fwiw, this affects https://github.com/pazz/alot as well. I maintain the AUR package and am currently just passing -j1 to make. -j1 isn't a big deal for alot, because it only uses make for the docs, but it took a little work to track down what was going on, and the output of a user's build log suggests reporting a bug, if for nothing else than better error messages:
This is affecting the QEMU project as well. See the build failure at:
sphinx-build is buggy when multiple processes are using the same doctree directory in parallel. See the 3-year-old Sphinx bug report at:

sphinx-doc/sphinx#2946

Instead of avoiding parallel builds or adding some kind of locking, I'm using the simplest solution: just using a different doctree cache for each builder.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20191014150133.14318-1-ehabkost@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
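The "different doctree cache for each builder" workaround described in the commit message above can be sketched roughly as follows. This is only an illustration, not the actual QEMU change: the helper name `sphinx_build_argv` and the `<doctree_root>/<builder>` layout are hypothetical, while `-b` (builder) and `-d` (doctree directory) are the usual `sphinx-build` options.

```python
import os


def sphinx_build_argv(builder, srcdir, outdir, doctree_root):
    """Build a sphinx-build command line with a per-builder doctree dir.

    Hypothetical helper: each builder gets its own cache under
    doctree_root, so parallel runs never touch the same pickles.
    """
    doctree_dir = os.path.join(doctree_root, builder)
    return ["sphinx-build", "-b", builder, "-d", doctree_dir, srcdir, outdir]


# Two independent make targets can now run in parallel safely,
# at the cost of parsing the sources once per builder.
html_cmd = sphinx_build_argv("html", "docs", "build/html", "build/doctrees")
man_cmd = sphinx_build_argv("man", "docs", "build/man", "build/doctrees")
```

The trade-off is the one raised in the original report: the doctree cache is no longer shared between builders, so each builder re-reads the sources.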
Projects using Sphinx (and Makefiles) often define multiple targets for different doc builders. For example, the LLVM project defines a target for manpages and for HTML docs. Since those targets are independent, make is allowed to run them in parallel (and does that). Since they use the same source tree, they use the same doctree root as well.
Lately, I've noticed that the two parallel Sphinx instances 'fight' over the doctree. Basically, it seems that they both ignore each other and create the doctrees independently, i.e. the same files are first written by one Sphinx instance, and afterwards overwritten by the second. While normally this doesn't cause problems (except for being awfully inefficient), I've actually seen a race condition cause a major failure (sorry, I don't have any logs for it). Basically, one Sphinx instance had opened one of the files for reading while the other was writing it.
We've attempted to find a good solution on the build system side but could find none that was satisfactory. If we switched to separate doctrees, that would be a waste of a good caching opportunity. Getting two independent targets serialized is not trivial in CMake, and LLVM people aren't happy with me adding fake dependencies to enforce serialization. But even if we did that, there's still the general problem that other people will be unaware of the issue and will keep hitting it in the future.
Therefore, I think it would be best to fix it on the Sphinx side. The simplest solution would be to lock some file in the doctree directory. Considering the specific design, the environment pickle should be the best file to lock. However, that would require redesigning the application to keep the file open rather than closing it immediately after reading/writing (which would also help avoid race conditions).
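As a rough illustration of the locking idea proposed above, here is a minimal sketch that holds an exclusive lock on a pickle file for the whole read-modify-write cycle. It is not Sphinx code: `locked_file` and `update_environment` are hypothetical names, the "environment" here is a plain dict, and `fcntl.flock` is POSIX-only and advisory, so it only helps if every process takes the lock.

```python
import contextlib
import fcntl
import os
import pickle


@contextlib.contextmanager
def locked_file(path):
    """Open path read/write (creating it if missing) under an exclusive lock.

    The file stays open, and therefore locked, until the with-block exits,
    including when an exception is raised inside it.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    f = os.fdopen(fd, "r+b")
    try:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until other holders release
        yield f
    finally:
        f.close()  # closing the file releases the lock


def update_environment(path, update):
    """Load the pickled dict, apply update(env), and write it back in place."""
    with locked_file(path) as f:
        data = f.read()
        env = pickle.loads(data) if data else {}
        update(env)
        f.seek(0)
        f.truncate()
        pickle.dump(env, f)
    return env
```

A second process calling `update_environment` on the same path would wait at `flock` until the first one has finished writing, so it never reads a half-written pickle. The context manager also covers the "closed on exception" concern, although fitting it into the existing application flow is a separate question.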