-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a Valgrind build on debug Python 3.9 #2746
Conversation
Valgrind being linux-only answers the latter. As for python 3.9.1 only, it's the cleanest CPython ever, so let's not make our own lives harder than they need to be. |
I really don't see the point. The pytest errors that we currently see in the second run, inside valgrind, is due to |
The main idea was that 1) our tests are pretty fast (once compiled) and 2) we don't have any debug build currently, so I thought there was some value in first checking if tests pass, and only then checking Valgrind's output. Maybe once Valgrind is more quiet, it'll be more obvious that Valgrind's output won't mess up the interpretation of the tests too much, though. |
59079c4
to
4889a59
Compare
deadsnakes offers this; we don't need to build anything with pyenv after all! Submitted deadsnakes/action#18 to be able to do this with 3 lines of a GitHub action! |
Maybe not. While we'd get a complete report after each test, it makes testing the start up and shut down impossible. We have had shutdown related leaks before.
Alright, let's keep it for now. |
56289c8
to
b756519
Compare
1c093ea
to
864e920
Compare
There's a few distinct warnings:
|
I'd like to eventfully move over to running PyTest via CTest instead of CMake. If we do that, then a |
We could add a |
Does that support more than one suppressions file? Right now, we have 3 or 4.
That should be easy to do.
Do you mean in addition to the
It could, yes. |
That could works, yes. I'm getting used to the manual version now as well; it's not as bad as it looks, but still.
That's more or less what I was thinking, but if feels a bit like a hack. If it helps to get perspective, fitting in ASAN and UBSAN after this might also become part of the plan. |
Fixed this one!
I'm not seeing this in CI, now, though? |
On a closer look, that one happens during |
Nah, and definitely not in the context of this PR, I'd say. It's also in |
As long as we run the other parts here (cpptest and such), then we can drop the debug build we have on the other one (or change it). |
I was thinking about doing it the other way around, having an few |
9165ceb
to
33bf6ba
Compare
33bf6ba
to
8abf0da
Compare
This seems to be working. Happy with this, @henryiii, @bstaletic? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay, everyone. I wasn't at home much during the holyday season.
This seems to be working. Happy with this, @henryiii, @bstaletic?
I am.
Reviewed 1 of 4 files at r7, 1 of 5 files at r9, 1 of 3 files at r23, 2 of 3 files at r24, 3 of 6 files at r26, 1 of 4 files at r27, 2 of 3 files at r28, 1 of 2 files at r30, 1 of 1 files at r31.
Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @EricCousineau-TRI, @henryiii, and @YannickJadoul)
.github/workflows/ci.yml, line 207 at r29 (raw file):
Previously, henryiii (Henry Schreiner) wrote…
working-directory: valgrind run: | sudo make install sudo apt-get update && sudo apt-get install libc6-dbg # Needed by Valgrind
Done.
include/pybind11/pybind11.h, line 499 at r15 (raw file):
Previously, YannickJadoul (Yannick Jadoul) wrote…
See #2756.
This was fixed, right? If so, please resolve the discussion on Reviewable.
include/pybind11/pybind11.h, line 502 at r19 (raw file):
Previously, YannickJadoul (Yannick Jadoul) wrote…
This can still leak, I have just realized.
It's very annoying that we cannot just have a std::unique_ptr<detail::function_record, void (*)(detail::function_record *)>, with destruct as deleter. The issue is that we cannot do this before all the char *s are strduped, and that 1) is only done in initialize_generic, and 2) the repr calls during strdupcan also throw.
Ditto.
tests/pybind11_tests.h, line 83 at r16 (raw file):
Previously, YannickJadoul (Yannick Jadoul) wrote…
For reference, this was the old code, without
py::exec
(and thus without requiring<pybind11/eval.h>
to be included inpybind11_tests.h
and thus in all tests).@bstaletic slightly prefers this, and I think I do to. However, curious to learn what other reviewers think.
/// Simplified ``with warnigns.catch_warnings()`` wrapper template <typename F> void ignoreOldStyleInitWarnings(F &&body) { auto message = "pybind11-bound class '.+' is using an old-style placement-new '(?:__init__|__setstate__)' which has been deprecated"; auto category = py::reinterpret_borrow<py::object>(PyExc_FutureWarning); auto warnings = py::module_::import("warnings"); auto context_mgr = warnings.attr("catch_warnings")(); context_mgr.attr("__enter__")(); warnings.attr("filterwarnings")("ignore", py::arg("message")=message, py::arg("category")=category); body(); // Exceptions in `body` not handled; see PEP 343 when these would need to be added context_mgr.attr("__exit__")(py::none(), py::none(), py::none()); }
This discussion should be "resolved"/decided. Right?
tests/test_virtual_functions.py, line 254 at r13 (raw file):
Previously, YannickJadoul (Yannick Jadoul) wrote…
Here you go:
=11345== Invalid read of size 1 ==11345== at 0x47CC63: subtype_dealloc (typeobject.c:1343) ==11345== by 0x4641AD: _Py_Dealloc (object.c:2209) ==11345== by 0x456E22: _Py_DECREF (object.h:430) ==11345== by 0x456E22: _Py_XDECREF (object.h:497) ==11345== by 0x456E22: free_keys_object (dictobject.c:598) ==11345== by 0x456D1D: dictkeys_decref (dictobject.c:333) ==11345== by 0x456D1D: dict_dealloc (dictobject.c:2026) ==11345== by 0x4641AD: _Py_Dealloc (object.c:2209) ==11345== by 0x43CBCF: _Py_DECREF (object.h:430) ==11345== by 0x43CB1B: frame_dealloc (frameobject.c:591) ==11345== by 0x4641AD: _Py_Dealloc (object.c:2209) ==11345== by 0x51E9E5: _Py_DECREF (object.h:430) ==11345== by 0x51E9E5: _Py_XDECREF (object.h:497) ==11345== by 0x51E9E5: tb_dealloc (traceback.c:167) ==11345== by 0x4641AD: _Py_Dealloc (object.c:2209) ==11345== by 0x432CEA: _Py_DECREF (object.h:430) ==11345== by 0x432CEA: BaseException_clear (exceptions.c:78) ==11345== by 0x4342E6: BaseException_dealloc (exceptions.c:88) ==11345== by 0x4641AD: _Py_Dealloc (object.c:2209) ==11345== by 0x4742E6: _Py_DECREF (object.h:430) ==11345== by 0x4742E6: _Py_XDECREF (object.h:497) ==11345== by 0x4742E6: tupledealloc (tupleobject.c:237) ==11345== by 0x4641AD: _Py_Dealloc (object.c:2209) ==11345== by 0x456CBB: _Py_DECREF (object.h:430) ==11345== by 0x456CBB: _Py_XDECREF (object.h:497) ==11345== by 0x456CBB: dict_dealloc (dictobject.c:2018) ==11345== by 0x4641AD: _Py_Dealloc (object.c:2209) ==11345== by 0x476595: _Py_DECREF (object.h:430) ==11345== by 0x47F1F1: subtype_clear (typeobject.c:1186) ==11345== by 0x52E694: delete_garbage (gcmodule.c:1004) ==11345== by 0x52D806: collect (gcmodule.c:1273) ==11345== by 0x52D28F: collect_with_callback (gcmodule.c:1387) ==11345== by 0x52D23C: collect_generations (gcmodule.c:1442) ==11345== by 0x52CF2A: _PyObject_GC_Alloc (gcmodule.c:2242) ==11345== by 0x52CE47: _PyObject_GC_Malloc (gcmodule.c:2252) ==11345== Address 0x77b12b9 is 185 bytes inside a block of size 936 free'd ==11345== at 0x4C33078: free (vg_replace_malloc.c:538) ==11345== by 0x467EA2: _PyMem_RawFree (obmalloc.c:127) ==11345== by 0x468F91: PyObject_Free (obmalloc.c:709) ==11345== by 0x52D1C0: PyObject_GC_Del (gcmodule.c:2317) ==11345== by 0x47EFE4: type_dealloc (typeobject.c:3486) ==11345== by 0xA92DA84: pybind11_meta_dealloc (class.h:231) ==11345== by 0x4641AD: _Py_Dealloc (object.c:2209) ==11345== by 0xA9220FB: _Py_DECREF (object.h:430) ==11345== by 0xA92E6FC: pybind11_object_dealloc (class.h:436) ==11345== by 0x47CC62: subtype_dealloc (typeobject.c:1337) ==11345== by 0x4641AD: _Py_Dealloc (object.c:2209) ==11345== by 0x456E22: _Py_DECREF (object.h:430) ==11345== by 0x456E22: _Py_XDECREF (object.h:497) ==11345== by 0x456E22: free_keys_object (dictobject.c:598) ==11345== by 0x456D1D: dictkeys_decref (dictobject.c:333) ==11345== by 0x456D1D: dict_dealloc (dictobject.c:2026) ==11345== by 0x4641AD: _Py_Dealloc (object.c:2209) ==11345== by 0x43CBCF: _Py_DECREF (object.h:430) ==11345== by 0x43CB1B: frame_dealloc (frameobject.c:591) ==11345== by 0x4641AD: _Py_Dealloc (object.c:2209) ==11345== by 0x51E9E5: _Py_DECREF (object.h:430) ==11345== by 0x51E9E5: _Py_XDECREF (object.h:497) ==11345== by 0x51E9E5: tb_dealloc (traceback.c:167) ==11345== by 0x4641AD: _Py_Dealloc (object.c:2209) ==11345== by 0x432CEA: _Py_DECREF (object.h:430) ==11345== by 0x432CEA: BaseException_clear (exceptions.c:78) ==11345== by 0x4342E6: BaseException_dealloc (exceptions.c:88) ==11345== by 0x4641AD: _Py_Dealloc (object.c:2209) ==11345== by 0x4742E6: _Py_DECREF (object.h:430) ==11345== by 0x4742E6: _Py_XDECREF (object.h:497) ==11345== by 0x4742E6: tupledealloc (tupleobject.c:237) ==11345== by 0x4641AD: _Py_Dealloc (object.c:2209) ==11345== by 0x456CBB: _Py_DECREF (object.h:430) ==11345== by 0x456CBB: _Py_XDECREF (object.h:497) ==11345== by 0x456CBB: dict_dealloc (dictobject.c:2018) ==11345== Block was alloc'd at ==11345== at 0x4C31ECB: malloc (vg_replace_malloc.c:307) ==11345== by 0x467566: _PyMem_RawMalloc (obmalloc.c:99) ==11345== by 0x468168: PyObject_Malloc (obmalloc.c:685) ==11345== by 0x52CE7B: _PyObject_GC_Alloc (gcmodule.c:2225) ==11345== by 0x52CE47: _PyObject_GC_Malloc (gcmodule.c:2252) ==11345== by 0x47638A: PyType_GenericAlloc (typeobject.c:1047) ==11345== by 0x479D4A: type_new (typeobject.c:2596) ==11345== by 0x4799EC: type_call (typeobject.c:1014) ==11345== by 0x42DE98: _PyObject_MakeTpCall (call.c:191) ==11345== by 0x42DC17: _PyObject_FastCallDictTstate (call.c:113) ==11345== by 0x42DA26: PyObject_VectorcallDict (call.c:142) ==11345== by 0x6567F1: builtin___build_class__ (bltinmodule.c:232) ==11345== by 0x62B2AC: cfunction_vectorcall_FASTCALL_KEYWORDS (methodobject.c:442) ==11345== by 0x4D5792: _PyObject_VectorcallTstate (abstract.h:118) ==11345== by 0x4D5792: PyObject_Vectorcall (abstract.h:127) ==11345== by 0x4D5792: call_function (ceval.c:5072) ==11345== by 0x4D5792: _PyEval_EvalFrameDefault (ceval.c:3518) ==11345== by 0x4CBDF5: _PyEval_EvalFrame (pycore_ceval.h:40) ==11345== by 0x4CBDF5: _PyEval_EvalCode (ceval.c:4327) ==11345== by 0x42E767: _PyFunction_Vectorcall (call.c:395) ==11345== by 0x42F6F7: PyVectorcall_Call (call.c:242) ==11345== by 0x42F4A9: _PyObject_Call (call.c:265) ==11345== by 0x42F3F9: PyObject_Call (call.c:292) ==11345== by 0x4D7B7D: do_call_core (ceval.c:5120) ==11345== by 0x4D5F24: _PyEval_EvalFrameDefault (ceval.c:3580) ==11345== by 0x4CBDF5: _PyEval_EvalFrame (pycore_ceval.h:40) ==11345== by 0x4CBDF5: _PyEval_EvalCode (ceval.c:4327) ==11345== by 0x42E767: _PyFunction_Vectorcall (call.c:395) ==11345== by 0x42F7AD: PyVectorcall_Call (call.c:230) ==11345== by 0x42F4A9: _PyObject_Call (call.c:265)
One free was from within pybind11's metaclass. The other was at shutdown when cpython is cleaning up after itself.
tests/valgrind-numpy-scipy.supp, line 2 at r19 (raw file):
Previously, YannickJadoul (Yannick Jadoul) wrote…
Definitely! I just don't believe either @bstaletic or I have looked further than "this already occurs just importing NumPy, so let's suppress these".
Right. Reporting bugs to CPython and/or NumPy shouldn't block this pull request, however. (As long as they aren't actual blockers.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @bstaletic, @EricCousineau-TRI, and @henryiii)
.github/workflows/ci.yml, line 197 at r18 (raw file):
Previously, henryiii (Henry Schreiner) wrote…
Let's pull this into three steps, one that downloads and one that compiles valgrind if the cache misses, and one that (always) installs valgrind. Then we can use working-directory and drop the pushd/popd.
Done by @henryiii
.github/workflows/ci.yml, line 198 at r29 (raw file):
Previously, henryiii (Henry Schreiner) wrote…
Also, technically, I think you could also just cd then not go back, since you don't use the outer directory in the next line
Done by @henryiii
include/pybind11/pybind11.h, line 499 at r15 (raw file):
Previously, bstaletic (Boris Staletic) wrote…
This was fixed, right? If so, please resolve the discussion on Reviewable.
Yes, indeed!
include/pybind11/pybind11.h, line 231 at r19 (raw file):
Previously, YannickJadoul (Yannick Jadoul) wrote…
Hmmm, myes, hadn't thought of that yes. But to
common.h
, then?
If you ask me, I'd only do this when we'd need to (because there's another use case), though?
Let's consider this resolved. If other things need this, we can still move it to common.h
at a later point?
include/pybind11/pybind11.h, line 502 at r19 (raw file):
Previously, bstaletic (Boris Staletic) wrote…
Ditto.
Yes, thanks! Cutting out these other PRs has left this in a bit of a weird state :-)
tests/pybind11_tests.h, line 83 at r16 (raw file):
Previously, bstaletic (Boris Staletic) wrote…
This discussion should be "resolved"/decided. Right?
It's fine like this, I think!
tests/test_virtual_functions.py, line 254 at r13 (raw file):
Previously, bstaletic (Boris Staletic) wrote…
One free was from within pybind11's metaclass. The other was at shutdown when cpython is cleaning up after itself.
I'll resolve this, but once the PR gets merged, we'll still revert this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @bstaletic, @EricCousineau-TRI, and @henryiii)
Not at all :-) You're the one that keeps telling me "open source, no deadlines" ;-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @EricCousineau-TRI and @henryiii)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does my approval count?
Yeeeeeeeeeeah. At least the two of us together share a full approval, no? For the things the other has done :-) This is just CI and CMake anyway, and @henryiii approved, and @EricCousineau-TRI's comments were handled in #2756, so .. let's get Valgrind support for all PRs ;-) |
Thanks a lot for all the shared work, @bstaletic! :-) |
Congratulations and thanks a lot @YannickJadoul & @bstaletic! This is a game changer for future developments. |
Thanks! :-) Let's hope so. Almost forgot (since that's longer ago), but thanks for all your reviews as well, @rwgk! |
Follow-up in #2797. Do we also want to report the leaks to Python/NumPy? We could create an issue until someone feels like it? @EricCousineau-TRI suggested this, and @bstaletic reacted. What do you think? |
Description
Finally, together with @bstaletic, we've finally taken the time and effort to add a build with Valgrind; currently still failing, because of #2742.
We might also need to suppress some Python errors reported by Valgrind, but let's take it step by step.
Things to fix:
Things to decide:
Things to tweak in CMake/CI config:
valgrind cmake --build build --target pytest
doesn't work (we don't want to run Valgrind on CMake). Do we want to include this into the CMake config?The fact thatvalgrind python
also doesn't work withpyenv
(it's bash script wrapper/launcher), and we need to use this$(pyenv which python)
doesn't make things simpler either.Oh, yes, and:
Things for follow-up PRs:
test_dispatch_issue
workarounds and debug and fix the underlying issue.Closes #2578
Suggested changelog entry:
(@henryiii, is this something we actually want to add to add to the changelog?)
This change is