-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch Python Binding to pybind11 Instead of Boost.Python #293
Comments
I would be glad to help you for this. |
That would be sweet. Needing to install boost is the number 1 problem new dlib users have. So this would be a very well received update :) I've just looked at pybind11 but haven't done anything yet. |
@edmBernard Are you still working on this? We still get almost daily questions from people having trouble installing boost.python (e.g. #473) so switching to pybind11 would be pretty sweet :) |
sorry I don't already got time to moving forward on this task but that's still on my project. |
Yeah I saw that was missing. Guess it's not a simple find/replace. Still totally worth it though :) |
yeah I thinks too. |
with the diff : here |
I begin to switch to pybind11, my change until now :
I need to replace : boost::python::vector_indexing_suite
boost::python::throw_error_already_set
boost::python::extract<T>().check() // I made a try/catch to replace that but that's not clean my change are here almost all change are just replace |
after some reflexion, I think it's will be easier to use standard conversion already implemented in pybind11 for basic type like eigen/matrix, vector. |
I'm fine with whatever pybind11 does for std::vector, but I definitely don't want to be converting back and forth between dlib::matrix and eigen's matrix. That will make the interface unnecessarily computationally expensive. I also don't want to add another dependency (eigen). Are you sure dlib::matrix even uses the vector_indexing_suite? I don't think it does. |
I see |
Is there some tests for the python binding ? |
It's used in a few places, but I don't think for the matrix. There are a huge number of C++ tests in dlib/test. But nothing for python (other than the example programs). Which is kinda bad. |
some progression : |
@davisking Is there a reason why you use |
I don't remember. It's probably due to some limitation in boost.python.
|
Is anyone working on this? I did some unrelated evaluations on my own and now made some progress. The pybind11 binding builds an so, i am currently testing this against the python examples... i'm quite optimistic at this point. @edmBernard can i take this over? |
As far as I am aware no one is currently working on this. So if you want to do it that would be super. The need to install boost is a huge pain point for many dlib python users. |
Sorry for that, I forgot a bit this binding :( , There is lot's of work and I don't know enough pybind11 at some point. @davisking I don't know if there is evolution, we also need to find a alternative to boost.serializer. |
Great. I have adapted all currently available modules to pybind11 and it builds without warnings and generates a shared library (on recent pybind11 + python 3.6). I will check against your progress @edmBernard, thank you. At this point my preliminary todo items are:
My questions to you @davisking at this point would be:
|
Awesome. Yes, please squash your commits to a single commit. Also make sure there aren't whitespace changes or other non-substantive changes in there. Sometimes people submit PRs that contain a whole lot of non-changes which makes them difficult to review. Running the examples isn't enough testing. There are parts of the API that don't have any examples and there isn't a unit test suite on the python side, which is not great. There is a huge unit test suite in C++, but that doesn't help us in this case. Anyway, just try your best to make sure the whole API is working. Adding explicit unit tests for python would be cool if you are willing, but I wouldn't require it. You can make the pull request whenever you want. But I don't want to spend much time reviewing it until you think it's fully working. So if you make a PR then let me know when it's ready for review. Or you can wait, however, you want to do it is fine. I would almost certainly bundle pybind11 inside dlib/external. If it was a separate download I'm certain many users would mess that up. I went through the same thing with libpng and libjpeg, which are very easily installed but users just kept complaining about them until I finally put copies in dlib/external. |
I chose to make a preliminary PR and will let you know when it's ready for review. Regarding testing: I think starting with a pytest suite is the way to go. Since one has to implement quite a few python scripts anyway to do the kind of testing you'd like to see. The overhead of writing a pytest instead of an manual/interactive test session is relatively low. But writing a test (whether in pytest or by doing manual runs) for all modules is a lot of work. So my questions are:
My approach would be to start with the core modules (basic, vector, matrix, image, etc.) and focus on data conversion, compatibility with the vector_indexing_suite and if the bindings actually satisfy the docstrings (what parameter-values, -types and -order are allowed). And work on use-case modules (detectors, svm, etc.) in a second step since those are much better covered by the examples. Does that make sense? |
The most popular parts of dlib's python API are the modules with example programs. Beyond that, I don't think there is much to prioritize any particular part. Maybe the cca routines. But most things are used by the examples. Yes, certainly the main goal is to not break existing python scripts. If you are willing to setup the python unit tests that would be amazing. :) Yes, definitely the approach you are suggesting is the way to go. |
Also, when you are ready I'll see if I can get some other dlib users to try out your PR and see if it breaks anything. |
Yes, some additional testing by other users would be very useful. I'm still going through the code to fix a few issues and will start next week with the unit-tests. I'm not sure if we should wait with the review much longer than end of next week though, or at least at that point i will have a clear idea if anything unexpected has been popping up. I want to have the basic testing environment running before you give the PR to other users so we can implement their feedback as regressions tests. |
Sounds good. Let me know when you are ready and I'll send out the word. |
I found one backward-breaking issue yesterday during testing of the serialization. Within the python examples this was never really exposed, there is only a single dump that is never loaded. According to http://pybind11.readthedocs.io/en/stable/advanced/classes.html#pickling-support and my tests support this:
I hope at this point, that this is not a show-stopper. |
What's the issue exactly, just that with pybind11 if you call dumps() and
don't specify a protocol version (or give 1) it will just crash? Is that
only on python 2.7 while working correctly in python 3?
Does it really just crash the python interpreter? If we can make it give
an error rather than crashing, or even worse, silently corrupting data,
that would be fine.
|
In my case it segfaults the python interpreter and i think there is no way around it (except of course investing the time in fixing the issue in pybind11). It only happens in python 2.7... i have not seen it happen once in python 3. |
Also have a look here: pybind/pybind11#186 |
Ok, don't worry about it. We can just update the documentation to warn
about this and I think it will be alright. It's a small price to pay to
remove the dependency on boost.
|
Ran into this problem and it was very hard to find the reason until I arrived here. Can a warning be printed when pickle is used or cPickle with the wrong protocol version? |
We discussed this when switching to pybind11 but were unable to find a way to make it print a warning. It seems like it ought to be possible though. But yes, it's not great. |
I've recently found out about pybind11, which is essentially a header only reimplementation of Boost.Python with C++11, and without any dependency on boost. It also has superior documentation compared to Boost.Python, and while it's API is very similar to Boost.Python it does seem a little cleaner.
Over the last year, it seems like the vast majority of people complaining about difficulties building dlib have actually been complaining about difficulties installing/building boost. So it would be great to get rid of Boost.Python and use pybind11. It doesn't look like too much work since the pybind11 API is very similar to Boost.Python.
The text was updated successfully, but these errors were encountered: