Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add .keys and .values to bind_map #3310

Merged
merged 4 commits into from
Oct 1, 2021

Conversation

bmerry
Copy link
Contributor

@bmerry bmerry commented Sep 27, 2021

Description

Both of these implement views (rather than just iterators), and .items
is also upgraded to a view. In practical terms, this allows a view to be
iterated multiple times and have its size taken, neither of which works
with an iterator.

The views implement __len__, __iter__, and the keys view implements
__contains__. Testing membership also works in item and value views
because Python falls back to iteration. This won't be optimal
for item values since it's linear rather than O(log n) or O(1), but I
didn't fancy trying to get all the corner cases to match Python
behaviour (tuple of wrong types, wrong length tuple, not a tuple etc).

Missing relative to Python dictionary views is __reversed__ (only
added to Python in 3.8). Implementing that could break code that binds
custom map classes which don't provide rbegin/rend (at least without
doing clever things with SFINAE), so I've not tried.

The size increase on my system is 131072 bytes, which is rather large
(5%) but also suspiciously round (2^17) and makes me suspect some
quantisation effect.

Suggested changelog entry:

Improve the classes generated by ``bind_map``:
- Change ``.items`` from an iterator to a dictionary view.
- Add ``.keys`` and ``.values`` (both dictionary views).
- Allow ``__contains__`` to take any object.

@bmerry
Copy link
Contributor Author

bmerry commented Sep 27, 2021

CC @henryiii @rainwoodman since this is a follow-on to #3271 / #3293.

@bmerry
Copy link
Contributor Author

bmerry commented Sep 27, 2021

I've realised that one limitation of the __contains__ implementation is that testing presence of an object of the wrong type will raise TypeError rather than returning false. That's consistent with what the map class itself currently does, but inconsistent with Python (even with type annotations: typeshed says that __contains__ takes an object rather than the key type). What do you think of adding a fallback implementation of __contains__ to both keys_view and the mapping class itself which returns false when given a general object?

@henryiii
Copy link
Collaborator

That sounds reasonable to me. You could do a cast and have one impl, or have two.

@henryiii
Copy link
Collaborator

Wait, what increases in size? This should only affect code that uses std_bind to bind maps? Are you just looking at the test size increase (which will be larger due to more tests)?

@bmerry
Copy link
Contributor Author

bmerry commented Sep 27, 2021

Wait, what increases in size? This should only affect code that uses std_bind to bind maps? Are you just looking at the test size increase (which will be larger due to more tests)?

Yes, the shared library for the tests. There isn't actually any new code C++ code for tests (there is new code in the .py files to test the extra bindings).

@henryiii
Copy link
Collaborator

But only code that uses this increases in size (which the tests do), so personally, I think it's probably fine.

@bmerry bmerry force-pushed the bind-map-keys-values branch from 571bff8 to a32db87 Compare September 29, 2021 07:30
@henryiii
Copy link
Collaborator

henryiii commented Sep 30, 2021

Were you going to handle arbitrary inputs to __contains__? Say with a py::object argument that you then cast, and return false otherwise?

@bmerry
Copy link
Contributor Author

bmerry commented Sep 30, 2021

Were you going to handle arbitrary inputs to contains? Say with a py::object argument that you then cast, and return false otherwise?

Thanks for reminding me. Done.

@henryiii
Copy link
Collaborator

Wouldn't this potentially be smaller in binary form and have simpler signatures if you just had a single function with a py::object argument that you then cast, and return false otherwise? What was the effect on the binary test size, by the way?

@bmerry
Copy link
Contributor Author

bmerry commented Sep 30, 2021

Wouldn't this potentially be smaller in binary form and have simpler signatures if you just had a single function with a py::object argument that you then cast, and return false otherwise? What was the effect on the binary test size, by the way?

I'll take a look when I can - unfortunately I have some other commitments for the next week. I went with overloading because I couldn't see a way to do it in one function without a cast_error exception if the object isn't a KeyType, and using exceptions for non-error control flow is frowned on in C++. I'm also not familiar enough with the differences between casting in overload resolution versus py::cast to be sure they're equivalent (when it comes to things like converting versus non-converting casts).

@henryiii
Copy link
Collaborator

Okay, let's go with this unless someone wants to try something else. It could later be optimized if that does end up helping. The goal is to release 2.8 on or before Python 3.10 (Monday), and this is the one feature that would be nice to get in, since it builds on the existing ones. This technically was started before our feature freeze on Monday. @rwgk or @Skylion007 what do you think? We should also get one more Google test run with this in before releasing this weekend / Monday.

@Skylion007
Copy link
Collaborator

On one hand, it was started before our feature freeze. On the other hand, this new functionality could add breaking functionality, @rwgk What does Google Testing say about this PR?

@rwgk
Copy link
Collaborator

rwgk commented Sep 30, 2021

On one hand, it was started before our feature freeze. On the other hand, this new functionality could add breaking functionality, @rwgk What does Google Testing say about this PR?

I support this PR and will run the Google testing asap (today, result late tonight or tomorrow morning).

bmerry and others added 4 commits September 30, 2021 12:26
Both of these implement views (rather than just iterators), and `.items`
is also upgraded to a view. In practical terms, this allows a view to be
iterated multiple times and have its size taken, neither of which works
with an iterator.

The views implement `__len__`, `__iter__`, and the keys view implements
`__contains__`. Testing membership also works in item and value views
because Python falls back to iteration. This won't be optimal
for item values since it's linear rather than O(log n) or O(1), but I
didn't fancy trying to get all the corner cases to match Python
behaviour (tuple of wrong types, wrong length tuple, not a tuple etc).

Missing relative to Python dictionary views is `__reversed__` (only
added to Python in 3.8). Implementing that could break code that binds
custom map classes which don't provide `rbegin`/`rend` (at least without
doing clever things with SFINAE), so I've not tried.

The size increase on my system is 131072 bytes, which is rather large
(5%) but also suspiciously round (2^17) and makes me suspect some
quantisation effect.
Add extra overload of `__contains__` (for both the map itself and
KeysView) which takes an arbitrary object and returns false.
@rwgk rwgk force-pushed the bind-map-keys-values branch from b03ce2a to 9422f17 Compare September 30, 2021 19:27
@rwgk
Copy link
Collaborator

rwgk commented Sep 30, 2021

Internally Google has a special way of handling header files, which exposed this small oversight:

ERROR: /google/src/cloud/rwgk/pybind11_pr_testing/google3/third_party/pybind11/BUILD:36:11: Compiling third_party/pybind11/include/pybind11/stl_bind.h failed: (Exit 1) wrapped_clang failed: error executing command third_party/crosstool/v18/stable/toolchain/bin/wrapped_clang '-DSOABI="cpython-37m-x86_64-linux-gnu"' '-DABIFLAGS="m"' '-DMULTIARCH="x86_64-linux-gnu"' -iquote . -iquote ... (remaining 240 arguments skipped).  [forge_remote_host=jatb4]
./third_party/pybind11/include/pybind11/stl_bind.h:638:5: error: use of undeclared identifier 'py'
    py::class_<KeysView> keys_view(
    ^
... (many more) ...

Interactive testing (before submitting for global testing) passes after removing 5 py::. I pushed the change to this PR, CI is running. I also rebased on current master.

Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Google tests look good. The code looks good to me, too. I support including this in the 2.8 release, but I think it's best to leave it up to @henryiii to merge.

@henryiii
Copy link
Collaborator

henryiii commented Oct 1, 2021

I think it's fine to go in. Let's view master as 2.8 RC. Full version will come out Monday.

@henryiii henryiii merged commit b3573ac into pybind:master Oct 1, 2021
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Oct 1, 2021
@henryiii henryiii removed the needs changelog Possibly needs a changelog entry label Oct 4, 2021
paulinus added a commit to paulinus/OpenSfM that referenced this pull request Nov 15, 2021
Summary:
`Map` data such as `shots` is exposed through view classes. We need to keep the map alive as long as some of its objects are alive in python.  For this to happen, we need
- make the views keep alive the map
- make the shots keep alive the view

A better solution will be to use pybind's own bindings for `unordered_map`. However, this is only complete in pybind 2.8 pybind/pybind11#3310

Differential Revision: D32401524

fbshipit-source-id: 3fc30e710a2151dbb5097de8fe685ddfa80df675
facebook-github-bot pushed a commit to mapillary/OpenSfM that referenced this pull request Nov 15, 2021
Summary:
Pull Request resolved: #822

`Map` data such as `shots` is exposed through view classes. We need to keep the map alive as long as some of its objects are alive in python.  For this to happen, we need
- make the views keep alive the map
- make the shots keep alive the view

A better solution will be to use pybind's own bindings for `unordered_map`. However, this is only complete in pybind 2.8 pybind/pybind11#3310

Reviewed By: fabianschenk

Differential Revision: D32401524

fbshipit-source-id: 126141a7f02abe27fd376ae63fc788e87565dc0e
Comment on lines +638 to +643
class_<KeysView> keys_view(
scope, ("KeysView[" + name + "]").c_str(), pybind11::module_local(local));
class_<ValuesView> values_view(
scope, ("ValuesView[" + name + "]").c_str(), pybind11::module_local(local));
class_<ItemsView> items_view(
scope, ("ItemsView[" + name + "]").c_str(), pybind11::module_local(local));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This naming seems to produce "erroneous" names for stub generation and subsequent parsing by Pylance or mypy, etc... Also sphinx autodoc has issues parsing it:

reading sources... [ 40%] references/python
WARNING: invalid signature for autoclass ('depthai::BoardConfig.ItemsView[GPIOMap]')

Thoughts on renaming to *View_{name}, with an underscore? Seems to work well afterwards.

Example
Before:

class GPIOMap:
    def __init__(self) -> None: ...
    def items(self) -> BoardConfig.ItemsView[GPIOMap]: ...
    def keys(self) -> BoardConfig.KeysView[GPIOMap]: ...
    def values(self) -> BoardConfig.ValuesView[GPIOMap]: ...
    def __bool__(self) -> bool: ...
    @overload
    def __contains__(self, arg0: int) -> bool: ...
    @overload
    def __contains__(self, arg0: object) -> bool: ...
    def __delitem__(self, arg0: int) -> None: ...
    def __getitem__(self, arg0: int) -> BoardConfig.GPIO: ...
    def __iter__(self) -> Iterator: ...
    def __len__(self) -> int: ...
    def __setitem__(self, arg0: int, arg1: BoardConfig.GPIO) -> None: ...

class ItemsView[GPIOMap]:
    def __init__(self, *args, **kwargs) -> None: ...
    def __iter__(self) -> Iterator: ...
    def __len__(self) -> int: ...
...

After:

class GPIOMap:
    def __init__(self) -> None: ...
    def items(self) -> BoardConfig.ItemsView_GPIOMap: ...
    def keys(self) -> BoardConfig.KeysView_GPIOMap: ...
    def values(self) -> BoardConfig.ValuesView_GPIOMap: ...
    def __bool__(self) -> bool: ...
    @overload
    def __contains__(self, arg0: int) -> bool: ...
    @overload
    def __contains__(self, arg0: object) -> bool: ...
    def __delitem__(self, arg0: int) -> None: ...
    def __getitem__(self, arg0: int) -> BoardConfig.GPIO: ...
    def __iter__(self) -> Iterator: ...
    def __len__(self) -> int: ...
    def __setitem__(self, arg0: int, arg1: BoardConfig.GPIO) -> None: ...

class ItemsView_GPIOMap:
    def __init__(self, *args, **kwargs) -> None: ...
    def __iter__(self) -> Iterator: ...
    def __len__(self) -> int: ...
...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@themarpe sounds good. Please add a PR.

Copy link
Collaborator

@Skylion007 Skylion007 Jun 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@themarpe Actually, hold on. Why is it having any issues? It should be generating a typing.KeysView . Is there some other reason here why it is struggling to parse it correctly?

Copy link

@themarpe themarpe Jun 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Skylion007 Thanks - will do so soon.
EDIT: Opened the PR

I think its mainly from the *View[name] naming. I think the square brackets gets at it. Works okay for mypy & Sphinx when renamed to something different

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants