Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add kDLBool type #114

Merged
merged 5 commits into from
Dec 4, 2022
Merged

Add kDLBool type #114

merged 5 commits into from
Dec 4, 2022

Conversation

leofang
Copy link
Contributor

@leofang leofang commented Nov 10, 2022

Close #75. Close #76. Supersedes #76.

It turns out we all forgot @alonre24 has already pushed a PR (#76), so all I did was minor edits in the docstrings and sync with the latest master, with his commit preserved.

cc: @tqchen

@alonre24
Copy link
Contributor

Close #75. Close #76. Supersedes #76.

It turns out we all forgot @alonre24 has already pushed a PR (#76), so all I did was minor edits in the docstrings and sync with the latest master, with his commit preserved.

cc: @tqchen

LGTM
Glad that this suggestion has finally approved :)

include/dlpack/dlpack.h Outdated Show resolved Hide resolved
include/dlpack/dlpack.h Outdated Show resolved Hide resolved
@leofang
Copy link
Contributor Author

leofang commented Dec 4, 2022

@tqchen shall we merge and make a new release (before #113 is merged)?

@tqchen tqchen merged commit 198fb62 into dmlc:main Dec 4, 2022
@leofang leofang deleted the Support_bool_type branch December 5, 2022 00:03
@seberg
Copy link
Contributor

seberg commented Dec 8, 2022

@tqchen was it correct that if we use the new Bool dtype with a size of 1-bit, than the storage would be assumed to actually be packed into a single bit?
Or was the intention that bit-size is padded to full bytes?!

There is some need for passing bit-masks around in the dataframe community, so it would be good to clarify this use-case, if valid.

@tqchen
Copy link
Member

tqchen commented Dec 8, 2022

in this case i think it should be ideally represented as Bool(bit=1, lanes=32), which represent a 32bit bitmask. The total size of the datatype can still be aligned to bytes

I am not too sure if we want to specify Bool(bit=1, lanes=1), since that can be machine dependent. On a 8bit-byte machine I guess we might still want to pad to minimum byte. But say if a machine have bit-level addressing then it would be onebit.

@seberg
Copy link
Contributor

seberg commented Dec 8, 2022

@tqchen to be honest, I have always had trouble understanding how lanes are to be used. If I have say a 1111 elements in my array and bits does it work to say that size=1111 but dtype=Bool(bit=1, lanes=8) (or something larger, since we mainly want to signal that its byte stored clearly)?

@tqchen
Copy link
Member

tqchen commented Dec 8, 2022

The lane represents the lanes of the unit-data type.

Say we want to store a bit mas, which is represented by int32. To store 65 bits, we will need 3 integers, in this case, it is

array(dtype=Bool(bits=1, lanes=32), size=3)

in the low level, so we have 3 * 32 bits in total.

If we store bool as a normal byte, and to store 65 bools, we need

array(dtype=Bool(bits=8, lanes=1), size=65)

@seberg
Copy link
Contributor

seberg commented Dec 8, 2022

@tqchen but that is a problem, because how do I pass the 65 bits information there since 3 * 32 > 65 and the 65 is vital information!?

@tqchen
Copy link
Member

tqchen commented Dec 8, 2022

@seberg I get what you mean. I feel that could be something being addressed by enhancing the array information to include sub-byte boundary information. I am mainly describing what is being interpreted from the spec right now in a way that is also mostly consistent with compilers like LLVM

@seberg
Copy link
Contributor

seberg commented Dec 9, 2022

I guess that the use-case I was asking for could use/abuse "lanes", because there is a side-channel to pass the actual shape. But it doesn't feel ideal to me, so I am wondering if we can think of a pragmatic way to make this possible.
(To be honest, I have never seen "lanes" used, or dtypes relying on being padded to byte storage.)

@seberg
Copy link
Contributor

seberg commented Dec 12, 2022

Maybe it would make sense to either introduce a new dtype or some sort of flag for bitmasks?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[DISCUSS] Support kDLBool type
4 participants