-
Notifications
You must be signed in to change notification settings - Fork 763
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Faster conversion from u128/i128 to PyLong with abi3 #1315
Conversation
71e4eb8
to
07bade4
Compare
Now I understand why I thought this approach is too complex: I was thinking of using |
07bade4
to
a93d97b
Compare
src/types/num.rs
Outdated
let pybytes: &PyBytes = num | ||
.call_method("to_bytes", (bytes.len(), "little"), kwargs(py, $is_signed))? | ||
.call_method("to_bytes", (bytes.len(), "little"), kwargs)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems to revert a refactoring, or did you want to change the FromPy part as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since now the kwargs
function is called once, I moved it in the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. It's 14 more lines of code total for a more efficient solution so I think this is a desirable change. I expect abi3 builds to become quite common once we have them working.
I think it'd be quite good to have a test here to be sure that we've got the right solution without any edge cases.
proptest
can be pretty nice for tests like this where data is passed in a roundtrip. I think we can have a test like this, for example:
use proptest::prelude::*;
proptest! {
#[test]
fn test_i128_roundtrip(x: i128) {
Python::with_gil(|py| {
let x_py = x.into_py(py);
py_run!(py, x_py, format!("assert x_py == {}", x));
let roundtripped: i128 = x_py.extract().unwrap();
assert_eq!(x, roundtripped);
})
}
}
and similar for u128.
src/types/num.rs
Outdated
.call_method("from_bytes", (bytes, "little"), kwargs(py, $is_signed)) | ||
.expect("Integer conversion (u128/i128 to PyLong) failed") | ||
.into_py(py) | ||
let (first, last) = split_128int(self.to_le_bytes()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I suggest we use the terminology lower
/ upper
instead of first
/ last
? For me it's easier to understand which bits are most significant / least significant with that terminology.
let (first, last) = split_128int(self.to_le_bytes()); | |
let (lower, upper) = split_128int(self.to_le_bytes()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I wanted better names too 😉
Just a status update: renamed some variables and re-implemented PyLong to i128/u128 conversion using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Couple more suggestions...
src/types/num.rs
Outdated
let le_bytes = self.to_le_bytes(); | ||
let lower = u64::from_le_bytes(slice_to_bytearr(&le_bytes[..BYTE_SIZE / 2])); | ||
let upper = | ||
<$half_type>::from_le_bytes(slice_to_bytearr(&le_bytes[BYTE_SIZE / 2..])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of ideas:
- might want to use
split_at
? - using
TryInto
+unwrap()
is an option here, the compiler can statically verify the slice size so will optimize the panic away:
let le_bytes = self.to_le_bytes(); | |
let lower = u64::from_le_bytes(slice_to_bytearr(&le_bytes[..BYTE_SIZE / 2])); | |
let upper = | |
<$half_type>::from_le_bytes(slice_to_bytearr(&le_bytes[BYTE_SIZE / 2..])); | |
use std::convert::TryInto; | |
let le_bytes = self.to_le_bytes(); | |
let (lower, upper) = le_bytes.split_at(BYTE_SIZE / 2); | |
let lower = u64::from_le_bytes(lower.try_into().unwrap()); | |
let upper = <$half_type>::from_le_bytes(upper.try_into().unwrap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks nice 👍🏽
py, | ||
ffi::PyNumber_Index(ob.as_ptr()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the call to PyNumber_Index
is still needed before calling PyLong_AsUnsignedLongLongMask
, so that types which implement __index__
are converted to PyLong
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, PyLong_AsUnsigned...
calls it internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah perfect, thanks for checking!
1a1ba74
to
5c1a2f2
Compare
Added an overflow test for i128. |
There should also be a test with values in the |
Agreed; if we're not proptesting we should make sure the tests we do have cover for mistakes like halves being mixed up. We know it's correct now but who knows what will happen next time I try to refactor this code 👀 |
Why are you using all the le bytes methods when you can just use: let v: u128 = ...;
let low_half = v as u64;
let high_half = (v >> 64) as u64; and let v: i128 = ...;
let low_half = v as u64; // note unsigned
let high_half = (v >> 64) as i64; and then on the python side: (high_half << 64) | low_half |
Good test values are: let u128_test_value = 0xFF0102030405060788090A0B0C0D0E0Fu128;
let i128_test_value = u128_test_value as i128; since it tests the halves are in the right order with the right signedness. |
😅 great suggestion, thanks! I think when this implementation started with |
Wow, thanks. It seems like I wrote too many TeX documents to forget some bit hacks 🙄 |
5c1a2f2
to
445d932
Compare
Added a basic proptest but is this sufficient? |
bytes.copy_from_slice(pybytes.as_bytes()); | ||
Ok(<$rust_type>::from_le_bytes(bytes)) | ||
-1 as _, | ||
ffi::PyLong_AsUnsignedLongLongMask(ob.as_ptr()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you'd probably want an assertion of some sort that PyLong_UnsignedLongLongMask
returns the expected type, since C/C++ don't guarantee that it's u64
-- it could be u128
or some other type on an unusual OS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm 🤔 , but even the Rust stdlib does not support it https://doc.rust-lang.org/nightly/std/os/raw/type.c_ulonglong.html.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docs are just showing the type used on the system they used to build the docs (x86_64-unknown-linux-gnu
iirc), it doesn't mean it will always be u64
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For an example, see https://doc.rust-lang.org/nightly/std/os/raw/type.c_long.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant the description:
The C standard technically only requires that this type be an unsigned integer with the size of a long long, although in practice, no system would have a long long that is not a u64, as most systems do not have a standardised u128 type.
Also, it has no #[cfg
block:
So I think we don't need to consider the system where ull is u128 for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, yeah. I did check both the AS/400 and RISC-V 128-bit compilers (or at least their ABI proposals/docs) and they both have a 64-bit ull -- AS/400 just doesn't have a 128-bit int type (no intptr_t at all), and RV128 proposes long long long
for 128-bit types.
You accidentally changed |
Btw, sorry to sound all negative, thanks for all your work! |
other than all that, looks good to me! |
No worry, thank you for your comments 👍🏼 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just need to chmod -x src/lib.rs
before merging. Not sure how that happened 🤔
445d932
to
cd7348f
Compare
Thanks!
Maybe my emacs setting does something wrong, but I'm also not sure 😓 |
Fixes #1314
I hope it's correct...
Also, adding 30 lines is acceptable for such a minor feature?