Faster conversion from u128/i128 to PyLong with abi3 #1315

Merged: 4 commits, Dec 19, 2020

kngwyu
Member

@kngwyu kngwyu commented Dec 13, 2020

Fixes #1314
I hope it's correct...

Also, is adding 30 lines acceptable for such a minor feature?

@kngwyu kngwyu force-pushed the abi3-128bit-integer branch 2 times, most recently from 71e4eb8 to 07bade4 (December 13, 2020 08:45)
@kngwyu
Member Author

kngwyu commented Dec 13, 2020

Now I understand why I thought this approach was too complex: I was thinking of using lshift and add, but lshift and or work better.
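
A minimal sketch of the two recombination options in plain Rust, not the PyO3 code (`join_or` and `join_add` are hypothetical names used only for illustration). It assumes the lower half is an unsigned 64-bit value: after shifting the upper half left by 64, the low 64 bits are zero, so OR and add produce the same result and OR cannot carry into the upper half.

```rust
/// Recombine two 64-bit halves with shift and OR.
fn join_or(upper: u64, lower: u64) -> u128 {
    ((upper as u128) << 64) | (lower as u128)
}

/// Recombine the same halves with shift and add.
/// Equivalent here only because `lower` is zero-extended.
fn join_add(upper: u64, lower: u64) -> u128 {
    ((upper as u128) << 64) + (lower as u128)
}
```

For any pair of `u64` halves, `join_or(upper, lower) == join_add(upper, lower)` should hold, including the extreme case where both halves are `u64::MAX`.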

src/types/num.rs Outdated
  let pybytes: &PyBytes = num
-     .call_method("to_bytes", (bytes.len(), "little"), kwargs(py, $is_signed))?
+     .call_method("to_bytes", (bytes.len(), "little"), kwargs)?
Member

this seems to revert a refactoring, or did you want to change the FromPy part as well?

Member Author

Since now the kwargs function is called once, I moved it in the function.

Member

@davidhewitt davidhewitt left a comment

Thanks. It's 14 more lines of code total for a more efficient solution so I think this is a desirable change. I expect abi3 builds to become quite common once we have them working.

I think it'd be quite good to have a test here to be sure that we've got the right solution without any edge cases.

proptest can be pretty nice for tests like this where data is passed in a roundtrip. I think we can have a test like this, for example:

use proptest::prelude::*;

proptest! {
    #[test]
    fn test_i128_roundtrip(x: i128) {
        Python::with_gil(|py| {
            let x_py = x.into_py(py);
            py_run!(py, x_py, format!("assert x_py == {}", x));
            let roundtripped: i128 = x_py.extract().unwrap();
            assert_eq!(x, roundtripped);
        })
    }
}

and similar for u128.

src/types/num.rs Outdated
-     .call_method("from_bytes", (bytes, "little"), kwargs(py, $is_signed))
-     .expect("Integer conversion (u128/i128 to PyLong) failed")
-     .into_py(py)
+ let (first, last) = split_128int(self.to_le_bytes());
Member

Can I suggest we use the terminology lower / upper instead of first / last? For me it's easier to understand which bits are most significant / least significant with that terminology.

Suggested change
- let (first, last) = split_128int(self.to_le_bytes());
+ let (lower, upper) = split_128int(self.to_le_bytes());

Member Author

Thanks, I wanted better names too 😉

@kngwyu
Member Author

kngwyu commented Dec 13, 2020

Just a status update: renamed some variables and re-implemented PyLong to i128/u128 conversion using PyLong_AsUnsignedLongLongMask and shift.
Proptest looks good but I'm still reading the documentation.

Member

@davidhewitt davidhewitt left a comment

Nice! Couple more suggestions...

src/types/num.rs Outdated
Comment on lines 244 to 247
let le_bytes = self.to_le_bytes();
let lower = u64::from_le_bytes(slice_to_bytearr(&le_bytes[..BYTE_SIZE / 2]));
let upper =
<$half_type>::from_le_bytes(slice_to_bytearr(&le_bytes[BYTE_SIZE / 2..]));
Member

A couple of ideas:

  • might want to use split_at ?
  • using TryInto + unwrap() is an option here, the compiler can statically verify the slice size so will optimize the panic away:
Suggested change
- let le_bytes = self.to_le_bytes();
- let lower = u64::from_le_bytes(slice_to_bytearr(&le_bytes[..BYTE_SIZE / 2]));
- let upper =
-     <$half_type>::from_le_bytes(slice_to_bytearr(&le_bytes[BYTE_SIZE / 2..]));
+ use std::convert::TryInto;
+ let le_bytes = self.to_le_bytes();
+ let (lower, upper) = le_bytes.split_at(BYTE_SIZE / 2);
+ let lower = u64::from_le_bytes(lower.try_into().unwrap());
+ let upper = <$half_type>::from_le_bytes(upper.try_into().unwrap());
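
For readers who want to try the `split_at` + `TryInto` idea outside the macro, here is a self-contained sketch. The function names `split_u128_le` and `split_i128_le` are hypothetical, and the literal `8` stands in for `BYTE_SIZE / 2` from the suggestion; this is not the code that was merged.

```rust
use std::convert::TryInto;

/// Split a u128 into (lower, upper) 64-bit halves via its little-endian bytes.
fn split_u128_le(v: u128) -> (u64, u64) {
    let le_bytes = v.to_le_bytes();
    // Little-endian: the first 8 bytes hold the least significant half.
    let (lower, upper) = le_bytes.split_at(8);
    (
        u64::from_le_bytes(lower.try_into().unwrap()),
        u64::from_le_bytes(upper.try_into().unwrap()),
    )
}

/// Same for i128: the lower half stays unsigned, the upper half carries the sign.
fn split_i128_le(v: i128) -> (u64, i64) {
    let le_bytes = v.to_le_bytes();
    let (lower, upper) = le_bytes.split_at(8);
    (
        u64::from_le_bytes(lower.try_into().unwrap()),
        i64::from_le_bytes(upper.try_into().unwrap()),
    )
}
```

Each `try_into()` converts an 8-byte slice into `[u8; 8]`, so the `unwrap()` cannot fail for these inputs.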

Member Author

Looks nice 👍🏽

py,
ffi::PyNumber_Index(ob.as_ptr()),
Member

I think the call to PyNumber_Index is still needed before calling PyLong_AsUnsignedLongLongMask, so that types which implement __index__ are converted to PyLong?

Member Author

No, PyLong_AsUnsigned... calls it internally.

Member

Ah perfect, thanks for checking!

@kngwyu
Member Author

kngwyu commented Dec 13, 2020

Added an overflow test for i128.
As for proptest, I think max/min tests are strong enough in this case, so we don't really need randomized tests.

@birkenfeld
Member

There should also be a test with values in the i/u64 range, to ensure that the two halves are not mixed up.

@davidhewitt
Member

Agreed; if we're not proptesting we should make sure the tests we do have cover for mistakes like halves being mixed up. We know it's correct now but who knows what will happen next time I try to refactor this code 👀
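
A rough sketch of why values in the 64-bit range are useful here, written in plain Rust without PyO3. All three function names are hypothetical, and `join_swapped` is a deliberately wrong recombination added only to show what such a test would catch.

```rust
/// Split an i128 into an unsigned lower half and a signed upper half.
fn split_i128(v: i128) -> (u64, i64) {
    (v as u64, (v >> 64) as i64)
}

/// Correct recombination: the lower half is zero-extended before the OR.
fn join_i128(lower: u64, upper: i64) -> i128 {
    ((upper as i128) << 64) | (lower as i128)
}

/// Deliberately wrong: puts the lower half in the upper position.
fn join_swapped(lower: u64, upper: i64) -> i128 {
    ((lower as i128) << 64) | (upper as u64 as i128)
}
```

With the value `1`, the halves are `(1, 0)` and the swapped version yields `1 << 64`, so the mistake is visible. With `0` or `-1` both halves have the same bit pattern and the swap goes unnoticed, which is one reason a test should include asymmetric small values.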

@programmerjake
Contributor

programmerjake commented Dec 13, 2020

Why are you using all the le bytes methods when you can just use:

let v: u128 = ...;
let low_half = v as u64;
let high_half = (v >> 64) as u64;

and

let v: i128 = ...;
let low_half = v as u64; // note unsigned
let high_half = (v >> 64) as i64;

and then on the python side:

(high_half << 64) | low_half
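
The snippets above can be put together as a self-contained roundtrip sketch. The `join_*` functions model the Python-side expression `(high_half << 64) | low_half` in Rust; all function names are hypothetical and this is not the merged PyO3 code.

```rust
/// Split a u128 with a truncating cast and a shift.
fn split_u128(v: u128) -> (u64, u64) {
    let low_half = v as u64; // keeps only the low 64 bits
    let high_half = (v >> 64) as u64;
    (low_half, high_half)
}

/// Split an i128: low half unsigned, high half signed.
fn split_i128(v: i128) -> (u64, i64) {
    let low_half = v as u64; // note unsigned
    let high_half = (v >> 64) as i64; // arithmetic shift keeps the sign
    (low_half, high_half)
}

/// Rust model of `(high_half << 64) | low_half` for the unsigned case.
fn join_u128(low_half: u64, high_half: u64) -> u128 {
    ((high_half as u128) << 64) | (low_half as u128)
}

/// Rust model of `(high_half << 64) | low_half` for the signed case.
fn join_i128(low_half: u64, high_half: i64) -> i128 {
    ((high_half as i128) << 64) | (low_half as i128)
}
```

Keeping the low half unsigned matters in the signed case: a sign-extended low half would set every upper bit in the OR whenever its top bit is 1.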

@programmerjake
Contributor

Good test values are:

let u128_test_value = 0xFF0102030405060788090A0B0C0D0E0Fu128;
let i128_test_value = u128_test_value as i128;

since it tests the halves are in the right order with the right signedness.
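
To make the reasoning concrete, this sketch computes the halves of those test values in plain Rust. The constant and function names are hypothetical. The top bit is set in both halves, so a swapped order or a sign-extended low half would give a different number.

```rust
const U128_TEST_VALUE: u128 = 0xFF0102030405060788090A0B0C0D0E0F;
const I128_TEST_VALUE: i128 = U128_TEST_VALUE as i128;

/// (low, high) halves of the unsigned test value.
fn u128_halves() -> (u64, u64) {
    (U128_TEST_VALUE as u64, (U128_TEST_VALUE >> 64) as u64)
}

/// (low, high) halves of the signed test value; the high half is signed.
fn i128_halves() -> (u64, i64) {
    (I128_TEST_VALUE as u64, (I128_TEST_VALUE >> 64) as i64)
}
```

The low half is `0x88090A0B0C0D0E0F` and the high half is `0xFF01020304050607`, which is negative when read as an `i64`.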

@davidhewitt
Member

> Why are you using all the le bytes methods when you can just use:

😅 great suggestion, thanks! I think when this implementation started with to_bytes we got stuck on one idea too much!

@kngwyu
Member Author

kngwyu commented Dec 13, 2020

> Why are you using all the le bytes methods when you can just use:

Wow, thanks. It seems I've written so many TeX documents that I forgot some bit hacks 🙄

@kngwyu
Member Author

kngwyu commented Dec 13, 2020

Added a basic proptest but is this sufficient?

bytes.copy_from_slice(pybytes.as_bytes());
Ok(<$rust_type>::from_le_bytes(bytes))
-1 as _,
ffi::PyLong_AsUnsignedLongLongMask(ob.as_ptr()),
Contributor

I think you'd probably want an assertion of some sort that PyLong_AsUnsignedLongLongMask returns the expected type, since C/C++ don't guarantee that it's u64 -- it could be u128 or some other type on an unusual OS.

Member Author

Hmm 🤔, but even the Rust stdlib does not support it: https://doc.rust-lang.org/nightly/std/os/raw/type.c_ulonglong.html

Contributor

The docs are just showing the type used on the system they used to build the docs (x86_64-unknown-linux-gnu iirc), it doesn't mean it will always be u64.

Member Author

I meant the description:

> The C standard technically only requires that this type be an unsigned integer with the size of a long long, although in practice, no system would have a long long that is not a u64, as most systems do not have a standardised u128 type.

Also, it has no #[cfg] block:
[image]
So I think we don't need to consider systems where ull is u128 for now.

Contributor

ok, yeah. I did check both the AS/400 and RISC-V 128-bit compilers (or at least their ABI proposals/docs) and they both have a 64-bit ull -- AS/400 just doesn't have a 128-bit int type (no intptr_t at all), and RV128 proposes long long long for 128-bit types.
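
The thread settled on not handling a non-64-bit `unsigned long long`. For reference only, an assertion of the kind discussed could look like the sketch below. It is not part of this PR, `ulonglong_bits` is a hypothetical helper, and the `const` form assumes a Rust version that allows `assert!` in constants (1.57 or later).

```rust
use std::mem::size_of;
use std::os::raw::c_ulonglong;

// Compile-time check: the build fails if `unsigned long long` is not 64 bits wide.
const _: () = assert!(size_of::<c_ulonglong>() == size_of::<u64>());

/// Runtime view of the same fact, usable from a unit test.
fn ulonglong_bits() -> usize {
    size_of::<c_ulonglong>() * 8
}
```

A compile-time check costs nothing at runtime and would surface the problem on an unusual target before any conversion code runs.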

@programmerjake
Contributor

You accidentally changed src/lib.rs to an executable file.

@programmerjake
Contributor

Btw, sorry to sound all negative, thanks for all your work!

@programmerjake
Contributor

other than all that, looks good to me!

@kngwyu
Member Author

kngwyu commented Dec 13, 2020

> Btw, sorry to sound all negative, thanks for all your work!

No worry, thank you for your comments 👍🏼

@davidhewitt davidhewitt mentioned this pull request Dec 14, 2020
Member

@davidhewitt davidhewitt left a comment

Looks good! Just need to chmod -x src/lib.rs before merging. Not sure how that happened 🤔

@kngwyu
Member Author

kngwyu commented Dec 18, 2020

Thanks!

> Not sure how that happened

Maybe my emacs setting does something wrong, but I'm also not sure 😓

@kngwyu kngwyu merged commit e64dc12 into master Dec 19, 2020
@messense messense deleted the abi3-128bit-integer branch March 18, 2021 02:25
Development

Successfully merging this pull request may close these issues.

Use shifting and masking to convert between 128-bit Rust integers and Python integers
4 participants