string: implement API to access raw string data #1794
Thanks, this looks useful to me! I wonder whether eventually we'd use this internally in PyO3 too to potentially avoid causing strings to generate (and cache) their UTF-8 representation.
I have a bunch of small nitty comments below.
I'd also very much like to see some more tests of `PyStringData::to_string()`: it would be good to cover both the success case (which looks like it's covered) and the error cases (which don't appear to be) for each of the three character widths.
    Ucs4(&'a [u32]),
}

impl<'a> PyStringData<'a> {
As it doesn't look like any of these methods need to refer to `'a`, I think you can just do:
- impl<'a> PyStringData<'a> {
+ impl PyStringData<'_> {
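To illustrate the suggestion, here is a minimal standalone sketch. `StringData` and `unit_width` are hypothetical stand-ins invented for this example, not the types or methods in the PR. The point is only that an `impl` block whose methods never name the lifetime can use the anonymous `'_` instead of declaring `'a`.

```rust
/// Hypothetical stand-in for the enum under review (not the PyO3 definition).
pub enum StringData<'a> {
    Ucs1(&'a [u8]),
    Ucs2(&'a [u16]),
    Ucs4(&'a [u32]),
}

// None of the methods below mention the lifetime by name,
// so the anonymous lifetime `'_` is sufficient here.
impl StringData<'_> {
    /// Width in bytes of one stored code unit (illustrative helper).
    pub fn unit_width(&self) -> usize {
        match self {
            Self::Ucs1(_) => 1,
            Self::Ucs2(_) => 2,
            Self::Ucs4(_) => 4,
        }
    }
}
```

If a method later needs to return data borrowed for `'a`, the named form `impl<'a> StringData<'a>` would be required again.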
Also,
OK. I think the most recent push addresses most (all?) of your feedback. I did disagree with you on exposing I added tests for verifying As part of writing the tests, I noticed there may be a
With the recent implementation of non-limited unicode APIs, we're able to query Python's low-level state to access the raw bytes that Python is using to store string objects. This commit implements a safe Rust API for obtaining a view into Python's internals and representing the raw bytes Python is using to store strings. Not only do we allow accessing what Python has stored internally, but we also support coercing this data to a `Cow<str>`. Closes PyO3#1776.
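As a rough illustration of the idea (not the code in this PR), the sketch below models a view over one of three code-unit widths and converts it to a `Cow<str>`. The names `RawStringData` and `decode` are hypothetical, and the error type is a plain `String` rather than a Python exception. The sketch also assumes the 1-byte buffer can be checked as UTF-8, which lets that case borrow instead of allocate; a faithful decoder would need to handle Latin-1 bytes above 0x7F differently.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for a borrowed view of a string's raw storage.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RawStringData<'a> {
    Ucs1(&'a [u8]),
    Ucs2(&'a [u16]),
    Ucs4(&'a [u32]),
}

impl<'a> RawStringData<'a> {
    /// Convert the raw code units to text, borrowing when possible.
    /// `decode` is an illustrative name, not an API from the PR.
    pub fn decode(self) -> Result<Cow<'a, str>, String> {
        match self {
            // Assumption: validate the 1-byte buffer as UTF-8 so that
            // valid input can be borrowed without copying.
            Self::Ucs1(units) => std::str::from_utf8(units)
                .map(Cow::Borrowed)
                .map_err(|e| format!("invalid 1-byte data: {}", e)),
            // 2-byte units are decoded as UTF-16; lone surrogates fail.
            Self::Ucs2(units) => {
                let s = String::from_utf16(units)
                    .map_err(|e| format!("invalid 2-byte data: {}", e))?;
                Ok(Cow::Owned(s))
            }
            // 4-byte units must each be a valid Unicode scalar value.
            Self::Ucs4(units) => {
                let mut s = String::with_capacity(units.len());
                for &unit in units {
                    let c = char::from_u32(unit)
                        .ok_or_else(|| format!("invalid code point: {:#x}", unit))?;
                    s.push(c);
                }
                Ok(Cow::Owned(s))
            }
        }
    }
}
```

Under these assumptions each width has both a success path and a failure path, which is the shape of test coverage requested in the review: for example, `Ucs2(&[0xD800])` (a lone surrogate) and `Ucs4(&[0x110000])` (beyond the Unicode range) both return `Err`.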
Nice work, this looks good to me!