Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add new trait to encode values to bytes #274

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

antonilol
Copy link
Contributor

@antonilol antonilol commented Aug 17, 2024

still lots of TODOs

Pull Request

Related issues

#165, #257

What does this PR do?

Introduces a new trait that allows more efficient implementations of encoding values to bytes by allowing users to return any AsRef<[u8]> + Into<Vec<u8>> type, possibly borrowed.

TODOs

  • resolve TODOs in the code
  • fix tests
  • fix remaining deprecation warnings (first I only fixed the errors so the code compiles again)
  • fix cookbook

@antonilol antonilol force-pushed the minimal-alloc-encoding branch from 4c13edb to 98c1031 Compare August 17, 2024 14:07
@Kerollmops
Copy link
Member

Hey @antonilol 👋

Thank you again for your help and PR. I am wondering if we could follow a little more #257 and provide a "writer" (maybe start with a Vec<u8>) to experiment. So that we can return either owned or borrowed data but the writer could directly be a buffer provided by LMDB, cf: Database::put_reserved.

Would it be possible to:

@antonilol
Copy link
Contributor Author

I am wondering if we could follow a little more #257 and provide a "writer" (maybe start with a Vec<u8>) to experiment. So that we can return either owned or borrowed data but the writer could directly be a buffer provided by LMDB, cf: Database::put_reserved.

This is certainly possible, what I have in mind right now is a method that by default forwards to to_bytes, but can be optionally implemented if desired.

Would it be possible to:

With ZERO_COPY associated const and size hint function?

  • Directly remove the previous trait (not deprecate it)

I sure can, with the original names (BytesEncode, EItem, bytes_encode)?

@Kerollmops
Copy link
Member

With ZERO_COPY associated const and size hint function?

No, with the zero_copy method. Not the const one.

I sure can, with the original names (BytesEncode, EItem, bytes_encode)?

Yup, would be perfect.

@antonilol
Copy link
Contributor Author

antonilol commented Aug 17, 2024

No, with the zero_copy method. Not the const one.

Do you maybe mean the encode_writer method? Right now I have BytesEncode::bytes_encode_into_writer which is like it, it uses &mut Vec<u8> instead of &mut impl Write (for now because of error handling). I see no zero_copy method in that linked issue.

@antonilol antonilol force-pushed the minimal-alloc-encoding branch 3 times, most recently from bee195f to a5c96f1 Compare August 18, 2024 17:57
@antonilol
Copy link
Contributor Author

In a5c96f1 I used Either (a generic 2 value enum) to return different error types without using dynamic dispatch (dyn StdError), this could also solve the issue in #166 with the function that decodes 2 different types and thus needs to handle 2 different errors.

A new Either like 2 value enum instead of Either will work too if the dependency is an issue.

@Kerollmops
Copy link
Member

In a5c96f1 I used Either (a generic 2 value enum) to return different error types without using dynamic dispatch (dyn StdError), this could also solve the issue in #166 with the function that decodes 2 different types and thus needs to handle 2 different errors.

A new Either like 2 value enum instead of Either will work too if the dependency is an issue.

Thank you for that, but I would prefer we keep these PR changes only for trait improvement. So that it's easier to review as less code changes. I'm also not a huge fan of returning a Result<Cow<[u8]>, Either<Self::Error, io::Error>>, but let's not discuss this here but on #165 (or #166).

Do you maybe mean the encode_writer method? Right now I have BytesEncode::bytes_encode_into_writer which is like it, it uses &mut Vec<u8> instead of &mut impl Write (for now because of error handling). I see no zero_copy method in that linked issue.

I was talking about the zero_copy method I am talking about in my other comment.

@antonilol
Copy link
Contributor Author

Thank you for that, but I would prefer we keep these PR changes only for trait improvement. So that it's easier to review as less code changes. I'm also not a huge fan of returning a Result<Cow<[u8]>, Either<Self::Error, io::Error>>, but let's not discuss this here but on #165 (or #166).

When accepting std::io::Write implementation to write to errors need to be handled (returned) in some way. I can revert it back to using &mut Vec<u8> instead, for which extend_from_slice can be used (will never return errors) instead of Write::write_all, or leave out bytes_encode_into_writer completely.

I was talking about the zero_copy method I am talking about in my other comment.

Do you mean dynamically choosing to borrow or return an owned type? This is still possible: to implement the trait the return value (ReturnBytes) need to implement Into<Vec<u8>> and AsRef<[u8]>, which Cow<[u8]> does. No functionality is lost with this new trait.

@Kerollmops
Copy link
Member

When accepting std::io::Write implementation to write to errors need to be handled (returned) in some way. I can revert it back to using &mut Vec<u8> instead, for which extend_from_slice can be used (will never return errors) instead of Write::write_all, or leave out bytes_encode_into_writer completely.

Can you Box the error for now? We will figure that out later when we have a good trait.

Do you mean dynamically choosing to borrow or return an owned type? This is still possible: to implement the trait the return value (ReturnBytes) need to implement Into<Vec<u8>> and AsRef<[u8]>, which Cow<[u8]> does. No functionality is lost with this new trait.

In issue #257, we are talking about a BytesEncode::ZERO_COPY associated const type that is a boolean, allowing optimizations by avoiding allocating a vector. However, I proposed changing this const to a method instead BytesEncode::zero_copy so that we can dynamically decide whether this value can be used or must be serialized into a buffer.

@antonilol
Copy link
Contributor Author

Can you Box the error for now? We will figure that out later when we have a good trait.

Sure, done

In issue #257, we are talking about a BytesEncode::ZERO_COPY associated const type that is a boolean, allowing optimizations by avoiding allocating a vector. However, I proposed changing this const to a method instead BytesEncode::zero_copy so that we can dynamically decide whether this value can be used or must be serialized into a buffer.

Now I get it, thanks for the patience in explaining it :), added the method to the trait

@antonilol antonilol force-pushed the minimal-alloc-encoding branch from 4536a2f to 3519c51 Compare August 20, 2024 21:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants