Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AVRO-3892: [Rust] Support to resolve fixed from bytes and deserialize bytes in deserialize_any #2567

Merged
merged 7 commits into from
Oct 26, 2023

Conversation

ZENOTME
Copy link
Contributor

@ZENOTME ZENOTME commented Oct 24, 2023

What is the purpose of the change

(For example: This pull request improves file read performance by buffering data, fixing AVRO-XXXX.)

fixing AVRO-3892

This PR:

  1. support to resolve fixed from bytes
  2. support to deserialize bytes, fixed, decimal in deserialize_any

Verifying this change

  1. For resolving fixed from bytes, I have add related test.
  2. For deserializing bytes, fixed, decimal in deserialize_any, seems it don't have test for deserialize_any. If it required, we can consider to add test for whole deserialize_any. And actually I have test it in feat: support ser/deser of value  iceberg-rust#82 locally.

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@github-actions github-actions bot added the Rust label Oct 24, 2023
@@ -350,6 +354,7 @@ impl<'a, 'de> de::Deserializer<'de> for &'a Deserializer<'de> {
Value::String(ref s) => visitor.visit_bytes(s.as_bytes()),
Value::Bytes(ref bytes) | Value::Fixed(_, ref bytes) => visitor.visit_bytes(bytes),
Value::Uuid(ref u) => visitor.visit_bytes(u.as_bytes()),
Value::Decimal(ref d) => visitor.visit_bytes(&d.to_vec()?),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice to add new unit tests for the changes above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for review! I have added the test for it.

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
@@ -350,6 +354,7 @@ impl<'a, 'de> de::Deserializer<'de> for &'a Deserializer<'de> {
Value::String(ref s) => visitor.visit_bytes(s.as_bytes()),
Value::Bytes(ref bytes) | Value::Fixed(_, ref bytes) => visitor.visit_bytes(bytes),
Value::Uuid(ref u) => visitor.visit_bytes(u.as_bytes()),
Value::Decimal(ref d) => visitor.visit_bytes(&d.to_vec()?),
_ => Err(de::Error::custom(format!(
"Expected a String|Bytes|Fixed|Uuid, but got {:?}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Expected a String|Bytes|Fixed|Uuid, but got {:?}",
"Expected a String|Bytes|Fixed|Uuid|Decimal, but got {:?}",

martin-g and others added 2 commits October 25, 2023 10:16
…alue::Bytes

The tests are not really related to AVRO-3892. They do not cover the new
changes in deserialize_any()

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
#[test]
fn test_avro_3892_deserialize_bytes_from_uuid() -> TestResult {
let uuid_str = "10101010-2020-2020-2020-101010101010";
let expected_bytes = Uuid::parse_str(uuid_str)?.as_bytes().to_vec();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing may need to notice: I find that the behaviour of deserialize_str and deserialize_bytes for Value::Uuid is different. We can see that in deserialize_bytes for Value::Uuid, the bytes representation is Uuid::parse_str(uuid_str)?.as_bytes() rathern than uuid_str.as_bytes(). But for deserialize_str, the result will be uuid_str.as_bytes() I think. Is this behaviour is expect? cc @martin-g

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dunno!
I have to check what Uuid::as_bytes() does. I'll check tomorrow!
But feel free to propose a PR with a fix if you think it is wrong!


#[test]
fn test_avro_3892_deserialize_bytes_from_decimal() -> TestResult {
let expeced_bytes = BigInt::from(123456789).to_signed_bytes_be();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let expeced_bytes = BigInt::from(123456789).to_signed_bytes_be();
let expected_bytes = BigInt::from(123456789).to_signed_bytes_be();

lang/rust/avro/src/de.rs Show resolved Hide resolved
#[test]
fn test_avro_3892_deserialize_bytes_from_uuid() -> TestResult {
let uuid_str = "10101010-2020-2020-2020-101010101010";
let expected_bytes = Uuid::parse_str(uuid_str)?.as_bytes().to_vec();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dunno!
I have to check what Uuid::as_bytes() does. I'll check tomorrow!
But feel free to propose a PR with a fix if you think it is wrong!

@martin-g martin-g merged commit 8073145 into apache:main Oct 26, 2023
15 checks passed
@martin-g
Copy link
Member

Thank you, @ZENOTME !

martin-g pushed a commit that referenced this pull request Oct 26, 2023
… bytes in deserialize_any (#2567)

* support to resolve fixed from bytes

* support to deserialize bytes, fixed, decimal.

* fix clippy

* AVRO-3892: Rename test method

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>

* AVRO-3892: [Rust] Add unit tests for deserializing &str/String from Value::Bytes

The tests are not really related to AVRO-3892. They do not cover the new
changes in deserialize_any()

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>

* add unit test for deserialize bytes from decimal and uuid

* add more test

---------

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Co-authored-by: ZENOTME <st810918843@gmail.com>
Co-authored-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
(cherry picked from commit 8073145)
@ZENOTME ZENOTME deleted the bytes branch October 27, 2023 01:10
RanbirK pushed a commit to RanbirK/avro that referenced this pull request May 13, 2024
… bytes in deserialize_any (apache#2567)

* support to resolve fixed from bytes

* support to deserialize bytes, fixed, decimal.

* fix clippy

* AVRO-3892: Rename test method

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>

* AVRO-3892: [Rust] Add unit tests for deserializing &str/String from Value::Bytes

The tests are not really related to AVRO-3892. They do not cover the new
changes in deserialize_any()

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>

* add unit test for deserialize bytes from decimal and uuid

* add more test

---------

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Co-authored-by: ZENOTME <st810918843@gmail.com>
Co-authored-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants