Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUST-465 Introduce a bson::Uuid wrapper type #314

Merged
merged 17 commits into from
Oct 21, 2021

Conversation

patrickfreed
Copy link
Contributor

@patrickfreed patrickfreed commented Oct 19, 2021

RUST-465

This PR introduces a Uuid type which can be used to correctly serialize to / from BSON, regardless of whether bson::to_bson or bson::to_vec is used. This type requires no feature flags to be used, though the existing flag can be enabled to allow for easy conversion between it and uuid::Uuid (similar to bson::DateTime).

As part of this, I updated the UUID spec implementation to operate on the new bson::Uuid type rather than uuid::Uuid, and as a result, was able to move that out from behind a feature flag too. Since these methods haven't been included in a release yet, this isn't a breaking change.

When serialized/deserializing from other formats, I opted to have bson::Uuid use uuid::Uuid's implementation, rather than sticking with extended JSON (unlike bson::DateTime). Based on past experience, I think users will prefer this, but let me know if this doesn't seem like the right choice.

See #311 (comment) for more context on the motivation behind this change.

@@ -1076,148 +1073,6 @@ impl Binary {
}
}

/// Enum of the possible representations to use when converting between [Uuid](https://docs.rs/uuid/0.8.2/uuid/) and [`Binary`].
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was all moved to the uuid module. The main difference is that these methods now work with bson::Uuid instead of uuid::Uuid, and that they also can be used without a feature flag.

value: &T,
) -> crate::ser::Result<Bson>
where
T: Serialize,
{
value.serialize(self)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

serialization works by serializing a newtype with a name that this serializer can recognize (UUID_NEWTYPE_NAME). When it sees this type, it uses the output of uuid::Uuid's Serialize implementation (for the human readable serializer, a string) and turns it into a subtype 4 Binary.

The same thing is done for the raw serializer, except that it expects uuid::Uuid to produce bytes.

@@ -396,146 +391,3 @@ fn debug_print() {
assert_eq!(format!("{:?}", doc), normal_print);
assert_eq!(format!("{:#?}", doc), pretty_print);
}

#[cfg(feature = "uuid-0_8")]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these were moved to src/uuid/test.rs

src/uuid/mod.rs Outdated
#[cfg(feature = "uuid-0_8")]
#[cfg_attr(docsrs, doc(cfg(feature = "uuid-0_8")))]
impl Uuid {
/// Convert this [`Uuid`] to a [`uuid::Uuid`] from the [`uuid`](https://docs.rs/uuid/latest) crate.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured it would be confusing to call this from_uuid, so I just appended the version number to the end to disambiguate. This will also be useful for when uuid releases another minor version.

where
D: serde::Deserializer<'de>,
{
match deserializer.deserialize_newtype_struct(UUID_NEWTYPE_NAME, BsonVisitor)? {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deserialization works by signaling to the deserializer, again through the newtype name, that we're looking for a UUID. The BSON deserializers ensure that a binary subtype 4 is present in the data and then pass it on. Non-BSON deserializers will just attempt to deserialize a Bson normally, and from there we deserialize a uuid::Uuid manually.

}

#[test]
fn test_binary_constructors() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test and below were moved, the above ones are new.

@patrickfreed patrickfreed marked this pull request as ready for review October 19, 2021 23:21
Copy link
Contributor

@abr-egn abr-egn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Minor comment nit only.

src/uuid/mod.rs Outdated
//! # use serde::{Serialize, Deserialize};
//! # #[derive(Serialize, Deserialize)]
//! # struct Foo {
//! # /// serializes as a String or subtype 0 BSON binary, depending
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments are somewhat confusing to include in this example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, those were copy/paste leftovers, removed.

Copy link
Contributor Author

@patrickfreed patrickfreed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One issue I noticed with the current approach to deserialization here is that our custom deserialization logic will be skipped if the Uuid is part of a #[serde(flatten)] chain (see serde-rs/serde#2106). This means that Uuids will erroneously allow deserialization from Binary values with subtype 0 in those cases. The driver currently deserializes all responses using #[serde(flatten)], so in practice this means this behavior will affect most users.

We could fix this issue by using the extended JSON format like we do for ObjectId and DateTime instead of the newtype pattern, but I think using uuid::Uuid's serde implementation in non-BSON contexts is a valuable enough of a feature to warrant living with slightly too permissive deserialization in BSON. Note that this issue doesn't affect serialization, only deserialization. Thoughts?

src/uuid/mod.rs Outdated
//! # use serde::{Serialize, Deserialize};
//! # #[derive(Serialize, Deserialize)]
//! # struct Foo {
//! # /// serializes as a String or subtype 0 BSON binary, depending
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, those were copy/paste leftovers, removed.

@abr-egn
Copy link
Contributor

abr-egn commented Oct 21, 2021

One issue I noticed with the current approach to deserialization here is that our custom deserialization logic will be skipped if the Uuid is part of a #[serde(flatten)] chain (see serde-rs/serde#2106). This means that Uuids will erroneously allow deserialization from Binary values with subtype 0 in those cases. The driver currently deserializes all responses using #[serde(flatten)], so in practice this means this behavior will affect most users.

We could fix this issue by using the extended JSON format like we do for ObjectId and DateTime instead of the newtype pattern, but I think using uuid::Uuid's serde implementation in non-BSON contexts is a valuable enough of a feature to warrant living with slightly too permissive deserialization in BSON. Note that this issue doesn't affect serialization, only deserialization. Thoughts?

I agree, since it's permissive deserialization rather than broken serialization it seems fine to accept as trade-off for the other utility here. Might want to document that it's accidental and unsupported so that if serde fixes the bug it (hopefully) doesn't break user code.

src/uuid/mod.rs Outdated
#[cfg(feature = "uuid-0_8")]
#[cfg_attr(docsrs, doc(cfg(feature = "uuid-0_8")))]
impl Uuid {
/// Create a [`Uuid`] from a [`uuid::Uuid`](https://docs.rs/uuid/latest/uuid/struct.Uuid.html) from
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need these methods in the API in addition to the From impls?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While they are technically redundant, I think they're useful for visibility. This is also consitent with our to_chrono and from_chrono methods on bson::DateTime.

src/uuid/mod.rs Outdated
}

impl Binary {
/// Serializes a [Uuid](https://docs.rs/uuid/0.8.2/uuid/) into BSON [`Binary`] type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this link the uuid::Uuid type since it's returning a crate::Uuid? Ditto the other methods below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, fixed.

src/uuid/mod.rs Outdated Show resolved Hide resolved
@isabelatkinson
Copy link
Contributor

I agree, since it's permissive deserialization rather than broken serialization it seems fine to accept as trade-off for the other utility here. Might want to document that it's accidental and unsupported so that if serde fixes the bug it (hopefully) doesn't break user code.

Agreed

patrickfreed and others added 3 commits October 21, 2021 13:37
Co-authored-by: Isabel Atkinson <isabelatkinson@gmail.com>
@patrickfreed
Copy link
Contributor Author

Might want to document that it's accidental and unsupported so that if serde fixes the bug it (hopefully) doesn't break user code.

Good idea, done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants