-
Notifications
You must be signed in to change notification settings - Fork 167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RUST-1553 Add support for document sequences (OP_MSG
payload type 1)
#1009
RUST-1553 Add support for document sequences (OP_MSG
payload type 1)
#1009
Conversation
|
||
let cmd_name = cmd.name.clone(); | ||
let target_db = cmd.target_db.clone(); | ||
|
||
let serialized = op.serialize_command(cmd)?; | ||
#[allow(unused_mut)] | ||
let mut message = Message::from_command(cmd, Some(request_id))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I eliminated the RawCommand
layer between Command
and Message
here. RawCommand
contained a vec of bytes created by Operation::serialize_command
with the assumption that those bytes should be sent as a single payload type 0 section when converting to Message
: this no longer makes sense when commands could require multiple sections.
src/cmap/conn/wire/message.rs
Outdated
command: Command<T>, | ||
request_id: Option<i32>, | ||
) -> Result<Self> { | ||
let mut document_payload = bson::to_raw_document_buf(&command.body)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will re-serialize command bodies that were specified as RawDocumentBuf
s as RawDocumentBuf
s which isn't great, but all of those payloads should be fairly small since we're now using document sequences for big commands (i.e. inserts and bulk writes). There wasn't a great way to special-case raw docs here without overhead/code duplication, but we can certainly go in that direction if this turns out to be a perf hit (which I think is very unlikely).
Eventually I would like to remove T
from Command
entirely and use RawDocumentBuf
as the type for body
. In most places we're just using Document
or RawDocumentBuf
for T
anyway, and it should be fairly simple to replace the structs we use to model some commands with documents. I will file a ticket for this if it sounds good to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strongly agree, I think that would be a great simplification here. And also agreed that the double serialization is unlikely to be an issue in practice.
// OP_MSG payload type 0 | ||
pub(crate) document_payload: RawDocumentBuf, | ||
// OP_MSG payload type 1 | ||
pub(crate) document_sequences: Vec<DocumentSequence>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This design models the OP_MSG structure more closely (i.e. exactly one payload type 0 section, any number of payload type 1 sections). I did keep the MessageSection
enum for reading responses since sections can be specified in any order. AFAIK the server never actually returns anything as payload type 1 but it doesn't hurt to keep around that logic for future use.
@@ -151,7 +151,7 @@ pub struct InsertManyOptions { | |||
pub ordered: Option<bool>, | |||
|
|||
/// The write concern for the operation. | |||
#[serde(skip_deserializing)] | |||
#[serde(skip_deserializing, skip_serializing_if = "write_concern_is_empty")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a tangential change that came up when I was refactoring the Insert
op implementation. We currently have a macro to remove empty write concerns from options, but this strategy feels a little cleaner to me. I'll file a ticket to port everything over to this if you agree or I can revert if you prefer the macro approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good!
// enabled. | ||
const MAX_ENCRYPTED_WRITE_SIZE: u64 = 2_097_152; | ||
// The amount of overhead bytes to account for when building a document sequence. | ||
const COMMAND_OVERHEAD_SIZE: u64 = 16_000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some context here:
- A single BSON object (i.e. document) can be
maxBsonObjectSize
bytes, which is returned from the server (currently 16 MB). We previously used this number to cap the size of thedocuments
array for inserts without worrying about the additional info in the command sent to the server. This is because the server allows an extra 16 KB in a document for overhead (see here). - An OP_MSG can be
maxMessageSizeBytes
bytes total (also returned from the server, currently 48 MB without any overhead allowance). When building a document sequence (which is not constrained bymaxBsonObjectSize
) we can use this as an upper bound for how many bytes we can fit into the payload, but we still need to account for some overhead space for the command and OP_MSG metadata. I borrowed the 16 KB number from the server's overhead allocation and usemaxMessageSizeBytes - COMMAND_OVERHEAD_SIZE
to cap the size of a document sequence payload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! One clarification question only, no need for re-review.
src/cmap/conn/wire/message.rs
Outdated
command: Command<T>, | ||
request_id: Option<i32>, | ||
) -> Result<Self> { | ||
let mut document_payload = bson::to_raw_document_buf(&command.body)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strongly agree, I think that would be a great simplification here. And also agreed that the double serialization is unlikely to be an issue in practice.
src/cmap/conn/command.rs
Outdated
#[serde(skip)] | ||
pub(crate) name: String, | ||
|
||
#[serde(skip)] | ||
pub(crate) exhaust_allowed: bool, | ||
|
||
#[serde(flatten)] | ||
#[serde(skip)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT the new behavior (skip
and extend_raw_document_buf
in Message::from_command
) should be equivalent to flatten
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah you're right, reverted
@@ -151,7 +151,7 @@ pub struct InsertManyOptions { | |||
pub ordered: Option<bool>, | |||
|
|||
/// The write concern for the operation. | |||
#[serde(skip_deserializing)] | |||
#[serde(skip_deserializing, skip_serializing_if = "write_concern_is_empty")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good!
OP_MSG
payload type 1)
Adds support for building commands with document sequences (OP_MSG payload type 1) and updates our
Insert
operation to specify the list of documents provided as a document sequence. See the OP_MSG spec for more details.