Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve compression support: #3419

Closed
wants to merge 4 commits into from
Closed

Improve compression support: #3419

wants to merge 4 commits into from

Conversation

nbougalis
Copy link
Contributor

  • Optimize parsing of compressed message headers
  • Enforce protocol-defined message size maxima
  • Update comments

Copy link
Collaborator

@seelabs seelabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few nits, the formatting issues need to be resolved before this is merged, but since all the other issues were nits I'm approving now. 👍

if (size < hdr.header_size)
return {};
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer a style that checks for errors and returns early as opposed for checking for success. That style tends to reduce nesting indentation levels. Most of the code in this function is written this way, but this block doesn't do that. It's a minor nit, so no change needed, but since I don't have many comments in this PR I thought I'd point it out.
I.e

    if ((*iter & 0xFC) != 0)
    {
        ec = make_error_code(boost::system::errc::no_message);
        return boost::none;
    }
    hdr.header_size = headerBytes;
   // ect.

src/ripple/overlay/impl/PeerImp.h Outdated Show resolved Hide resolved
src/ripple/overlay/impl/ProtocolMessage.h Outdated Show resolved Hide resolved
@@ -84,21 +84,14 @@ class compression_test : public beast::unit_test::suite
std::shared_ptr<T> proto,
protocol::MessageType mt,
uint16_t nbuffers,
const char* msg,
bool log = false)
std::string msg)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const&

@seelabs
Copy link
Collaborator

seelabs commented Jun 1, 2020

These changes are non-trivial enough that we should probably assign another reviewer as well.

enum class Algorithm : std::uint8_t { None = 0x00, LZ4 = 0x01 };
// All values other than 'none' must have the high bit. The low order four bits
// must be 0.
enum class Algorithm : std::uint8_t { None = 0x00, LZ4 = 0x90 };
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the change from 0x01 to 0x90?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new framing for compressed messages alters the first byte of the header, and specifically the first four bits: If the high bit is set, it means the message is compressed. The remaining three bits specify the compression algorithm used.

0x90 is 1001000 in binary: high bit set (compression enabled) , low bit of high nibble set (LZ4).

While reading your comment, it occurred to me to check when we set this value, and I noticed that the code there is wrong. It does:

*h |= 0x80 | (static_cast<uint8_t>(comprAlgorithm) << 4);

Obviously not the right thing, this should now be: *h |= static_cast<std::uint8_t>(comprAlgorithm); instead. For the record, this was a bug that I introduced into this PR and not @gregtatcam.

I'll push a [FOLD] commit to address this.

Thanks @MarkusTeufelberger! Sharp eyes!

gregtatcam and others added 3 commits June 4, 2020 12:14
* Optimize parsing of compressed message headers
* Enforce protocol-defined message size maxima
* Update comments
src/ripple/overlay/impl/Message.cpp Outdated Show resolved Hide resolved
src/ripple/overlay/impl/Message.cpp Show resolved Hide resolved
src/ripple/overlay/impl/ProtocolMessage.h Show resolved Hide resolved
src/ripple/overlay/impl/Message.cpp Show resolved Hide resolved
Copy link
Contributor

@cjcobb23 cjcobb23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not going to click Request Changes, because I am going on PTO and I don't want to block the merge. But I strongly suggest adding a layer of abstraction for all of the bitwise operations in parseMessageHeader, per my comments.
Sorry for leaving two separate reviews. My comments were originally part of a resolved conversation, so I moved them.

* @param ec contains more information:
* - set to `errc::success` if not enough bytes were present
* - set to `errc::no_message` if a valid header was not present
*/
template <class BufferSequence>
boost::optional<MessageHeader>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change this to std::optional? I think we are trying to migrate away from boost::optional

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can. Shouldn't be a problem.

template <class BufferSequence>
boost::optional<MessageHeader>
parseMessageHeader(BufferSequence const& bufs, std::size_t size)
parseMessageHeader(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of putting the bitwise comparison inside of a named lambda, where the name describes what is being checked? I don't really like bitwise operations in a conditional directly, because it doesn't read that easily. It always takes me a second to figure out whats going on, and I think sometimes subtle bugs can occur. Remember this: #3417? At the least, I think we should add a comment saying exactly what is being checked.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, after reading through the whole function, I think we should abstract away these bitwise operations to a class, or a bunch of helper functions. Bitwise operations are pretty low level, and I think it's hard to read the code and easy to mess up. We could compromise on more judicious use of comments (I would put a clear comment before every bitwise operation), but I'd really prefer named functions; it's pretty easy for a developer to change the code and forget to update the comment, but it's harder to change the function implementation and forget to update the function name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hear what you're saying, but I think that adding additional lambas would only add confusion. Better comments would probably help; I'll try and follow-up.

@carlhua carlhua requested a review from cjcobb23 June 22, 2020 14:32
@cjcobb23
Copy link
Contributor

Looks good to me

@carlhua carlhua added the Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required. label Jun 24, 2020
@nbougalis nbougalis deleted the msghdr branch October 16, 2023 06:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants