Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added string check #142

Merged
merged 3 commits into from Aug 13, 2018
Merged

added string check #142

merged 3 commits into from Aug 13, 2018

Conversation

ghost
Copy link

@ghost ghost commented Aug 4, 2018

Change checking utf8 string to 1.5.3 UTF-8 encoded strings specification from empty implementation.

added string check
@redboltz redboltz self-requested a review August 4, 2018 22:22
Copy link
Owner

@redboltz redboltz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that thre original code doesn't call is_valid_length(), 'is_valid_contents()(removed by the PR), andvalidate_contents()(introduced by the PR). We need to add them. I think that it is better to merge the PR before those addition. However,validate_contents()` is pretty complicated due to UTF8 spec. So unit test is required.
Could you add tests?

I think that MQTT_USE_STR_CHECK could be defined in the test code.

validate_contents(std::string const& str) {
// This code is based on https://www.cl.cam.ac.uk/~mgk25/ucs/utf8_check.c
auto result = validation::well_formed;
#if !defined(MQTT_USE_STR_CK)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that CK is short form of CHECK. I considered CK, CHK, and CHECK. CHECK is better here.
Could you replace it?

auto result = validation::well_formed;
#if !defined(MQTT_USE_STR_CK)
static_cast<void>(str);
#else
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If #else exists, #if condition should be positive expression.
See https://github.com/redboltz/mqtt_cpp/blob/master/include/mqtt/endpoint.hpp#L4077

In addition, could you add comments at #else and #endif ?

auto s = reinterpret_cast<const unsigned char*>(str.c_str());
auto end = s + str.size();

while (s < end) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any case s > end? If there is no case s != end is enough.

#if !defined(MQTT_USE_STR_CK)
static_cast<void>(str);
#else
auto s = reinterpret_cast<const unsigned char*>(str.c_str());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you replace const unsigned char* with unsigned char const*?

if ((s[0] >= 0x01 && s[0] <= 0x1f) || s[0] == 0x7f) {
result = validation::well_formed_with_non_charactor;
}
s++;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you replace with ++s?


while (s < end) {
if (s[0] < 0x80) {
// 0xxxxxxxxx
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some comments that describe binary digit. But code is hex digit. You can use 0b10000000 if the code is clearer.

Added utf8 test code.
Change the macro to MQTT_USE_STR_CHECK from MQTT_USE_STR_CK .
Some bugfix and refactoring.
@redboltz redboltz merged commit 6c955db into redboltz:master Aug 13, 2018
@redboltz
Copy link
Owner

Thank you for updating, merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant