Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[c++] Add support for boolean types in flexbuffers #4386

Merged
merged 8 commits into from
Aug 4, 2017

Conversation

rouzier
Copy link
Contributor

@rouzier rouzier commented Jul 20, 2017

No description provided.

@rouzier rouzier force-pushed the feature/flexbuffer_bool branch 2 times, most recently from e828a24 to b6ef1b5 Compare July 20, 2017 03:30
@aardappel
Copy link
Collaborator

Thanks!

Not sure how I feel about this, as so far we intended for booleans to be the same as ints, to keep things simple. I.e. FlexBuffers stores things as they are represented, not what they mean.

Then again, we already have an explicit null too, and maybe it would help interop with JSON to not lose "true" and "false".

Anyone have an opinion?

@rouzier
Copy link
Contributor Author

rouzier commented Jul 20, 2017

Thanks for considering my pull request.

The reason for the pull request is to have better compatibility with JSON and not lose "true" and "false".

@aardappel
Copy link
Collaborator

I agree that would be nice. Let me have a look..

@rouzier rouzier force-pushed the feature/flexbuffer_bool branch from b6ef1b5 to 02c6857 Compare July 21, 2017 15:23
Copy link
Collaborator

@aardappel aardappel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.. looks pretty good!

@@ -377,12 +379,44 @@ class Reference {
case TYPE_NULL: return 0;
case TYPE_STRING: return flatbuffers::StringToInt(AsString().c_str());
case TYPE_VECTOR: return static_cast<int64_t>(AsVector().size());
case TYPE_BOOL: return static_cast<int64_t>(AsBool());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just call ReadInt64 here?

default:
// Convert other things to int.
return 0;
}
}

bool AsBool() const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this behind all the AsInt functions?

default:
// Convert other things to int.
return 0;
}
}

bool AsBool() const {
if (type_ == TYPE_BOOL) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The body could be if (type_ == TYPE_STRING) { .. } else return AsInt64() != 0

Actually I'm not sure how I feel about this string special-case.. if the JSON contains "true" as a string, it be better to serialize it as a TYPE_BOOL, and then we wouldn't need this code here at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only issue I see with that you can lose semantic meaning.
A very contrived example.

{ 
  "question" : "Is flex awesome", 
  "answer"   : "true", 
  "choices"  : ["true", "false", "unsure"]
}

Someone would expect the type of answer to always be a string by converting a string during parsing to a boolean you that meaning.

So which direction would you want me to go?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(sorry for the late reply, somehow missed this)
Yeah, this is indeed tricky. I guess one could argue that if someone writes "true" instead of true then maybe that is what they want, or if they didn't, then the fact that they get costlier string storage is their own fault.
If we go that way though, you could also argue that a string should always be false, regardless of wether it contains "true", since a string is not a boolean. Or maybe, "" should be false and all other strings true, analogous to how it converts strings/vectors to int. That is already implied by return AsInt64() != 0. I guess I don't like this string special case.
If were to go with having the parser convert "true" to a boolean value, then ToString would get the correct string back, but AsString wouldn't. So yeah, maybe lets not go with that.
I'm kind of in favour of the simplest solution here, meaning no special case at all.

@@ -404,6 +438,7 @@ class Reference {
case TYPE_NULL: return 0;
case TYPE_STRING: return flatbuffers::StringToUInt(AsString().c_str());
case TYPE_VECTOR: return static_cast<uint64_t>(AsVector().size());
case TYPE_BOOL: return static_cast<uint64_t>(AsBool());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again just call ReadInt64(data_, parent_width_) directly.

@@ -431,6 +466,7 @@ class Reference {
case TYPE_NULL: return 0.0;
case TYPE_STRING: return strtod(AsString().c_str(), nullptr);
case TYPE_VECTOR: return static_cast<double>(AsVector().size());
case TYPE_BOOL: return static_cast<double>(AsBool());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReadInt64(data_, parent_width_)

@@ -1194,6 +1239,8 @@ class Builder FLATBUFFERS_FINAL_CLASS {

Value() : i_(0), type_(TYPE_NULL), min_bit_width_(BIT_WIDTH_8) {}

Value(bool b) : u_(b ? 1 : 0), type_(TYPE_BOOL), min_bit_width_(BIT_WIDTH_8) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just static_cast<int64_t>(b)

@@ -420,12 +420,13 @@ struct IDLOptions {

// This encapsulates where the parser is in the current source file.
struct ParserState {
ParserState() : cursor_(nullptr), line_(1), token_(-1) {}
ParserState() : cursor_(nullptr), line_(1), token_(-1), is_bool_(false) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a bit of a hack :)
The better way is to introduce a kTokenBooleanConstant, and to check the uses of kTokenIntegerConstant to also deal with this new token (in some cases, not all)

@@ -391,6 +392,7 @@ CheckedError Parser::Next() {
if (attribute_ == "true" || attribute_ == "false") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, in ParseFlexBufferValue, you should check for strings true/false

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed above, lets not do this.

tests/test.cpp Outdated
TEST_EQ(vec[0].AsInt64(), -100);
TEST_EQ_STR(vec[1].AsString().c_str(), "Fred");
TEST_EQ(vec[1].AsInt64(), 0); // Number parsing failed.
TEST_EQ(vec[2].AsDouble(), 4.0);
TEST_EQ(vec[2].AsString().IsTheEmptyString(), true); // Wrong Type.
TEST_EQ_STR(vec[2].AsString().c_str(), ""); // This still works though.
TEST_EQ_STR(vec[2].ToString().c_str(), "4.0"); // Or have it converted.
TEST_EQ(vec[3].IsBool(), true); // This is a boolean
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comment don't explain anything extra.. same below.

@rouzier rouzier force-pushed the feature/flexbuffer_bool branch 3 times, most recently from 8557909 to 2a7f493 Compare July 28, 2017 03:13
tests/test.cpp Outdated
slb.Double("foo", 100);
slb.Map("mymap", [&]() {
slb.String("foo", "Fred"); // Testing key and string reuse.
slb.String("sbool1", "true");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these tests can be removed if we're not supporting booleans in strings

@rouzier rouzier force-pushed the feature/flexbuffer_bool branch from 070183f to 5754ff0 Compare July 29, 2017 05:32
@rouzier
Copy link
Contributor Author

rouzier commented Aug 2, 2017

Updated everything let me know if there is anything else needed.

Copy link
Collaborator

@aardappel aardappel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good! just a little code simplification..

@@ -363,6 +365,13 @@ class Reference {
bool IsMap() const { return type_ == TYPE_MAP; }
bool IsBlob() const { return type_ == TYPE_BLOB; }

bool AsBool() const {
if (type_ == TYPE_BOOL) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return (type_ == TYPE_BOOL ? .. : ..) != 0

@@ -588,6 +603,13 @@ class Reference {
}
}

bool MutateBool(bool b) {
if (type_ == TYPE_BOOL) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return type_ == TYPE_BOOL && ..

@rouzier
Copy link
Contributor Author

rouzier commented Aug 4, 2017

Done!

@aardappel
Copy link
Collaborator

Thanks, nice improvement!

@aardappel aardappel merged commit a2b1bfc into google:master Aug 4, 2017
@ghost
Copy link

ghost commented Aug 10, 2017

@rouzier: @mzaks notes that this PR may not be complete if doesn't test a vector of bools.. he notes that additional changes in e.g. IsTypedVectorElementType were necessary.

@rouzier
Copy link
Contributor Author

rouzier commented Aug 10, 2017

That would also require to add a new type TYPE_VECTOR_BOOL.
Is that something that you think should be done for completeness.
If so I will create a pull request over the weekend.

@ghost
Copy link

ghost commented Aug 10, 2017

That is not strictly necessary (you can just use TYPE_VECTOR, but it would be nice to have also, yes, to not store all the type fields.

@rouzier
Copy link
Contributor Author

rouzier commented Aug 10, 2017

Since there is no TYPE_VECTOR_BOOL there is no need to change IsTypedVectorElementType.
Since IsTypedVectorElementType(flexbuffers::TYPE_BOOL) is false at the moment.

Unless I misunderstand the purpose of the function.

Which I assume is to determine if a type can be a turned into a typed vector.

@ghost
Copy link

ghost commented Aug 10, 2017

That is correct, it would only need to change if TYPE_VECTOR_BOOL were added.

@mzaks
Copy link
Contributor

mzaks commented Aug 10, 2017

Well this in not what I meant :)
TYPE_VECTOR_BOOL would be great but I would expect it to be a bitfield than.
What I mean is a typed vector which contains bool values which are internally stored as uint.
A bool is now stored as 1byte uint value.
If you would like to add a scalar vector of bools. Like with this method:
https://github.com/google/flatbuffers/blob/master/include/flatbuffers/flexbuffers.h#L1343
Before you would create a typed vector which contains a vector of uint8 values which you could also access with asBool.
Now this is impossible because of IsTypedVectorElementType & ToTypedVector and maybe a few more places.
Here is my commit where I introduced Bool type in FlexBuffers, maybe it helps:
mzaks/FlexBuffersSwift@509db06
So I would expect that I can still save a typed bool vector which is internally stored as uint8. And Have a complete different more bit packed version for TYPE_VECTOR_BOOL.
Well actually if TYPE_VECTOR_BOOL can be introduced than we can skip what I did in FlexBuffersSwift, but I guess a proper TYPE_VECTOR_BOOL is much more complex. So I went with an intermediate solution.
I guess for now you can just check if you can create a vector of bools like this: fbb.Vector(true, false, true); I could not in Swift and I needed to because I already use such construct in production.

@ghost
Copy link

ghost commented Aug 10, 2017

Ok, I'm a bit confused about what you're suggesting we do.

You're saying you want both a typed vector of bool (that's store as a uint8) and a bitfield?

Currently there's no typed vector of bool, so I am not sure what the problem is with IsTypedVectorElementType.

As for bitfields, I would not call them TYPE_VECTOR_BOOL since they will be entirely incompatible with other typed vector. They're better off as TYPE_BITFIELD or something. TYPE_VECTOR_BOOL could be used for a typed vector of non-bit bools, but having both may be overkill, especially since you can already use bools with TYPE_VECTOR if you want. So maybe adding TYPE_BITFIELD is sufficient.

Also, your asBool appears incorrect, it should return true for everything !=0, not nil.

@mzaks
Copy link
Contributor

mzaks commented Aug 10, 2017

Yeah sorry my answer was not really structured. I would suggest following:
Check if it was possible to create a scalar bool vector through an API like fbb.Vector(true, false); or fbb.Add({true, false}); here I am referring to
https://github.com/google/flatbuffers/blob/master/include/flatbuffers/flexbuffers.h#L1135
Not sure if I got the syntax correct, sorry.
If those API calls worked before they should work now because otherwise you are breaking the API for people who might used it.
This API broke for me in FlexBuffersSwift, maybe because I enforce the vector to be typed if user passes in array of booleans in Swift [Bool]. It might be that in the C++ implementation you will just fall back to untyped vector which than would be not ideal but good enough solution.
As it is defined here
https://github.com/google/flatbuffers/blob/master/include/flatbuffers/flexbuffers.h#L1260
Value holding boolean is a Value holding uint64_t meaning that TYPE_VECTOR_BOOL is equivalent to TYPE_VECTOR_UINT.

So I change isTypedVectorElement to make sure that a I get a typed vector if all elements in the vector are bools.

Now here is the part which confuses me my self when I am looking at it again :)
I did not introduce TYPE_VECTOR_BOOL I changed isTypedVectorElement to return true if self type is bool and I changed toTypedVector to return .vector_uintif self is bool. It works as all my tests are passing, but now I am a bit paranoid that I broke / am missing something.

@mzaks
Copy link
Contributor

mzaks commented Aug 10, 2017

Looking at FlexBuffersTest just add a vector of bools to the example, like following

{ vec: [ -100, "Fred", 4.0, false ], bar: [ 1, 2, 3 ], bar3: [ 1, 2, 3 ], foo: 100, bool: true, bools: [true, false, true], mymap: { foo: "Fred" } }

See if it decodes from JSON and than encodes to proper JSON again.
I guess it should be fine.

@ghost
Copy link

ghost commented Aug 11, 2017

fbb.Vector with a std::vector would currently fail, because the code assumes all scalars are available as a typed vector, which bools currently are not. So this would be a good reason to add TYPE_VECTOR_BOOL with uint8_t (not bit) elements, for consistency.

I don't think vectors of bools should become typed vectors of uint if we have an actual bool type. We should definitely have a typed version, since paying 2 bytes per bool would be a bit wasteful.

So my recommendation, for now, is to fix things by adding a TYPE_VECTOR_BOOL. We can add a TYPE_BITFIELD additionally later, but that can/should be in its own PR.

@mzaks
Copy link
Contributor

mzaks commented Aug 13, 2017

Cool. Will introduce TYPE_VECTOR_BOOL in FlexBuffersSwift, when the PR (#4410) is through.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants