Support step definitions with multi-byte characters #224
json_spirit's escaping of multibyte characters creates bugs in the WireProtocol which prevent usage of valid UTF-8 encoded characters in step definitions relying on RegEx.
The new tests:
- /features/specific/wire_encoding.feature
- /tests/integration/WireProtocolTest.cpp
- /tests/unit/RegexTest.cpp
RFC @muggenhor & @paoloambrosio.
Thanks for the great contribution! This should fix #40.
On top of the other comments, can you remove the executable flag? All json_spirit
files went from 644 to 755.
This change updates json-spirit to the latest public version: https://www.codeproject.com/KB/recipes/JSON_Spirit/json_spirit_v4.08.zip 4.08 adds support for a raw_utf8 option when writing a JSON string. Previously, multibyte characters were being escaped when being sent from cucumber-cpp to cucumber-ruby. Because cucumber-ruby's wire decoder does not properly decode escaped character sequences, this would crash cucumber-ruby.
This modifies the WireResponseEncoder to always use the raw_utf8 option provided by the new version of json_spirit. According to the IETF RFC8259: "JSON text exchanged between systems that are not part of a closed ecosystem MUST be encoded using UTF-8 [RFC3629]." https://tools.ietf.org/html/rfc8259
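The difference between escaping multibyte characters and writing them as raw UTF-8 can be illustrated with a small sketch. This is not json_spirit's implementation and not the code in this branch: `toJsonStringRawUtf8` is a hypothetical helper that escapes only what RFC 8259 requires (quotation mark, backslash, and control characters below U+0020) and copies every other byte through unchanged, assuming the input is already valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper for illustration only (not part of json_spirit or
// cucumber-cpp). Encodes `text` as a JSON string literal and leaves
// multibyte UTF-8 sequences untouched. Assumes `text` is valid UTF-8.
std::string toJsonStringRawUtf8(const std::string& text) {
    std::string out = "\"";
    for (std::size_t i = 0; i < text.size(); ++i) {
        const unsigned char byte = static_cast<unsigned char>(text[i]);
        if (byte == '"' || byte == '\\') {
            // Quotation mark and backslash must always be escaped.
            out += '\\';
            out += static_cast<char>(byte);
        } else if (byte < 0x20) {
            // Control characters are written as \u00XX.
            char buffer[7];
            const int written = std::snprintf(buffer, sizeof(buffer), "\\u%04X",
                                              static_cast<unsigned>(byte));
            assert(written == 6);  // "\uXXXX" is always six characters
            (void)written;
            out += buffer;
        } else {
            // Everything else, including bytes >= 0x80 that belong to
            // multibyte UTF-8 sequences, is copied through as-is.
            out += static_cast<char>(byte);
        }
    }
    out += '"';
    return out;
}
```

With this approach the receiver sees the original UTF-8 bytes inside the JSON string and never has to decode `\u` escape sequences for non-ASCII text.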
@paoloambrosio I fixed up the commit updating json_spirit. This should now be resolved.
`cucumber-ruby` expects position values that are based on the index of the code point rather than the index of the code unit. This change modifies the value returned to `cucumber-ruby` accordingly. Prior to this change, the RegexSubMatch's position, which was correct in terms of a code unit array, would cause an `index out of string` error and crash `cucumber-ruby` when pretty-printing the results of a test. This commit also amends the added tests to demonstrate the corrected behavior.
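The code unit versus code point distinction described in this commit can be sketched as follows. This is a minimal illustration, not the code from this branch: `codePointIndex` is a hypothetical helper, and it assumes the input is valid UTF-8 and that the byte offset falls on a code point boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only (not taken from cucumber-cpp).
// Converts a byte (code unit) offset into a UTF-8 string to the index of the
// code point that starts at that offset.
// Assumes `text` is valid UTF-8 and `byteOffset` is on a code point boundary.
std::size_t codePointIndex(const std::string& text, std::size_t byteOffset) {
    assert(byteOffset <= text.size());
    std::size_t index = 0;
    for (std::size_t i = 0; i < byteOffset; ++i) {
        const unsigned char byte = static_cast<unsigned char>(text[i]);
        // Continuation bytes have the bit pattern 10xxxxxx; every other byte
        // starts a new code point, so only those are counted.
        if ((byte & 0xC0) != 0x80) {
            ++index;
        }
    }
    return index;
}
```

For a string consisting of `é` followed by `abc`, a match starting at byte offset 2 (the `a`) maps to code point index 1, because `é` occupies two bytes in UTF-8. For pure ASCII input the two indices are identical, which is why the problem only shows up with multibyte characters.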
Hi @paoloambrosio, I believe the change you requested has been addressed. Is there anything else you'd like changed before this is merged? Thank you!
Hi @paoloambrosio, I wanted to try and ping you one more time to check if you'd like any more changes to the branch prior to merging. Thanks!
Hi @muggenhor, @paoloambrosio, @konserw, this issue has come up again. Would you consider merging this PR?
I'm not maintaining this repo anymore. From a quick scan it looks good to me. I'll leave it to the new maintainer @jermus67 to review and merge.
Hi @src-ableton, Thanks for making your first contribution to Cucumber, and welcome to the Cucumber committers team! You can now push directly to this repo and all other repos under the cucumber organization! 🍾 In return for this generous offer we hope you will:
On behalf of the Cucumber core team,
@aslakhellesoy Thanks for the invitation to the organization! I was unable to accept, however, having been on vacation the last three weeks. Could you please re-send the invite?
@src-ableton done!
Summary
This branch adds support for handling characters outside of the Unicode block 'Basic Latin' by updating `json-spirit` to the latest version and modifying `cucumber-cpp` to properly handle this change.
Motivation and Context
This change adds support for passing raw UTF-8 strings from `cucumber-cpp` to `cucumber-ruby`, which is necessary when writing tests which use non-'Basic Latin' characters to validate application behavior.
This relates to and resolves #40.
How Has This Been Tested?
This branch adds three new tests (5fff48f).
These tests were run following `cucumber-cpp`'s own instructions for building and testing.
`WireProtocolTest` in addition to modifying the `WireResponseEncoder::encode`. The modified test, demonstrating correct behavior, passes.
Types of changes
Checklist: