Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix parsing of numbers within XML-encoded anyxml #2266

Merged

Conversation

jktjkt
Copy link
Contributor

@jktjkt jktjkt commented Jul 6, 2024

This was caught by the libyang-cpp test suite. On Windows/MSVC, an anyxml snippet that was parsed from this XML:

<ax xmlns="http://example.com/coze"><x>1</x><x>2</x><x>3</x></ax>

...would get printed like this:

{"example-schema:ax":{"x":["1","2","3"]}}

...while the correct form is:

{"example-schema:ax":{"x":[1,2,3]}}

This was caused by 21eaa39, a change in lydxml_get_hints_opaq which probably tried to silence a compiler warning. The code wrongly assumed that the C long will be able to hold UINT32_MAX, which is not the case on Windows where long has the same range as int32_t. As a result, when the XML is parsed, these numeric values are compared against -1 on Windows, and against 4294967295 on many other platforms, and therefore any positive number was tagged with the hint of "this needs a 64bit serialization" on Windows, which in turn triggers printing these values as quoted string in JSON.

The fix simply makes sure that we never perform a narrowing conversion from a constant. I was looking for a way of using explicitly sized types (i.e., int64_t in this case), but there's only strtoll in C. The alternative would be sscanf(... SCNd64 ...), but the semantics is subtly different when it comes to whitespace.

The bisecting process was non-trivial for me because the workflow which bypasses CI introduced a test regression (fixed by 7ebd607) and a temporary Windows breakage (fixed by a1ecc16). Instead of a simple git bisect, this required some manual patch backporting.

jktjkt added a commit to Telecominfraproject/oopt-gnpy-libyang that referenced this pull request Jul 6, 2024
...along with CESNET/libyang#2266 which fixes a regression on Windows.
jktjkt added a commit to Telecominfraproject/oopt-gnpy-libyang that referenced this pull request Jul 6, 2024
jktjkt added a commit to Telecominfraproject/oopt-gnpy-libyang that referenced this pull request Jul 7, 2024
jktjkt added a commit to Telecominfraproject/oopt-gnpy-libyang that referenced this pull request Jul 7, 2024
@michalvasko
Copy link
Member

Last devel CI run passed on WIN but this PR does not for some reason, any ideas?

@jktjkt
Copy link
Contributor Author

jktjkt commented Jul 16, 2024

That's breakage due to commit e55443f; I see that 420979e fixes that. Let me rerun the build.

@jktjkt
Copy link
Contributor Author

jktjkt commented Jul 16, 2024

...and I do not see a button to do that :(, maybe you as the project owner can either rebase this or somehow make the job unstuck? It's been listed as "queued" for me for 18 hours.

@michalvasko
Copy link
Member

Never mind that, macOS kept failing and now won't even start. I can restart the other checks, but I though I did already yesterday.

@jktjkt
Copy link
Contributor Author

jktjkt commented Jul 16, 2024

macos 11 won't build ever again as the images were removed, actions/runner-images#10097

@michalvasko
Copy link
Member

Right... not sure how the upgrade will go. As for the WIN checks, they still fail, no idea why.

This was caught by the libyang-cpp test suite. On Windows/MSVC, an
anyxml snippet that was parsed from this XML:

 <ax xmlns="http://example.com/coze"><x>1</x><x>2</x><x>3</x></ax>

...would get printed like this:

 {"example-schema:ax":{"x":["1","2","3"]}}

...while the correct form is:

 {"example-schema:ax":{"x":[1,2,3]}}

This was caused by a change in `lydxml_get_hints_opaq` which probably
tried to silence a compiler warning. The code wrongly assumed that the C
`long` will be able to hold `UINT32_MAX`, which is not the case on
Windows where `long` has the same range as `int32_t`. As a result, when
the XML is parsed, these numeric values are compared against -1 on
Windows, and against 4294967295 on many other platforms, and therefore
any positive number was tagged with the hint of "this needs a 64bit
serialization" on Windows, which in turn triggers printing these values
as quoted string in JSON.

The fix simply makes sure that we never perform a narrowing conversion
from a constant. I was looking for a way of using explicitly sized types
(i.e., int64_t in this case), but there's only `strtoll` in C. The
alternative would be `sscanf(... SCNd64 ...)`, but the semantics is
subtly different when it comes to whitespace.

The bisecting process was non-trivial for me because the workflow which
bypasses CI introduced a test regression (fixed by commit 7ebd607,
"test UPDATE message changed") and a temporary Windows breakage (fixed
by commit a1ecc16, "compat BUGFIX regression in win timegm support").
Instead of a simple `git bisect`, this required some manual patch
backporting.

Fixes: 21eaa39 libyang BUGFIX type size problems
@jktjkt jktjkt force-pushed the fix-windows-anyxml-naked-number-xml-parsing branch from 3839d4a to 47d2611 Compare July 16, 2024 09:31
@michalvasko michalvasko merged commit 863b18b into CESNET:devel Jul 16, 2024
13 checks passed
@jktjkt jktjkt deleted the fix-windows-anyxml-naked-number-xml-parsing branch July 16, 2024 10:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants