Add option to parse single objects at a time #196

Open
wants to merge 22 commits into
base: master

dominickpastore

This pull request is built on top of PR #194 and PR #195 (thus the commits for those branches are showing as well). If those PRs are not accepted but there is interest in this one, I can rebase this branch onto master.

This PR allows parsing one object at a time:

Normally, jsmn parses the entire input string. If multiple objects are present, it parses all of them consecutively into the tokens array. If the last object is incomplete, jsmn returns JSMN_ERROR_PART, even if one or more complete objects precede it. This makes parsing a stream of JSON objects difficult, because the input reader must ensure the input buffer passed to jsmn ends on an object boundary.

This PR adds a new macro, JSMN_SINGLE, to solve this. When it is defined, jsmn parses only one object at a time. Once it has parsed a complete object, it returns immediately, ignoring the rest of the input string. The parser state is reinitialized, so to parse the next object, simply advance the input buffer pointer by tokens[0].end and call jsmn_parse() again.
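
The advance-and-reparse loop described above can be sketched as follows. This is only an illustration: `scan_one()` and `consume_stream()` are hypothetical names, and `scan_one()` is a stand-in for `jsmn_parse()` built with JSMN_SINGLE, not jsmn itself. It only tracks bracket depth for objects and arrays, does no validation, and returns the offset that `tokens[0].end` would hold.

```c
#include <assert.h>
#include <stddef.h>

#define SCAN_PART (-1) /* plays the role of JSMN_ERROR_PART */

/* Stand-in for a single-object parse: returns the offset just past the first
 * complete object or array in buf[0..len), or SCAN_PART if none is complete. */
long scan_one(const char *buf, size_t len) {
    assert(buf != NULL);
    int depth = 0;
    int in_string = 0;
    for (size_t i = 0; i < len; i++) {
        char c = buf[i];
        if (in_string) {
            if (c == '\\') {
                i++; /* skip the escaped character */
            } else if (c == '"') {
                in_string = 0;
            }
        } else if (c == '"') {
            in_string = 1;
        } else if (c == '{' || c == '[') {
            depth++;
        } else if (c == '}' || c == ']') {
            if (--depth == 0) {
                return (long)(i + 1);
            }
        }
    }
    return SCAN_PART;
}

/* Parses objects one at a time, advancing past each complete one.
 * Returns the number of bytes consumed; *count receives the object count. */
size_t consume_stream(const char *buf, size_t len, size_t *count) {
    size_t pos = 0;
    *count = 0;
    for (;;) {
        long end = scan_one(buf + pos, len - pos);
        if (end == SCAN_PART) {
            break; /* trailing object is incomplete: wait for more input */
        }
        pos += (size_t)end; /* advance by what tokens[0].end would report */
        (*count)++;
    }
    return pos;
}
```

A stream reader would presumably keep the bytes from the returned offset onward, append newly read input, and call the loop again.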

Ensure primitives are "true", "false", "null", or an RFC 8259 compliant
number. (Still need to add test cases.)
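
As a rough illustration of what this check involves, here is a minimal sketch of RFC 8259 primitive validation. The function names are hypothetical and this is not the code from the commit; it simply follows the grammar `number = [ "-" ] int [ frac ] [ exp ]`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool is_digit(char c) { return c >= '0' && c <= '9'; }

/* Returns true if s[0..len) matches the RFC 8259 number grammar. */
bool is_rfc8259_number(const char *s, size_t len) {
    assert(s != NULL);
    size_t i = 0;
    if (i < len && s[i] == '-') i++;
    /* int: a single "0", or a nonzero digit followed by more digits */
    if (i >= len || !is_digit(s[i])) return false;
    if (s[i] == '0') {
        i++;
    } else {
        while (i < len && is_digit(s[i])) i++;
    }
    /* frac: "." followed by at least one digit */
    if (i < len && s[i] == '.') {
        i++;
        if (i >= len || !is_digit(s[i])) return false;
        while (i < len && is_digit(s[i])) i++;
    }
    /* exp: "e" or "E", optional sign, at least one digit */
    if (i < len && (s[i] == 'e' || s[i] == 'E')) {
        i++;
        if (i < len && (s[i] == '+' || s[i] == '-')) i++;
        if (i >= len || !is_digit(s[i])) return false;
        while (i < len && is_digit(s[i])) i++;
    }
    return i == len; /* no trailing characters allowed */
}

/* Returns true for "true", "false", "null", or an RFC 8259 number. */
bool is_strict_primitive(const char *s, size_t len) {
    assert(s != NULL);
    if (len == 4 && (memcmp(s, "true", 4) == 0 || memcmp(s, "null", 4) == 0)) {
        return true;
    }
    if (len == 5 && memcmp(s, "false", 5) == 0) {
        return true;
    }
    return is_rfc8259_number(s, len);
}
```

Under this grammar, inputs such as `01`, `1.`, `.5`, and `+1` are rejected.
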
String parsing previously did not differ between strict and non-strict
modes, and was not fully compliant with RFC 8259 in either. RFC 8259
requires that control characters (code points < 0x20) be escaped. This
is now enforced in strict mode. In addition, non-strict mode now
performs *no* validation of string contents, much like primitives in
non-strict mode.
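
For illustration, a strict-mode string check along these lines might look like the sketch below. `is_strict_string_body()` is a hypothetical helper operating on the bytes between the quotes, not the function used in this branch; it rejects raw control characters and any escape sequence outside the RFC 8259 set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool is_hex(char c) {
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') ||
           (c >= 'A' && c <= 'F');
}

/* Returns true if s[0..len), the text between the quotes, is a valid
 * RFC 8259 string body: no raw control characters, no unescaped quote,
 * and only the escapes \" \\ \/ \b \f \n \r \t \uXXXX. */
bool is_strict_string_body(const char *s, size_t len) {
    assert(s != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        if (c < 0x20 || c == '"') return false;
        if (c != '\\') continue;
        if (++i >= len) return false; /* backslash at end of input */
        switch (s[i]) {
        case '"': case '\\': case '/': case 'b':
        case 'f': case 'n': case 'r': case 't':
            break;
        case 'u':
            if (len - i < 5) return false; /* need 4 hex digits after 'u' */
            for (int k = 1; k <= 4; k++) {
                if (!is_hex(s[i + k])) return false;
            }
            i += 4;
            break;
        default:
            return false; /* unknown escape sequence */
        }
    }
    return true;
}
```
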
@dominickpastore
Author

Apologies for the history rewrite. I rebased onto the latest changes from the pull requests this was built on top of.

Parent links and strict parsing are now the default behavior. New macros
JSMN_LOW_MEMORY and JSMN_NON_STRICT disable these behaviors.

JSMN_PARENT_LINKS still exists, but is defined by default unless
JSMN_LOW_MEMORY is defined.

JSMN_STRICT no longer exists. Instead, we have three new macros:

JSMN_PERMISSIVE_PRIMITIVES - Relaxes validation of primitives. Any
characters except whitespace and {}[],:" are allowed. (Normally, only
"true", "false", "null", and RFC 8259 numbers are permitted.)

JSMN_PERMISSIVE_STRINGS - Relaxes validation of strings. Any characters
are allowed. (Normally, control characters (<0x20) and invalid escape
sequences are forbidden.)

JSMN_PRIMITIVE_KEYS - Allows primitives to be used as object keys.

These can be defined individually, or all at once by defining
JSMN_NON_STRICT.
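
One way the macro relationships above could be wired up in a header is sketched below. This is an assumption about the structure, not the actual contents of jsmn.h in this branch. The `cfg_*` functions are hypothetical and exist only to make the resulting configuration observable, and the leading `#define` simulates building with `-DJSMN_NON_STRICT`.

```c
#include <assert.h>

/* Simulates passing -DJSMN_NON_STRICT on the compiler command line. */
#define JSMN_NON_STRICT

/* JSMN_NON_STRICT turns on all three relaxations. */
#ifdef JSMN_NON_STRICT
#  ifndef JSMN_PERMISSIVE_PRIMITIVES
#    define JSMN_PERMISSIVE_PRIMITIVES
#  endif
#  ifndef JSMN_PERMISSIVE_STRINGS
#    define JSMN_PERMISSIVE_STRINGS
#  endif
#  ifndef JSMN_PRIMITIVE_KEYS
#    define JSMN_PRIMITIVE_KEYS
#  endif
#endif

/* Parent links are on by default unless JSMN_LOW_MEMORY is defined. */
#if !defined(JSMN_LOW_MEMORY) && !defined(JSMN_PARENT_LINKS)
#  define JSMN_PARENT_LINKS
#endif

/* Hypothetical helpers that report the effective configuration. */
int cfg_permissive_primitives(void) {
#ifdef JSMN_PERMISSIVE_PRIMITIVES
    return 1;
#else
    return 0;
#endif
}

int cfg_permissive_strings(void) {
#ifdef JSMN_PERMISSIVE_STRINGS
    return 1;
#else
    return 0;
#endif
}

int cfg_primitive_keys(void) {
#ifdef JSMN_PRIMITIVE_KEYS
    return 1;
#else
    return 0;
#endif
}

int cfg_parent_links(void) {
#ifdef JSMN_PARENT_LINKS
    return 1;
#else
    return 0;
#endif
}
```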

Tests have not yet been adapted for these changes.
Previously, jsmn parsed all input provided, parsing multiple objects if
present. If the last object was incomplete, it returned
JSMN_ERROR_PART, even if there was at least one complete object before
it. This made it difficult to parse streams of objects: the input
reader had to ensure the input buffer ended on an object boundary.

The JSMN_SINGLE macro provides a solution to this by configuring jsmn to
parse objects one at a time. As soon as a complete object is parsed,
jsmn returns, ignoring the rest of the input. The parser state will be
reinitialized, so to parse the next object, simply advance the input
buffer pointer ahead by tokens[0].end characters and call jsmn_parse()
again.