src/cpp-common: add bt2c::parseJson() functions (listener mode)
This patch adds the bt2c::parseJson() functions in `parse-json.hpp`.

Those functions wrap the file-internal `bt2c::internal::JsonParser`
class, an instance of which parses a single JSON value, calling
specific methods of a JSON event listener as it goes. Internally,
`bt2c::internal::JsonParser` uses a string scanner (`bt2c::StrScanner`).

In searching for a simple JSON parsing solution, I could not find, as of
this date, any project which satisfies the following requirements out of
the box:

• Is well known, well documented, and well tested.

• Has an MIT-compatible license.

• Parses both unsigned and signed 64-bit integers (range
  -9,223,372,036,854,775,808 to 18,446,744,073,709,551,615).

• Provides an exact text location (offset, line number, column number)
  on parsing error (through logging and the message of an error cause).

• Provides an exact text location (offset, line number, column number)
  for each parsed value.
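
To illustrate the integer range requirement above, here is a minimal
sketch of how an integer token could be classified as signed, unsigned,
or out of range. The names `ParsedInt` and `parseJsonInt()` are
hypothetical and not part of `bt2c`; reporting non-negative values as
unsigned is also an assumption of this sketch, not a statement about
what `bt2c::StrScanner` does.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

static_assert(sizeof(long long) == 8, "this sketch assumes a 64-bit `long long`");

// Hypothetical result type (not part of bt2c).
struct ParsedInt
{
    enum class Kind { Signed, Unsigned, OutOfRange };

    Kind kind;
    std::int64_t sVal;  // meaningful when `kind` is `Kind::Signed`
    std::uint64_t uVal; // meaningful when `kind` is `Kind::Unsigned`
};

// Hypothetical helper: classifies a JSON integer token.
//
// Assumes `str` is already a syntactically valid JSON integer
// (an optional `-` followed by decimal digits).
inline ParsedInt parseJsonInt(const std::string& str)
{
    assert(!str.empty());
    errno = 0;

    if (str[0] == '-') {
        // Negative: must fit a signed 64-bit integer.
        const long long val = std::strtoll(str.c_str(), nullptr, 10);

        if (errno == ERANGE) {
            return ParsedInt {ParsedInt::Kind::OutOfRange, 0, 0};
        }

        return ParsedInt {ParsedInt::Kind::Signed, static_cast<std::int64_t>(val), 0};
    }

    // Non-negative: must fit an unsigned 64-bit integer.
    const unsigned long long val = std::strtoull(str.c_str(), nullptr, 10);

    if (errno == ERANGE) {
        return ParsedInt {ParsedInt::Kind::OutOfRange, 0, 0};
    }

    return ParsedInt {ParsedInt::Kind::Unsigned, 0, static_cast<std::uint64_t>(val)};
}
```

With this sketch, `18446744073709551615` and `-9223372036854775808` are
accepted, whereas `18446744073709551616` and `-9223372036854775809` take
the out-of-range branch.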

I believe the text locations are essential as this JSON parser will be
used to decode CTF2‑SPEC‑2.0 [1] metadata streams: because Babeltrace 2
will be a reference implementation of CTF 2, it makes sense to make an
effort to pinpoint the exact location of syntactic and semantic errors.
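
As a rough sketch of what such a text location holds, the following
derives a line number and a column number from a byte offset. The `Loc`
structure and the `locAtOffset()` function are hypothetical; the actual
`bt2c::TextLoc` class may differ, and the zero-based, byte-based
counting is an assumption of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical location type (the real `bt2c::TextLoc` may differ).
struct Loc
{
    std::size_t offset; // bytes from the beginning of the text
    std::size_t lineNo; // zero-based line number
    std::size_t colNo;  // zero-based column number, in bytes
};

// Hypothetical helper: returns the location of the byte at `offset`
// within `text`, counting `\n` as the only line terminator.
inline Loc locAtOffset(const std::string& text, const std::size_t offset)
{
    assert(offset <= text.size());

    Loc loc {offset, 0, 0};

    for (std::size_t i = 0; i < offset; ++i) {
        if (text[i] == '\n') {
            // New line: column restarts at zero.
            ++loc.lineNo;
            loc.colNo = 0;
        } else {
            ++loc.colNo;
        }
    }

    return loc;
}
```

A scanner would typically keep such counters up to date as it advances
instead of rescanning from the beginning; the loop above only shows the
relationship between the three fields.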

More specifically:

• JSON for Modern C++ (by Niels Lohmann) [2] doesn't support text
  location access, although there's a pending pull request (draft as of
  this date) to add such support [3].

• The exceptions of JsonCpp [4] don't contain a text location, only
  a message.

• SimpleJSON [5] doesn't offer text location access and seems to be an
  archived project.

• RapidJSON [6] doesn't offer text location access.

• yajl [7] could offer some form of text location access (offset, at
  least) with yajl_get_bytes_consumed(), remembering the last offset on
  our side, although I don't know how nicely it would play
  with whitespace.

  That being said, regarding integers, the `yajl_callbacks`
  structure [8] only contains a `yajl_integer` function pointer which
  receives a `long long` value (no direct 64-bit unsigned integer
  support). It's possible to set the `yajl_number` callback for any
  number, but the `yajl_double` callback gets disabled in that case, and
  the callback receives a string which needs further parsing on our
  side: this is pretty much what's implemented in `bt2c::StrScanner`
  anyway.

At this point I stopped searching as I already had a working and tested
string scanner and, as you can see, without comments, `parse-json.hpp`
is only 231 lines of effective code and satisfies all the
requirements above.

You can test bt2c::parseJson() with a simple program like this:

    #include <iostream>
    #include <string>

    #include "parse-json.hpp"

    struct Printer
    {
        void onNull(const bt2c::TextLoc&)
        {
            std::cout << "null\n";
        }

        template <typename ValT>
        void onScalarVal(const ValT& val, const bt2c::TextLoc&)
        {
            std::cout << val << '\n';
        }

        void onArrayBegin(const bt2c::TextLoc&)
        {
            std::cout << "[\n";
        }

        void onArrayEnd(const bt2c::TextLoc&)
        {
            std::cout << "]\n";
        }

        void onObjBegin(const bt2c::TextLoc&)
        {
            std::cout << "{\n";
        }

        void onObjKey(const std::string& key,
                      const bt2c::TextLoc&)
        {
            std::cout << key << ": ";
        }

        void onObjEnd(const bt2c::TextLoc&)
        {
            std::cout << "}\n";
        }
    };

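    int main(const int argc, const char * const * const argv)
    {
        // Require the JSON text as the first argument.
        if (argc < 2) {
            std::cerr << "Usage: " << argv[0] << " JSON\n";
            return 1;
        }

        Printer printer;

        bt2c::parseJson(argv[1], printer);
    }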

Then:

    $ ./test-parse-json 23
    $ ./test-parse-json '"\u03c9 represents angular velocity"'
    $ ./test-parse-json '{"salut": [23, true, 42.4e-9, {"meow": null}]}'
    $ ./test-parse-json 18446744073709551615
    $ ./test-parse-json -9223372036854775808

Also try some parsing errors:

    $ ./test-parse-json '{"salut": [false, 42.4e-9, "meow": null}]}'
    $ ./test-parse-json 18446744073709551616
    $ ./test-parse-json -9223372036854775809
    $ ./test-parse-json '"invalid \u8dkf codepoint"'

[1]: https://diamon.org/ctf/CTF2-SPEC-2.0.html
[2]: https://github.com/nlohmann/json
[3]: https://github.com/nlohmann/json/pull/3165
[4]: https://github.com/open-source-parsers/jsoncpp
[5]: https://github.com/nbsdx/SimpleJSON
[6]: https://rapidjson.org/
[7]: https://github.com/lloyd/yajl
[8]: https://lloyd.github.io/yajl/yajl-2.1.0/structyajl__callbacks.html

Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Change-Id: Id32c2b64723ca50b044369c424fe046c0a183cce
Reviewed-on: https://review.lttng.org/c/babeltrace/+/7411
eepp committed Jul 11, 2024
1 parent 443f12b commit 1a8508d
Showing 2 changed files with 472 additions and 0 deletions.
1 change: 1 addition & 0 deletions src/Makefile.am
@@ -171,6 +171,7 @@ cpp_common_libcpp_common_la_SOURCES = \
  cpp-common/bt2c/libc-up.hpp \
  cpp-common/bt2c/logging.hpp \
  cpp-common/bt2c/make-span.hpp \
+ cpp-common/bt2c/parse-json.hpp \
  cpp-common/bt2c/prio-heap.hpp \
  cpp-common/bt2c/read-fixed-len-int.hpp \
  cpp-common/bt2c/regex.hpp \