Use nlohmann::position_t instead of lexer for detailed position information when using a sax parser
barcode committed Dec 23, 2022
1 parent e90d005 commit db06dab
Showing 15 changed files with 330 additions and 144 deletions.
17 changes: 7 additions & 10 deletions docs/examples/sax_parse_with_src_location_in_json.cpp
@@ -9,14 +9,13 @@ using json = nlohmann::json;
// allows us to store metadata and add custom methods to each node
struct token_start_stop
{
nlohmann::detail::position_t start{};
nlohmann::detail::position_t stop{};
nlohmann::position_t start{};
nlohmann::position_t stop{};

std::string start_pos_str() const
{
return "{l=" + std::to_string(start.lines_read) + ":c="
//the lexer is already one char ahead (e.g. the opening { of an object )
+ std::to_string(start.chars_read_current_line - 1) + "}";
+ std::to_string(start.chars_read_current_line) + "}";
}
std::string stop_pos_str() const
{
@@ -68,16 +67,14 @@ class sax_with_token_start_stop_metadata
, start_stop{}
{}

template<class T1, class T2>
void next_token_start(const nlohmann::detail::lexer<T1, T2>& lex)
void next_token_start(const nlohmann::position_t& p)
{
start_stop.start = lex.get_position();
start_stop.start = p;
}

template<class T1, class T2>
void next_token_end(const nlohmann::detail::lexer<T1, T2>& lex)
void next_token_end(const nlohmann::position_t& p)
{
start_stop.stop = lex.get_position();
start_stop.stop = p;
}

bool null()
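To make the effect of this hunk easier to follow, here is a minimal, self-contained sketch of a handler that stores the positions it is given and formats them without the former `- 1` correction. The local `position_t` is only a stand-in that mirrors the three fields documented in this commit, and `recording_handler` and `pos_str` are hypothetical names; nothing here includes or calls the real library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for nlohmann::position_t (assumption: same three public fields).
struct position_t
{
    std::size_t chars_read_total;
    std::size_t chars_read_current_line;
    std::size_t lines_read;
};

// Same shape as the metadata struct in the example file.
struct token_start_stop
{
    position_t start{};
    position_t stop{};
};

// Formats a position as "{l=<line>:c=<column>}", using the value as passed in.
std::string pos_str(const position_t& p)
{
    return "{l=" + std::to_string(p.lines_read) + ":c="
           + std::to_string(p.chars_read_current_line) + "}";
}

// Hypothetical handler: only the two position callbacks are shown.
struct recording_handler
{
    token_start_stop start_stop{};

    void next_token_start(const position_t& p)
    {
        start_stop.start = p;
    }

    void next_token_end(const position_t& p)
    {
        // The end of an element cannot lie before its start.
        assert(start_stop.start.chars_read_total <= p.chars_read_total);
        start_stop.stop = p;
    }
};
```

Calling `next_token_start(position_t{10, 4, 1})` followed by `next_token_end(position_t{14, 8, 1})` leaves the handler with a start that formats as `{l=1:c=4}` and a stop that formats as `{l=1:c=8}`. Because the callbacks take a plain struct, the handler no longer needs to be a template over the lexer's type parameters.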
25 changes: 9 additions & 16 deletions docs/mkdocs/docs/api/json_sax/next_token_end.md
@@ -7,31 +7,23 @@ There are two possible signatures for this method:
```cpp
void next_token_end(std::size_t pos);
```
This version is called with the byte position after the next element ends. This version also works when parsing binary formats such as [msgpack](../basic_json/input_format_t.md).
This version is called with the byte position after the next element ends.
This version also works when parsing binary formats such as [msgpack](../basic_json/input_format_t.md).
2.
```cpp
template<class BasicJsonType, class InputAdapterType>
void next_token_end(const nlohmann::detail::lexer<BasicJsonType, InputAdapterType>& lex)
void next_token_end(const nlohmann::position_t& p)
```
This version is called with the lexer after the last character of the next element was parsed. The lexer can provide additional information about the current parse context. This version only available when calling `nlohmann::json::sax_parse` with `nlohmann::json::input_format_t::json` and takes precedence.

## Template parameters
1.
(none)
2.
`BasicJsonType`
: a specialization of `basic_json` used by the lexer. (Leave this as a template parameter)
`InputAdapterType`
: The input adapter used by the lexer. (Leave this as a template parameter)
This version is called with the [detailed parser position information](../position_t/index.md) after the last character of the next element was parsed.
This version is only available when calling `nlohmann::json::sax_parse` with `nlohmann::json::input_format_t::json` and takes precedence.

## Parameters
1.
`pos` (in)
: Byte position one after the next element's last byte.
2.
`lex` (in)
: Lexer after the last char of the next element was parsed.
`p` (in)
: [Detailed parser position information](../position_t/index.md) after the last char of the next element was parsed.

## Notes

@@ -57,7 +49,8 @@ It is recommended, but not required, to also implement [next_token_start](next_t

??? example

The example below shows a SAX parser using the second version of this method and storing the location information in each json node using a [base class](../basic_json/json_base_class_t.md) for `nlohmann::json` as customization point.
The example below shows a SAX parser using the second version of this method and
storing the location information in each json node using a [base class](../basic_json/json_base_class_t.md) for `nlohmann::json` as customization point.

```cpp
--8<-- "examples/sax_parse_with_src_location_in_json.cpp"
25 changes: 9 additions & 16 deletions docs/mkdocs/docs/api/json_sax/next_token_start.md
@@ -7,31 +7,23 @@ There are two possible signatures for this method:
```cpp
void next_token_start(std::size_t pos);
```
This version is called with the byte position where the next element starts. This version also works when parsing binary formats such as [msgpack](../basic_json/input_format_t.md).
This version is called with the byte position where the next element starts.
This version also works when parsing binary formats such as [msgpack](../basic_json/input_format_t.md).
2.
```cpp
template<class BasicJsonType, class InputAdapterType>
void next_token_start(const nlohmann::detail::lexer<BasicJsonType, InputAdapterType>& lex)
void next_token_start(const nlohmann::position_t& p)
```
This version is called with the lexer after the first character of the next element was parsed. The lexer can provide additional information about the current parse context. This version only available when calling `nlohmann::json::sax_parse` with `nlohmann::json::input_format_t::json` and takes precedence.

## Template parameters
1.
(none)
2.
`BasicJsonType`
: a specialization of `basic_json` used by the lexer. (Leave this as a template parameter)
`InputAdapterType`
: The input adapter used by the lexer. (Leave this as a template parameter)
This version is called with [detailed parser position information](../position_t/index.md).
This version is only available when calling `nlohmann::json::sax_parse` with `nlohmann::json::input_format_t::json` and takes precedence.

## Parameters
1.
`pos` (in)
: Byte position where the next element starts.
2.
`lex` (in)
: Lexer after the first char of the next element was parsed.
`p` (in)
: [Detailed parser position information](../position_t/index.md) after the first char of the next element was parsed.

## Notes

@@ -57,7 +49,8 @@ It is recommended, but not required, to also implement [next_token_end](next_tok

??? example

The example below shows a SAX parser using the second version of this method and storing the location information in each json node using a [base class](../basic_json/json_base_class_t.md) for `nlohmann::json` as customization point.
The example below shows a SAX parser using the second version of this method and
storing the location information in each json node using a [base class](../basic_json/json_base_class_t.md) for `nlohmann::json` as customization point.

```cpp
--8<-- "examples/sax_parse_with_src_location_in_json.cpp"
28 changes: 28 additions & 0 deletions docs/mkdocs/docs/api/position_t/chars_read_current_line.md
@@ -0,0 +1,28 @@
# <small>nlohmann::position_t::</small>chars_read_current_line

```cpp
std::size_t chars_read_current_line;
```

The number of characters read in the current line.

## Examples

??? example

The example below shows a SAX handler receiving the element bounds as `nlohmann::position_t` and
storing this location information in each json node using a [base class](../basic_json/json_base_class_t.md) for `nlohmann::json` as customization point.

```cpp
--8<-- "examples/sax_parse_with_src_location_in_json.cpp"
```

Output:

```json
--8<-- "examples/sax_parse_with_src_location_in_json.output"
```

## Version history

- Moved from namespace `nlohmann::detail` to `nlohmann` in version ???.???.???.
28 changes: 28 additions & 0 deletions docs/mkdocs/docs/api/position_t/chars_read_total.md
@@ -0,0 +1,28 @@
# <small>nlohmann::position_t::</small>chars_read_total

```cpp
std::size_t chars_read_total;
```

The total number of characters read.

## Examples

??? example

The example below shows a SAX handler receiving the element bounds as `nlohmann::position_t` and
storing this location information in each json node using a [base class](../basic_json/json_base_class_t.md) for `nlohmann::json` as customization point.

```cpp
--8<-- "examples/sax_parse_with_src_location_in_json.cpp"
```

Output:

```json
--8<-- "examples/sax_parse_with_src_location_in_json.output"
```

## Version history

- Moved from namespace `nlohmann::detail` to `nlohmann` in version ???.???.???.
23 changes: 23 additions & 0 deletions docs/mkdocs/docs/api/position_t/index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
# <small>nlohmann::</small>position_t

```cpp
struct position_t;
```

This type represents the parser's position when parsing a JSON string.
This position can be retrieved when using a [sax parser](../json_sax/index.md) with the format `nlohmann::json::input_format_t::json`
and implementing [next_token_start](../json_sax/next_token_start.md) or [next_token_end](../json_sax/next_token_end.md).

## Member functions

- [**operator size_t**](operator_size_t.md) - returns the value of [chars_read_total](chars_read_total.md).

## Member variables

- [**chars_read_total**](chars_read_total.md) - The total number of characters read.
- [**lines_read**](lines_read.md) - The number of lines read.
- [**chars_read_current_line**](chars_read_current_line.md) - The number of characters read in the current line.
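The relationship between the three counters can be sketched as follows. This is an illustration under an assumption, not lexer code: it supposes that every consumed character increments both character counters and that a newline increments `lines_read` and resets `chars_read_current_line`. The local `position_t`, `advance`, and `position_after` are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for nlohmann::position_t (assumption: same three public fields).
struct position_t
{
    std::size_t chars_read_total;
    std::size_t chars_read_current_line;
    std::size_t lines_read;
};

// Assumed bookkeeping for one consumed character.
position_t advance(position_t p, char c)
{
    ++p.chars_read_total;
    ++p.chars_read_current_line;
    if (c == '\n')
    {
        // A newline starts a new line, so the per-line counter restarts.
        ++p.lines_read;
        p.chars_read_current_line = 0;
    }
    return p;
}

// Position after consuming the first n characters of text.
position_t position_after(const std::string& text, std::size_t n)
{
    assert(n <= text.size());
    position_t p{0, 0, 0};
    for (std::size_t i = 0; i < n; ++i)
    {
        p = advance(p, text[i]);
    }
    return p;
}
```

Under that assumption, consuming the first five characters of `"{\n  \"a\": 1\n}"` (the brace, the newline, two spaces, and the opening quote) gives `chars_read_total == 5`, `lines_read == 1`, and `chars_read_current_line == 3`.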

## Version history

- Moved from namespace `nlohmann::detail` to `nlohmann` in version ???.???.???.
28 changes: 28 additions & 0 deletions docs/mkdocs/docs/api/position_t/lines_read.md
@@ -0,0 +1,28 @@
# <small>nlohmann::position_t::</small>lines_read

```cpp
std::size_t lines_read;
```

The number of lines read.

## Examples

??? example

The example below shows a SAX handler receiving the element bounds as `nlohmann::position_t` and
storing this location information in each json node using a [base class](../basic_json/json_base_class_t.md) for `nlohmann::json` as customization point.

```cpp
--8<-- "examples/sax_parse_with_src_location_in_json.cpp"
```

Output:

```json
--8<-- "examples/sax_parse_with_src_location_in_json.output"
```

## Version history

- Moved from namespace `nlohmann::detail` to `nlohmann` in version ???.???.???.
28 changes: 28 additions & 0 deletions docs/mkdocs/docs/api/position_t/operator_size_t.md
@@ -0,0 +1,28 @@
# <small>nlohmann::position_t::</small>operator size_t

```cpp
constexpr operator size_t() const;
```

Returns the value of [chars_read_total](chars_read_total.md).
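A short sketch of what the conversion allows: a position can be passed wherever a plain byte offset is expected. The struct below is a local stand-in modelled on the header shown in this commit, and `prefix` is a hypothetical helper, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for nlohmann::position_t with the conversion operator from this page.
struct position_t
{
    std::size_t chars_read_total;
    std::size_t chars_read_current_line;
    std::size_t lines_read;

    // Conversion to the total number of characters read.
    constexpr operator std::size_t() const
    {
        return chars_read_total;
    }
};

// Hypothetical helper that only knows about byte offsets.
std::string prefix(const std::string& text, std::size_t offset)
{
    assert(offset <= text.size());
    return text.substr(0, offset);
}
```

With `position_t p{4, 4, 0}`, the call `prefix("[1, 2]", p)` compiles through the implicit conversion and returns the first four characters, `[1, `. Code written against the `std::size_t` flavor of the callbacks can therefore consume a `position_t` without changes.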

## Examples

??? example

The example below shows a SAX handler receiving the element bounds as `nlohmann::position_t` and
storing this location information in each json node using a [base class](../basic_json/json_base_class_t.md) for `nlohmann::json` as customization point.

```cpp
--8<-- "examples/sax_parse_with_src_location_in_json.cpp"
```

Output:

```json
--8<-- "examples/sax_parse_with_src_location_in_json.output"
```

## Version history

- Moved from namespace `nlohmann::detail` to `nlohmann` in version ???.???.???.
24 changes: 24 additions & 0 deletions docs/mkdocs/docs/features/parsing/sax_interface.md
@@ -67,6 +67,30 @@ To implement your own SAX handler, proceed as follows:
Note the `sax_parse` function only returns a `#!cpp bool` indicating the result of the last executed SAX event. It does not return `json` value - it is up to you to decide what to do with the SAX events. Furthermore, no exceptions are thrown in case of a parse error - it is up to you what to do with the exception object passed to your `parse_error` implementation. Internally, the SAX interface is used for the DOM parser (class `json_sax_dom_parser`) as well as the acceptor (`json_sax_acceptor`), see file `json_sax.hpp`.
## Element position information
The position of a parsed element can be retrieved by implementing the optional methods [next_token_start](../../api/json_sax/next_token_start.md) and [next_token_end](../../api/json_sax/next_token_end.md).
These methods will be called with the parser position before any of the other methods are called and can be used to retrieve the half-open bounds (`[start, end)`) of a parsed element.
These methods come in two flavors:
1.
```cpp
void next_token_start(std::size_t pos);
void next_token_end(std::size_t pos);
```
This flavor is called with the byte positions of each element and is available for any `nlohmann::json::input_format_t` passed to `nlohmann::json::sax_parse`.

2.
```cpp
void next_token_start(const nlohmann::position_t& p);
void next_token_end(const nlohmann::position_t& p);
```
This flavor is called with the [detailed parser position information](../../api/position_t/index.md) of each element and is only available if `nlohmann::json::sax_parse` is called with `nlohmann::json::input_format_t::json`.
Furthermore, this flavor takes precedence over the first flavor.
Depending on the information required, a SAX handler may implement all four, only some, or none of these methods.
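As an illustration of the half-open bounds, the following sketch records `[start, end)` pairs through the first flavor and slices the source text with them. `bounds_recorder` and `slice` are hypothetical names, the callbacks are driven by hand rather than by `sax_parse`, and the byte positions used in the usage note are assumed for the sake of the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical handler fragment: records byte bounds via the std::size_t flavor.
struct bounds_recorder
{
    std::vector<std::pair<std::size_t, std::size_t>> bounds;
    std::size_t pending_start = 0;

    void next_token_start(std::size_t pos)
    {
        pending_start = pos;
    }

    void next_token_end(std::size_t pos)
    {
        // Half-open interval: pos is one past the element's last byte.
        assert(pending_start <= pos);
        bounds.emplace_back(pending_start, pos);
    }
};

// Extracts the text covered by a half-open [first, second) byte range.
std::string slice(const std::string& text,
                  const std::pair<std::size_t, std::size_t>& b)
{
    assert(b.first <= b.second && b.second <= text.size());
    return text.substr(b.first, b.second - b.first);
}
```

For the input `{"a": 12}`, a start of 6 and an end of 8 describe the number: `slice` returns `12`, and the length of the element is simply `end - start`. That subtraction without a `+ 1` is the practical benefit of reporting the end as one past the last byte.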
## See also
- [json_sax](../../api/json_sax/index.md) - documentation of the SAX interface
8 changes: 8 additions & 0 deletions docs/mkdocs/mkdocs.yml
@@ -248,13 +248,21 @@ nav:
- 'start_array': api/json_sax/start_array.md
- 'start_object': api/json_sax/start_object.md
- 'string': api/json_sax/string.md
- 'next_token_start' : api/json_sax/next_token_start.md
- 'next_token_end' : api/json_sax/next_token_end.md
- 'operator<<(basic_json)': api/operator_ltlt.md
- 'operator<<(json_pointer)': api/operator_ltlt.md
- 'operator>>(basic_json)': api/operator_gtgt.md
- 'operator""_json': api/operator_literal_json.md
- 'operator""_json_pointer': api/operator_literal_json_pointer.md
- 'ordered_json': api/ordered_json.md
- 'ordered_map': api/ordered_map.md
- position_t:
- 'Overview': api/position_t/index.md
- 'operator size_t': api/position_t/operator_size_t.md
- 'chars_read_total': api/position_t/chars_read_total.md
- 'lines_read': api/position_t/lines_read.md
- 'chars_read_current_line': api/position_t/chars_read_current_line.md
- macros:
- 'Overview': api/macros/index.md
- 'JSON_ASSERT': api/macros/json_assert.md
5 changes: 0 additions & 5 deletions include/nlohmann/detail/input/position_t.hpp
@@ -13,9 +13,6 @@
#include <nlohmann/detail/abi_macros.hpp>

NLOHMANN_JSON_NAMESPACE_BEGIN
namespace detail
{

/// struct to capture the start position of the current token
struct position_t
{
@@ -32,6 +29,4 @@ struct position_t
return chars_read_total;
}
};

} // namespace detail
NLOHMANN_JSON_NAMESPACE_END