Roundtrip tests are too strict #33

nlohmann · 2016-01-30T16:41:04Z

There is an issue in the roundtrip tests: The current tests check that the output of a library exactly match the input. However, the JSON specification has several ways to specify a number. For instance, number 1e3 could also be written as 1E3, 1e+3, 1E+3, or 1000. Therefore, any library which, reading 1e3 would perform any of these outputs should pass the roundtrip test. Requiring the exact way the number is returned is arbitrary and would be similar to requiring that the order of object keys should remain the same or even requiring whitespace to be preserved.

The current tests are in favor of those libraries which (by chance) decide to serialize numbers with letter "e" rather than with letter "E". This affects files roundtrip24.json--roundtrip27.json. When the exponent is changed (e to E) or the optional + is added, the test results differ.

More details can be found in the original discussion at nlohmann/json#187.

The text was updated successfully, but these errors were encountered:

miloyip · 2016-01-30T17:27:40Z

There are some discussion on #22 as well.

The current rountrip test is:

d = Parse(json)
json2 = Serialize(d)
Test(json == json2)

To make it better, what if we relax it in this way:

d = Parse(json);
json2 = Serialize(R)
d2 = Parse(json2);
json3 = Serialize(d2)
Test(json2 == json3)

So that it just make sure that a library will maintain its own format. This is actually need in some situations.

gregmarr · 2016-01-30T19:48:08Z

I would say that the equality of the numerical form is more important than equality of the string form for floathing point values.

miloyip · 2016-01-31T03:13:41Z

@gregmarr A problem is that, when the parser firstly parses the source JSON, the parsed number may already be incorrect. My proposal above will also have this problem.

miloyip · 2016-01-31T03:39:01Z

@nlohmann Proposal 2: explicitly check the output to handle e/E, and oprtional + in exponent.

miloyip · 2016-01-31T04:45:01Z

I just checked the result of @nlohmann 's library:

roundtrip24: false
Expect: [5e-324]
Actual: [4.94065645841247e-324]

roundtrip25: false
Expect: [2.225073858507201e-308]
Actual: [2.2250738585072e-308]

roundtrip26: false
Expect: [2.2250738585072014e-308]
Actual: [2.2250738585072e-308]

roundtrip27: false.
Expect: [1.7976931348623157e308]
Actual: [179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.0]

The problem is not related to e/E and optional + in exponent.

These strings are correctly stringified doubles. By "correctness" it means they are convertible back to the exact value of IEEE-754 double, without ambiguity. It is also to be the shortest textual representation.

For example, 2.225073858507201e-308 and 2.2250738585072014e-308 represents two distinct double values 0x000FFFFFFFFFFFFF and 0x0010000000000000. But the library stringify them to a single textual representation 2.2250738585072e-308. Can check these converisons on http://babbage.cs.qc.edu/courses/cs341/IEEE-754.html.

In https://github.com/miloyip/dtoa-benchmark, different conversions routine has been tested.

void dtoa_ostringstream(double value, char* buffer) {
    std::ostringstream oss;
    oss << std::setprecision(17) << value;
    strcpy(buffer, oss.str().c_str());
}

and

void dtoa_sprintf(double value, char* buffer) {
    sprintf(buffer, "%.17g", value);
}

should give "correct" result.

It may be argued that, the JSON spec did not say about subnomal double so it should not be checked as "conformance".

ColinH · 2016-02-13T13:08:50Z

Regarding the setup of the roundtrip tests, as Milo wrote it's currently like this:

json_object = parse( original_json )
generated_json = serialize( json_object )
assert( original_json == generated_json)

Or, using roundtrip( x ) as shortcut for serialise( parse( x ) ), we can write this as:

assert( json == roundtrip( json ) )

Which works pretty well, but as noted, this could fail if some library encoded e.g. floating point exponents differently, i.e. in an allowed way that doesn't change the meaning. Milo also suggested the following solution:

assert( roundtrip( json ) == roundtrip( roundtrip( json ) ) )

This fixes the issue, but let's take a step back. A roundtrip test should check that, when some json input is "round tripped" through a library, the resulting json output is in the same equivalence class as the input, assuming we define two json representations as equivalent if they "encode the same data" (without diving into a formal definition...)

In order to check whether two inputs are equivalent, we can use a roundtrip through a reference implementation as a kind of normal-form generator that always yields the same output representation for each equivalence class of json inputs. In other words, we could use the following approach for the roundtrip conformance test:

assert( reference_roundtrip( json ) == reference_roundtrip( testee_roundtrip( json ) ) )

Unlike the other suggestion, this can't be fooled by a "stupid" or "malicious" library that always serialises to null. However it does not test whether a library keeps to its own format, so perhaps both should be implemented.

nlohmann mentioned this issue Jan 30, 2016

Bugs in miloyip/nativejson-benchmark: roundtrips nlohmann/json#187

Closed

nlohmann mentioned this issue Sep 12, 2016

Conformance report for reference nlohmann/json#307

Closed

miloyip self-assigned this Sep 13, 2016

ColinH mentioned this issue Nov 7, 2017

Add VelocyPack to the native JSON benchmark arangodb/velocypack#1

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Roundtrip tests are too strict #33

Roundtrip tests are too strict #33

nlohmann commented Jan 30, 2016

miloyip commented Jan 30, 2016

gregmarr commented Jan 30, 2016

miloyip commented Jan 31, 2016

miloyip commented Jan 31, 2016

miloyip commented Jan 31, 2016

ColinH commented Feb 13, 2016

Roundtrip tests are too strict #33

Roundtrip tests are too strict #33

Comments

nlohmann commented Jan 30, 2016

miloyip commented Jan 30, 2016

gregmarr commented Jan 30, 2016

miloyip commented Jan 31, 2016

miloyip commented Jan 31, 2016

miloyip commented Jan 31, 2016

ColinH commented Feb 13, 2016