Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor/no virtual sax #1153

Merged
merged 5 commits into from
Aug 16, 2018
Merged

Conversation

theodelrieu
Copy link
Contributor

@theodelrieu theodelrieu commented Jul 3, 2018

This PR is a WIP, to show a different API for SAX-related functions.

I'd like to have other people running benchmarks, my results are not that significant, even after removing every optimization flag. There is a 40ms improvement on jeopardy.json from time to time though.

Few minor improvements that could be done:

  • Passing by reference instead of pointers?.
  • I removed the no_limit value, I used std::size_t(-1) instead.

As you can see, the public interface stays the same, there is no static_assert on the SAX interface for now. If we decide to go this route, I'll add detailed and useful ones (unlike those that exist today).


Pull request checklist

Read the Contribution Guidelines for detailed information.

  • Changes are described in the pull request, or an existing issue is referenced.
  • The test suite compiles and runs without error.
  • Code coverage is 100%. Test cases can be added by editing the test suite.
  • The source code is amalgamated; that is, after making changes to the sources in the include/nlohmann directory, run make amalgamate to create the single-header file single_include/nlohmann/json.hpp. The whole process is described here.

@coveralls
Copy link

coveralls commented Jul 3, 2018

Coverage Status

Coverage remained the same at 100.0% when pulling 0cc3db4 on theodelrieu:refactor/no_virtual_sax into 3ac2d81 on nlohmann:develop.

@maddanio
Copy link

maddanio commented Jul 4, 2018

ah, funny, i was just wondering about this yesterday, why it has to be virtual :).

@nlohmann nlohmann mentioned this pull request Jul 4, 2018
@nlohmann
Copy link
Owner

nlohmann commented Jul 4, 2018

@theodelrieu Thanks for the PR. I'll run the benchmarks hopefully by the end of the week. I hope there is a larger performance difference - without such, I would still rather go with the base class approach, because I still think it is is much easier for a developer to implement the interface rather than relying on templates here. Let's see what performance has to say about this ;)

@theodelrieu
Copy link
Contributor Author

Well, as you can see in the tests/benchmarks, users don't have to write the word template a single time. The difference might be with the error messages (though we can get away with good static_asserts).

Anyway, going the virtual way can still be changed afterwards if needed. It might be better to have people use SAX for a while and hear their feedbacks.

@nlohmann
Copy link
Owner

nlohmann commented Jul 4, 2018

I really need to look at the code...

@theodelrieu
Copy link
Contributor Author

I would advise adding the flag -fno-devirtualize when building benchmarks, otherwise virtual calls will most likely be inlined, which might hide performance impact.

@theodelrieu
Copy link
Contributor Author

I set my cpu frequence to 100% to avoid latency/noise instead of adding fno-devirtualize

@maddanio
Copy link

maddanio commented Jul 4, 2018

But won’t inlining then still occur?

@theodelrieu
Copy link
Contributor Author

I did split the source files so that the compiler dos not see the derived class definition in every translation unit, it should do the trick (?)

@nlohmann
Copy link
Owner

nlohmann commented Jul 5, 2018

Here are my results:

image

$ ./json_benchmarks --benchmark_filter="ParseFileSAX.*" --benchmark_min_time=30
2018-07-05 22:10:06
Run on (8 X 2900 MHz CPU s)
CPU Caches:
  L1 Data 32K (x4)
  L1 Instruction 32K (x4)
  L2 Unified 262K (x4)
  L3 Unified 8388K (x1)
***WARNING*** Library was built as DEBUG. Timings may be affected.
--------------------------------------------------------------------------
Benchmark                                   Time           CPU Iterations
--------------------------------------------------------------------------
ParseFileSAXVirtual/jeopardy             2491 ms       2487 ms         17   21.3033MB/s
ParseFileSAXVirtual/canada                156 ms        155 ms        271   13.8122MB/s
ParseFileSAXVirtual/citm_catalog           69 ms         69 ms        605   23.9219MB/s
ParseFileSAXVirtual/twitter                27 ms         27 ms       1540   21.9946MB/s
ParseFileSAXVirtual/floats               1397 ms       1395 ms         30   15.4981MB/s
ParseFileSAXVirtual/signed_ints          1228 ms       1227 ms         33   18.9534MB/s
ParseFileSAXVirtual/unsigned_ints        1208 ms       1206 ms         35   19.2918MB/s
ParseFileSAXTemplate/jeopardy            2467 ms       2463 ms         17    21.513MB/s
ParseFileSAXTemplate/canada               157 ms        157 ms        265   13.7093MB/s
ParseFileSAXTemplate/citm_catalog          70 ms         69 ms        601   23.7028MB/s
ParseFileSAXTemplate/twitter               27 ms         27 ms       1522   22.0489MB/s
ParseFileSAXTemplate/floats              1431 ms       1429 ms         29    15.129MB/s
ParseFileSAXTemplate/signed_ints         1301 ms       1299 ms         32   17.9037MB/s
ParseFileSAXTemplate/unsigned_ints       1282 ms       1280 ms         34   18.1799MB/s

@theodelrieu
Copy link
Contributor Author

Sorry I did tweak the CMakeLists.txt a bit, could you run it in release mode? With my changes it should be:

cmake -G Ninja .. -DCMAKE_BUILD_TYPE=Release

@nlohmann
Copy link
Owner

nlohmann commented Jul 5, 2018

First off, thanks a lot @theodelrieu for pushing this topic! It makes a large difference between just uttering possible interface changes and actually proposing a runnable realization thereof. I really appreciate this and it is always a pleasure working with you!

My 2 cents:

  • I have yet to understand the performance differences. The differences of up to 5 percent for integers are hard to explain. I know that benchmarking is hard, but this feels strange to me. Especially since the difference is rather small (or even negative) for the other files.
  • I may want to try again without forcing the compiler not to inline, etc. The default case (accept or DOM) is provided by the library and will be inlined. The current settings seem overly pessimistic to me - especially with -flto -DNDEBUG -O3 being switched off. I know there may be a negative effect for actual user-provided SAX implementations, but then again - nobody is forced to use these compiler flags. (But I am not an expert in this field).
  • I like the fact that the interface of the functions remained the same and that the user just defines a struct and is done.
  • However, I do like the version with the base class better. It allows for definitions like limit and integration is simple as your IDE guides you in which functions to implement. I doubt that we ever have such a nice support with static asserts or SFINAE. It's hard to explain, but a base class feels more natural to me compared to a concept.

@nlohmann
Copy link
Owner

nlohmann commented Jul 5, 2018

Now the results are more consistent:

image

./json_benchmarks --benchmark_filter="ParseFileSAX.*" --benchmark_min_time=30
2018-07-05 23:00:04
Run on (8 X 2900 MHz CPU s)
CPU Caches:
  L1 Data 32K (x4)
  L1 Instruction 32K (x4)
  L2 Unified 262K (x4)
  L3 Unified 8388K (x1)
--------------------------------------------------------------------------
Benchmark                                   Time           CPU Iterations
--------------------------------------------------------------------------
ParseFileSAXVirtual/jeopardy              498 ms        497 ms         83   106.574MB/s
ParseFileSAXVirtual/canada                 58 ms         58 ms        726    37.247MB/s
ParseFileSAXVirtual/citm_catalog           13 ms         13 ms       3213   125.259MB/s
ParseFileSAXVirtual/twitter                 5 ms          5 ms       7711   110.439MB/s
ParseFileSAXVirtual/floats                485 ms        484 ms         88   44.6818MB/s
ParseFileSAXVirtual/signed_ints           264 ms        264 ms        158   88.1108MB/s
ParseFileSAXVirtual/unsigned_ints         259 ms        259 ms        161   89.8282MB/s
ParseFileSAXTemplate/jeopardy             490 ms        490 ms         85   108.163MB/s
ParseFileSAXTemplate/canada                57 ms         57 ms        730   37.5386MB/s
ParseFileSAXTemplate/citm_catalog          13 ms         13 ms       3255    125.02MB/s
ParseFileSAXTemplate/twitter                5 ms          5 ms       7818   111.926MB/s
ParseFileSAXTemplate/floats               474 ms        474 ms         88   45.6156MB/s
ParseFileSAXTemplate/signed_ints          262 ms        261 ms        161   88.9603MB/s
ParseFileSAXTemplate/unsigned_ints        257 ms        257 ms        164   90.4741MB/s

@theodelrieu
Copy link
Contributor Author

Thanks a lot :)

Performance is one aspect of why I like this version better, but might not be the most important.

I agree there are some benefits to have a virtual base class, but it's a problem if the library forces users to use this mechanism. However, we can have the best of both worlds.

With little changes to this PR, the library can still provide the base class to help users implement a SAX interface, thus supporting both approaches.

About the no_limit, I am not a fan of special values (-1 here), we could have a poorman's optional instead:

// elements is either nullptr, or a value
bool start_object(const std::size_t* elements) = 0;

About the "natural feeling" of virtual classes, I can understand it for the SAX interface, because their intrusiveness is not that big of a deal on such a specific structure. But had we gone with this approach for user-defined types, I believe we would have received lots of complaints about it (way more than we do right now).

Finally, if we decide to add functionalities to the SAX interface later on, using a Concept is way more flexible:

// json v3.5.1 (example)

bool start_object_better_prototype();

// With SFINAE, we can detect if the new prototype is implemented
// or call the old one if that's not the case. (a bit similar to UDT conversion workflow)
// This is not possible with virtual classes
// (at least without breaking code or adding a new base class)

So even with the proposed changes, if we provide a base class future API extensions will be compromised.

@maddanio
Copy link

maddanio commented Jul 6, 2018

-1 on the poor mans optional: it cannot be called with a literal. Otherwise my 5cts also: do both. Allow static handlers and provide a virtual base also

@theodelrieu
Copy link
Contributor Author

I would agree if this API was designed to be called by users.

However, only the library will call it, so creating a local variable and passing its address is not problematic.

There might be better way to implement this no_limit stuff though.

@nlohmann
Copy link
Owner

I finally had some more time looking at this PR.

  • I now agree that using a SAX API without virtual functions has some benefits. Thanks for your patience along the way 😄... As far as I see, we could still provide a virtual base class that could be used to implement a SAX handler, but which is not necessary. Such a base class would facilitate documenting the whole process. I hope this is what you meant with having the best of both worlds.
  • I don't care too much about the no_limit variable. Proper documentation needs to make clear that -1 means "no limit" and we're fine. I think an optional would mean even more documentation.
  • Do you think it would be possible and useful to have a SFINAE check to reject implementations that do not satisfy the API?
  • I don't like all the adjustment to the benchmarks. I understand they are required for the comparison, but in the end, we should just benchmark the calls to parse or accept.

I think the SAX API should be the "killer feature" for the next release. Once we have this PR out of the way, I think only proper documentation is required.

What do you think?

@theodelrieu
Copy link
Contributor Author

Thanks for your patience along the way 😄

No problem :)

I hope this is what you meant with having the best of both worlds.

Yep, that's the idea.

Do you think it would be possible and useful to have a SFINAE check to reject implementations that do not satisfy the API?

I believe SFINAE is not needed here (unlike constructors), because I fail to see a use-case of "Does this call of parse_sax compile?".

We can go with detailed static_asserts here. I will add a is_sax_interface concept though, as it can be useful for testing/internal uses.

I understand they are required for the comparison, but in the end, we should just benchmark the calls to parse or accept.

I monkey-patched the benchmarks' code for testing purposes, I will remove it.

What do you think?

Looks good, I will try to find time this week to finish the PR.

@theodelrieu
Copy link
Contributor Author

I rebased the PR, and added static_asserts. To see them in action, you can try to remove/modify some of the SAX interfaces in tests (e.g. unit-ubjson.cpp).

Copy link
Owner

@nlohmann nlohmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only have some minor issues/questions.

@@ -0,0 +1,25 @@
2018-06-29 18:02:43
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file should not be part of the PR. Please remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops

@@ -0,0 +1,25 @@
2018-06-29 18:05:46
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file should not be part of the PR. Please remove it.

{
// (void)detail::is_sax_static_asserts<SAX, BasicJsonType>{};
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either the line should be removed or it should be un-commented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, this should be un-commented

@@ -34,68 +34,68 @@ using nlohmann::json;

#include <fstream>

class SaxCountdown : public nlohmann::json::json_sax_t
class SaxCountdown
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why aren't we inheriting here any more?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did remove inheritance from tests, but since we decided to expose a base class, I shall add a test using this base class.

I will be able to work on the PR on Thursday

@theodelrieu
Copy link
Contributor Author

Should be good now.

Copy link
Owner

@nlohmann nlohmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@nlohmann nlohmann self-assigned this Aug 16, 2018
@nlohmann nlohmann added this to the Release 3.1.3 milestone Aug 16, 2018
@nlohmann nlohmann merged commit d5b21b0 into nlohmann:develop Aug 16, 2018
@theodelrieu theodelrieu deleted the refactor/no_virtual_sax branch August 16, 2018 15:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants