Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Check if we broke performance by switching from ujson to simplejson #3009

Closed
ara4n opened this issue Mar 16, 2018 · 3 comments
Closed

Check if we broke performance by switching from ujson to simplejson #3009

ara4n opened this issue Mar 16, 2018 · 3 comments
Assignees

Comments

@ara4n
Copy link
Member

ara4n commented Mar 16, 2018

And whet we'd be better off pinning against a non-broken version of ujson (as apparently the upstream git is fixed).

See #3008

@richvdh
Copy link
Member

richvdh commented Apr 3, 2018

Empirically, no we haven't broken performance, since matrix.org is now using simplejson without a catastrophic decrease in performance.

I have spent some time profiling ujson, simplejson and json; the results look like this:

image

(In the above, simplejson uses simplejson.dumps; simplejson_static creates a static JSONEncoder and JSONDecoder and uses encode and decode).

Conclusions are:

  • on the load (deserialisation) path, simplejson outperforms ujson for large objects and is comparable for small objects. Stock json is much slower than either.
  • on the dump (serialisation) path, the situation isn't as rosy - simplejson is approximately twice as slow as ujson, and 1.5x as slow even as stock json.

Despite that, I think we want to use simplejson anyway. The reasons for this are:

  • I think it's important to use a single encoder/decoder consistently, to avoid disasters like the one we had with Replace ujson with simplejson #3008 (where we were able to deserialise an event, but not serialise it again later).
  • ujson does not appear to be well-maintained. The bug we found was indeed fixed over a year ago, but that has yet to be released (and I am really not keen to depend on a git version of it, since we'd have to ship our own binaries for debian etc)
  • The benchmarks here don't show it, but for some configurations (such as sort_keys=True, as used in canonicaljson), json is way slower than simplejson.
  • For me, simplejson's superior load performance wins over json's superior dump performance. We may also be able to push a few optimisations upstream (see Defer is_raw_json test simplejson/simplejson#212, for example).

As part of this work, I've also discovered that canonicaljson is slow (4x slower than simplejson for canonicaljson 1.0.0, though this is reduced to 1.5x in canonicaljson 1.1.0).

@richvdh
Copy link
Member

richvdh commented Apr 3, 2018

Keeping this open for now, until we merge canonicaljson 1.1.0

@richvdh
Copy link
Member

richvdh commented Apr 12, 2018

Canonicaljson 1.1.2 is now released, and matrix.org is using it.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants