Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove JSON string escaping in diff messages #598

Merged
merged 5 commits into from
Jan 18, 2022
Merged

Conversation

cycraig
Copy link
Contributor

@cycraig cycraig commented Jan 17, 2022

Description of change

Removes JSON string escaping in DiffMessage::diff and ServiceEndpoint diffs. This reduces the size of DiffMessages on the Tangle by ~6%. This reduction has diminishing returns for larger diff messages, going to ~1% for 2KB and <1% for 3KB+, but considering we expect most diffs to be <1KB this would still have an impact.

WARNING: this is a breaking change and will invalidate all prior diff chain updates. Advise developers using 0.5.0-dev.1/0.5.0-dev.2 to publish an integration update with all diffs applied to prevent unexpected changes.

Advantages

  • Reduces the size of Tangle messages slightly, reducing data transmission and long-term storage costs.
  • No unescaping pre-processing required in the identity-resolver on the explorer website.

Disadvantages

  • Requires deserializing the IotaDiffDocument up-front for every DiffMessage processed when resolving a DID (rather than only after the signature is validated). This could slow down resolution when there are many conflicting or spam messages on the diff chain but the effect is capped since Tangle message sizes have a reasonable upper-limit. Benchmarking (see section below after the example) shows this effect is significant but only in malicious cases since deserialization is always performed anyway for valid DiffMessages. Would not be an issue with planned spam solutions.

Example:

A medium-sized DiffMessage which adds a service and a verification method.

# Bytes JSON Brotli+JSON
Escaped (current) 967 562
Unescaped (this PR) 898 526
Change -69 -36
Percent Change 7.1% 6.4%
  1. Current: stores an escaped JSON string as the diff (formatted for display).
{
  "id": "did:iota:3ZW4YC6jhPBzBE5devGRnPbrhkJJWyxT1mXRRp11hLVX",
  "diff": "{\"doc\":{\"assertion_method\":[{\"id\":\"did:iota:3ZW4YC6jhPBzBE5devGRnPbrhkJJWyxT1mXRRp11hLVX#newKey\",\"controller\":\"did:iota:3ZW4YC6jhPBzBE5devGRnPbrhkJJWyxT1mXRRp11hLVX\",\"key_type\":\"Ed25519VerificationKey2018\",\"key_data\":{\"PublicKeyMultibase\":\"z2sASuHXvGRLRwoVbnK5aAJGeC3ZQdTt8x6q42ZJtUGFr\"},\"properties\":null}],\"service\":[{\"id\":\"did:iota:3ZW4YC6jhPBzBE5devGRnPbrhkJJWyxT1mXRRp11hLVX#linked-domain-1\",\"type_\":\"LinkedDomains\",\"service_endpoint\":\"{\\\"origins\\\":[\\\"https://iota.org/\\\",\\\"https://example.com/\\\"]}\",\"properties\":null}]},\"meta\":{\"updated\":\"2022-01-17T16:18:04Z\"}}",
  "previousMessageId": "3f76268decb4556dcb99558cab00c17582916a1efcc709135d2cf8654e486338",
  "proof": {
    "type": "JcsEd25519Signature2020",
    "verificationMethod": "#sign-0",
    "signatureValue": "5eb1HDuaw88EFVa2JgZnuxd1j8nSDAq9Ck7YMLohNatrQ7N3xxmYfHw1Fevd5r2fNrrLzyCDAe9YVdWtBH4a85ic"
  }
}
  1. This PR: stores an IotaDiffDocument for the diff (formatted for display).
{
  "id": "did:iota:7fPaHLV69QFQG7GxU9RdygsF2v2PhnbFGndirc67Gd5a",
  "diff": {
    "doc": {
      "assertion_method": [
        {
          "id": "did:iota:7fPaHLV69QFQG7GxU9RdygsF2v2PhnbFGndirc67Gd5a#newKey",
          "controller": "did:iota:7fPaHLV69QFQG7GxU9RdygsF2v2PhnbFGndirc67Gd5a",
          "key_type": "Ed25519VerificationKey2018",
          "key_data": {
            "PublicKeyMultibase": "zHG4szALvApBm1TRSjQsqySLmZ7LWUqk1PpxesLKxn6e2"
          },
          "properties": null
        }
      ],
      "service": [
        {
          "id": "did:iota:7fPaHLV69QFQG7GxU9RdygsF2v2PhnbFGndirc67Gd5a#linked-domain-1",
          "type_": "LinkedDomains",
          "service_endpoint": {
            "origins": [
              "https://iota.org/",
              "https://example.com/"
            ]
          },
          "properties": null
        }
      ]
    },
    "meta": {
      "updated": "2022-01-17T16:21:44Z"
    }
  },
  "previousMessageId": "cecb1b3ccd662372ed861092ca0b382365e0fae33f7e20b75c4040b0a36c1711",
  "proof": {
    "type": "JcsEd25519Signature2020",
    "verificationMethod": "#sign-0",
    "signatureValue": "isnC2ghEjjq5hJADwQumu6R4zYyFVDhv9z375zgYhbPZz5CrdXYQiYLmaE3SefVaJy2p2huXu5zW8ctJNFZ5br5"
  }
}

Benchmarking Decompression+Deserialization

image

Benchmarking decompressing then deserializing a DiffMessage encoded as JSON with Brotli compression with criterion demonstrates the significant overhead of JSON deserialization (unescaped - this PR) compared to deserialising a string field (escaped - current). For context, the data was generated by adding [1, 10, 100, 1000, 10_000, 50_000] services to a DID Document.

Note that this is not a performance regression since the difference demonstrated is simply the overhead of immediate JSON deserialisation. The orange line (escaped - current) simply defers the JSON deserialization of the diff field to the merge step.

This is only a problem if we have to deserialize many large diff messages on an index (from malicious actors for example), which should hopefully be solved by other spam prevention changes. Spamming large integration chain messages would have the same effect currently.

Type of change

  • Bug fix (a non-breaking change which fixes an issue)
  • Enhancement (a non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Fix

How the change has been tested

Local tests and examples pass in both Rust and Wasm.

Change checklist

  • I have followed the contribution guidelines for this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@cycraig cycraig added Wasm Related to Wasm bindings. Becomes part of the Wasm changelog Breaking change A change to the API that requires a major release. Part of "Changed" section in changelog Rust Related to the core Rust code. Becomes part of the Rust changelog. labels Jan 17, 2022
@cycraig cycraig marked this pull request as ready for review January 17, 2022 20:44
Copy link
Contributor

@PhilippGackstatter PhilippGackstatter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

Copy link
Collaborator

@JelleMillenaar JelleMillenaar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I love how you went well over everything for this PR. Updated API reference, updated method and even added a release description.

Copy link
Collaborator

@eike-hass eike-hass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great stuff, real improvement I think!

@cycraig cycraig merged commit 8b34190 into dev Jan 18, 2022
@cycraig cycraig deleted the fix/diff-escaping branch January 18, 2022 11:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Breaking change A change to the API that requires a major release. Part of "Changed" section in changelog Rust Related to the core Rust code. Becomes part of the Rust changelog. Wasm Related to Wasm bindings. Becomes part of the Wasm changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants