Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use borsh to deserialize JIT Artifact #2119

Closed
wants to merge 10 commits into from
Closed

Conversation

ailisp
Copy link
Contributor

@ailisp ailisp commented Feb 12, 2021

Description

This has a little speed up on deserialization JIT Artifact, made a benchmark to store and load a 2MB lib/c-api/tests/assets/qjs.wasm from filesystem cache. To run the bench: cargo bench -p wasmer-cache --features singlepass. Before this PR it is:

store module in filesystem cache                                                                            
                        time:   [15.311 ms 15.356 ms 15.403 ms]
                        change: [-4.4021% -1.2939% +0.7777%] (p = 0.47 > 0.05)
                        No change in performance detected.
Found 2 outliers among 300 measurements (0.67%)
  2 (0.67%) high mild

load module in filesystem cache                                                                            
                        time:   [9.5429 ms 9.5611 ms 9.5803 ms]
                        change: [-1.2390% -0.9352% -0.6680%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 11 outliers among 300 measurements (3.67%)
  7 (2.33%) high mild
  4 (1.33%) high severe


After this PR:

store module in filesystem cache                                                                            
                        time:   [10.879 ms 10.938 ms 11.005 ms]
Found 25 outliers among 300 measurements (8.33%)
  12 (4.00%) high mild
  13 (4.33%) high severe

load module in filesystem cache                                                                            
                        time:   [6.0817 ms 6.0960 ms 6.1113 ms]
Found 6 outliers among 300 measurements (2.00%)
  3 (1.00%) high mild
  3 (1.00%) high severe

Review

  • Add a short description of the the change to the CHANGELOG.md file

@@ -37,9 +39,58 @@ pub struct SerializableCompilation {

/// Serializable struct that is able to serialize from and to
/// a `JITArtifactInfo`.
#[derive(Serialize, Deserialize)]
#[derive(Serialize, Deserialize, BorshSerialize, BorshDeserialize)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#[cfg_attr(feature = "enable-borsh", derive(BorshSerialize, BorshDeserialize))]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not under feature, because Serialize and Deserialize was always enabled, whereas crate like lib/compiler/Cargo.toml has optional serde, thus also optional borsh

lib/compiler/src/module.rs Outdated Show resolved Hide resolved
lib/engine-jit/src/artifact.rs Outdated Show resolved Hide resolved
lib/engine-jit/src/artifact.rs Outdated Show resolved Hide resolved
lib/engine/src/serialize.rs Outdated Show resolved Hide resolved
lib/vm/src/module.rs Outdated Show resolved Hide resolved
Comment on lines 53 to 54
// JumpTableOffsets is a SecondaryMap, non trivial to impl borsh
let v = bincode::serialize(&self.function_jt_offsets).map_err(|_| std::io::Error::new(std::io::ErrorKind::InvalidData, "invalid function_jt_offsets"))?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? A SecondaryMap is a PrimaryMap without the ability to mutate the keys, which in turn is just a typesafe Vec where the index has a wrapper type instead to keep them distinct instead of plain integers. Is the problem not serialization but deserialization?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PrimaryMap is internally just a vec, but SecondaryMap is more complex to serialize:
https://github.com/bytecodealliance/wasmtime/blob/285a7feb19c1ce768d68fd421eddd1ff6b093d34/cranelift/entity/src/map.rs#L200

More proper way is implement a borshserialize/deserialize similarly to its impl of serde serialize/deserialize

Copy link
Contributor

@MarkMcCaskey MarkMcCaskey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm generally in favor of conditional support for other types of serialization 👍

@@ -20,17 +20,19 @@ hashbrown = { version = "0.9", optional = true }
serde = { version = "1.0", features = ["derive"], optional = true }
thiserror = "1.0"
serde_bytes = { version = "0.11", optional = true }
smallvec = "1.6"
smallvec = "1.6"
borsh = { git = "https://github.com/near/borsh-rs", rev = "c62cdfbd10d4a17fc877809eba4ccb65e866d5f8", optional = true }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't merge this while there's a git dependency as it will block us from doing releases on crates.io

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit haven't published, will keep this updated as we publish a new borsh

@ailisp
Copy link
Contributor Author

ailisp commented Feb 24, 2021

@Hywan @MarkMcCaskey just added an benchmark, it's a 30% increase in a larger contract, might be interested to consider borsh. But deserialization is still not fast enough for our use case, so I'm going to try some mmap based serialization approach.

@Hywan
Copy link
Contributor

Hywan commented Feb 25, 2021

it's a 30% increase in a larger contract

Do you mean that serialization is 30% faster?

@ailisp
Copy link
Contributor Author

ailisp commented Feb 25, 2021 via email

@ailisp
Copy link
Contributor Author

ailisp commented Mar 17, 2021

close because #2190 is a faster approach

@ailisp ailisp closed this Mar 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants