fix: eliminate bn.js and replace public key storage with JavaScript bigint #27532

Closed
wants to merge 3 commits into from

steveluscher
Contributor

@steveluscher steveluscher commented Sep 1, 2022

Problem

JavaScript bigint has been available in browsers since September 2020. Despite that, we've implemented large value storage in PublicKey using the bn.js library. This library presently makes up ~13% of @solana/web3.js, or around 35 KB uncompressed.

Replacing this system with pure bigint storage would cut the size of this library down considerably.

Summary of Changes

  • Patch borsh to serialize to and from bigint rather than BN instances. A pull request has been sent to the upstream project.
  • Replace BN storage in PublicKey with bigint

Bundle size

Using package-build-stats:

const p = require('package-build-stats');
p.getPackageStats('~/solana/web3.js/').then(r => console.log(r))

Size before: 75299 gzipped bytes
Size after: 64191 gzipped bytes

Reduction of ~15% (11,108 gzipped bytes).

Notes

  • Anyone who used to rely on accessing the private variable PublicKey::_bn will find that its type has changed from BN to bigint. Such is the risk when you reach into private variables.
  • This change relies on patching borsh-js in a way that represents a breaking change. For this reason we modify the bundle script to bundle the modified version into the web3.js bundle. This has two implications:
    • An app that uses borsh outside of web3.js will end up bundling a second copy of it (presumably one that includes 'bn.js', which negates the benefits of this change).
    • Our patched version is accessible only within the confines of web3.js and will not cause a breaking change to the rest of the app.

Addresses solana-labs/solana-web3.js#1103.

codecov bot commented Sep 1, 2022

Codecov Report

Merging #27532 (1489c4b) into master (4267a15) will decrease coverage by 0.8%.
The diff coverage is n/a.

@@            Coverage Diff            @@
##           master   #27532     +/-   ##
=========================================
- Coverage    77.1%    76.2%   -0.9%     
=========================================
  Files          55       54      -1     
  Lines        2934     3129    +195     
  Branches      408      474     +66     
=========================================
+ Hits         2264     2387    +123     
- Misses        529      574     +45     
- Partials      141      168     +27     

Contributor Author

steveluscher commented Sep 1, 2022

I'm not 100% sure this is a slam dunk. I'm looking for feedback.

In particular, I think this change will break other libraries that pile onto SOLANA_SCHEMA because of their expectation that data will _de_serialize to BN instances.

Example:

https://github.com/jackcmay/solana-program-library/blob/5aa3b98f2c4d6d84ac5b56f603b891b3b979af62/stake-pool/js/src/schema.ts#L30-L32

Although I guess what I said about our patch being isolated from the rest of the app implies that they would keep working, since they use their own mainline version of borsh.

@steveluscher steveluscher added the javascript Pull requests that update Javascript code label Sep 1, 2022
Contributor

@joncinque joncinque left a comment

Looks great for the most part, mainly a question on the internal representation

web3.js/patches/borsh+0.7.0.patch (outdated)
Comment on lines -104 to -107
'bigint-buffer',
'bn.js',
'borsh',
Contributor

Can these be added back once your change lands in borsh? If so, do you mind adding a GitHub issue or comment to that effect?

Contributor Author

That's right! I will.

this._bn = value._bn;
if (typeof value._bn === 'object') {
// Legacy implementation of public key storage as a `BN` instance.
this._bn = toBigIntLE(value._bn.toBuffer('le', 32));
Contributor

This is a breaking change on the internal implementation, because before bn would default to big-endian: https://github.com/indutny/bn.js/blob/5df40f81ea8afb835b909bb7c21e0833cdeb6a30/lib/bn.js#L39, which caused me a lot of annoyance.

This might be OK since people shouldn't depend on the internal representation, but for this case, if they do provide a legacy bn, we might want to treat the input as big-endian and then convert it to little-endian here. What do you think?

Contributor Author

I think this is OK, actually.

  • Someone supplies a BN instance to the constructor (endianness is irrelevant at that point)
  • We convert it to an LE buffer
  • We convert that LE buffer to a bigint using toBigIntLE (which implements ‘convert this LE buffer to a bigint’)

Now, the internal representation is a bigint, which doesn't have an endianness – it just exists.


The way in which this is a breaking change is for anyone who ever reaches into PublicKey::_bn. We never marked that TypeScript private so there has never been and still is no compile time protection against someone doing so. If we ship this, they will be surprised to find that _bn is no longer a BN but a bigint, and their code will break.
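
The three conversion steps above can be sketched without bn.js or bigint-buffer. This is a minimal illustration, not the PR's code: `bytesToBigIntLE` and `bytesToBigIntBE` are hypothetical helpers standing in for `toBigIntLE` and its big-endian counterpart. It shows that the resulting bigint is the same number regardless of which byte order was used as the intermediate form, as long as the decoder matches the encoder.

```typescript
// Hypothetical helpers for illustration only (not from bigint-buffer or the PR).

// Decode bytes stored least-significant-byte first.
function bytesToBigIntLE(bytes: Uint8Array): bigint {
  let result = BigInt(0);
  // The last byte is the most significant one, so start from the end.
  for (let i = bytes.length - 1; i >= 0; i--) {
    result = (result << BigInt(8)) | BigInt(bytes[i]);
  }
  return result;
}

// Decode bytes stored most-significant-byte first.
function bytesToBigIntBE(bytes: Uint8Array): bigint {
  let result = BigInt(0);
  for (let i = 0; i < bytes.length; i++) {
    result = (result << BigInt(8)) | BigInt(bytes[i]);
  }
  return result;
}

// A 32-byte value equal to 0x0201, written in little-endian order.
const le = new Uint8Array(32);
le[0] = 0x01;
le[1] = 0x02;

// The same number in big-endian order (a reversed copy of the buffer).
const be = Uint8Array.from(le).reverse();

const fromLE = bytesToBigIntLE(le);
const fromBE = bytesToBigIntBE(be);

// Both decoders recover the same bigint; the bigint itself has no byte order.
console.log(fromLE === fromBE);
```

Byte order only matters at the boundary where a number becomes bytes or bytes become a number; pairing an LE encoder with an LE decoder is what keeps the conversion lossless.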

Contributor

Oh duh, of course, you're right, sorry 🤦 I got confused with using borsh encoding that just treated the underlying as little-endian always, which isn't an issue here because you're doing toBuffer('le', 32).

Contributor

> If we ship this, they will be surprised to find that _bn is no longer a BN but a bigint

What if we change the name and get rid of the internal _bn property, or throw when it's accessed with a getter? This will cause access of internals to fail faster, rather than experience a larger range of errors where they get the value and pass it to something else or try to call a method on it that doesn't exist.
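
A rough sketch of the "throw from a getter" idea, assuming the storage is renamed. `KeySketch`, `_key`, and `toBigInt` are hypothetical names for illustration; they are not part of web3.js.

```typescript
// Hypothetical class for illustration; not the real PublicKey.
class KeySketch {
  // Renamed internal storage, marked private so TypeScript rejects outside access.
  private readonly _key: bigint;

  constructor(value: bigint) {
    this._key = value;
  }

  // Legacy property name kept only to fail loudly at the point of access.
  get _bn(): never {
    throw new Error(
      '`_bn` is an internal detail and is no longer a BN instance; use a public method instead.',
    );
  }

  toBigInt(): bigint {
    return this._key;
  }
}

const key = new KeySketch(BigInt(42));

let message = '';
try {
  // Simulates old code reaching into the internals.
  void key._bn;
} catch (e) {
  message = e instanceof Error ? e.message : String(e);
}
console.log(message);
```

The error surfaces where the internal is read, rather than later when a bigint is passed to code that expects BN methods.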

Contributor Author

Yeah, we could do that. We'll have to add something to the constructor which allows PublicKey to be part of the PublicKeyInitData union, because people have gotten used to doing this:

// Like, I've actually seen this.
new PublicKey(new PublicKey(...));

Right now that works because PublicKey just happens to conform to {_bn: BN} because, damnit, _bn was never marked private.
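
One way the constructor could accept an existing key explicitly, rather than by accident of shape, is sketched below. `KeySketch` and `KeyInit` are hypothetical stand-ins for `PublicKey` and `PublicKeyInitData`, and the real union has more members than shown here.

```typescript
// Hypothetical stand-ins for PublicKey and PublicKeyInitData; not the library's real types.
type KeyInit = bigint | KeySketch;

class KeySketch {
  private readonly value: bigint;

  constructor(init: KeyInit) {
    if (init instanceof KeySketch) {
      // Explicit branch for "a key made from a key", instead of relying on
      // the argument happening to expose a `_bn` property.
      this.value = init.value;
    } else {
      this.value = init;
    }
  }

  toBigInt(): bigint {
    return this.value;
  }
}

const inner = new KeySketch(BigInt(7));
const outer = new KeySketch(inner); // mirrors `new PublicKey(new PublicKey(...))`
console.log(outer.toBigInt() === inner.toBigInt());
```

Note that `instanceof` only recognizes instances created by the same copy of the class, so a key produced by a second bundled copy of the library would not take the first branch; a real implementation would likely need a more permissive check for that case.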

Comment on lines +77 to +81
BigInt(
'115792089237316195423570985008687907853269984665640564039457584007913129639936',
)
Contributor

nit: how about making a const for this, i.e. PUBLIC_KEY_MAXIMUM_VALUE, and making it this number - 1?

Contributor Author

I generally avoid constants, because they make people have to hop around the code. I avoid them doubly when there's literally only one place they're needed.

wrt the - 1 idea, I also generally don't want to make the runtime do math when it doesn't have to.

Contributor

@joncinque joncinque Sep 1, 2022

Sorry, I meant to define it right there, so no hopping, but it's no big deal. And for -1, I meant encoding the string as 11579...9935 instead, so it would really mean PUBLIC_KEY_MAXIMUM_VALUE
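
For reference, the literal in the diff is 2^256, one more than the largest value that fits in 32 bytes. The sketch below checks that and shows what a named maximum would look like. `PUBLIC_KEY_MAXIMUM_VALUE` is the name proposed in this thread; `TWO_POW_256` and `fitsIn32Bytes` are hypothetical. It computes the constant at runtime purely for illustration, which is exactly the arithmetic the author says he would rather avoid by writing the literal out.

```typescript
// The literal from the diff under review.
const literalFromDiff = BigInt(
  '115792089237316195423570985008687907853269984665640564039457584007913129639936',
);

// 2^256, built with a shift (hypothetical name, for illustration).
const TWO_POW_256 = BigInt(1) << BigInt(256);

// The largest value representable in 32 bytes, as suggested in the review.
const PUBLIC_KEY_MAXIMUM_VALUE = TWO_POW_256 - BigInt(1);

// Hypothetical range check using the inclusive maximum.
function fitsIn32Bytes(value: bigint): boolean {
  return value >= BigInt(0) && value <= PUBLIC_KEY_MAXIMUM_VALUE;
}

console.log(literalFromDiff === TWO_POW_256);
console.log(fitsIn32Bytes(PUBLIC_KEY_MAXIMUM_VALUE), fitsIn32Bytes(TWO_POW_256));
```

With the exclusive bound (2^256) the check reads `value < bound`; with the inclusive maximum it reads `value <= PUBLIC_KEY_MAXIMUM_VALUE`. The two are equivalent, so the choice is about naming rather than behavior.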

@steveluscher
Contributor Author

This PR breaks Connection::getTokenSupply somehow, so I guess it has a ways to go still.

@github-actions github-actions bot added the web3.js Related to the JavaScript client label Dec 22, 2022
Contributor

@joncinque joncinque left a comment

I hope you don't mind the force-push to rebase, but otherwise it's just the little change that I've outlined, no rush at all on this.

Comment on lines +123 to +129
if (typeof publicKey._bn === 'bigint') {
return this._bn === publicKey._bn;
} else {
// If it's not a bigint, the other type comes from an old library, so we
// just check the underlying buffer representation.
return this.toBuffer().equals(publicKey.toBuffer());
}
Contributor

This fixes the issue with the tests! Since we were getting an old PublicKey from spl-token, this needs to be more permissive with the checks. Uint8Array doesn't have an equality check, so this seemed like the safest way to check equality between BN and bigint, but could also future proof us in case we change the underlying representation again
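
The fallback described above can be sketched in isolation. The PR compares Node `Buffer`s with `.equals()`; the version below uses a plain byte loop over `Uint8Array` so it does not depend on Node. `KeyLike`, `bytesEqual`, and `keysEqual` are hypothetical names, and the two sample keys are hand-built objects rather than real `PublicKey` instances.

```typescript
// Hypothetical minimal shape shared by old (BN-backed) and new (bigint-backed) keys.
interface KeyLike {
  _bn: unknown;
  toBytes(): Uint8Array;
}

// Uint8Array has no built-in equality method, so compare element by element.
function bytesEqual(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

function keysEqual(a: KeyLike, b: KeyLike): boolean {
  if (typeof a._bn === 'bigint' && typeof b._bn === 'bigint') {
    // Fast path: both keys use the new storage.
    return a._bn === b._bn;
  }
  // Otherwise at least one key comes from an older copy of the library,
  // so fall back to comparing the serialized bytes.
  return bytesEqual(a.toBytes(), b.toBytes());
}

// The same 32-byte key, little-endian, value 0x0201.
const bytes = new Uint8Array(32);
bytes[0] = 0x01;
bytes[1] = 0x02;

const newStyleKey: KeyLike = { _bn: BigInt(513), toBytes: () => bytes };
// Stand-in for a key whose `_bn` is a BN object from an older library copy.
const legacyStyleKey: KeyLike = { _bn: {}, toBytes: () => Uint8Array.from(bytes) };

console.log(keysEqual(newStyleKey, legacyStyleKey));
```

Because the fallback depends only on the serialized bytes, it would keep working if the internal storage changed again.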

@steveluscher
Contributor Author

Thanks for the fix @joncinque! I'd love to smash this in. Do we need to do a little bit of a think on this, given that borsh-js ultimately rejected my PR, meaning that there's no path forward to mainline the bigint change into Borsh without making our own fork?

@steveluscher
Contributor Author

Migrated to solana-labs/solana-web3.js#1142.

@steveluscher steveluscher deleted the lighter-borsh branch February 2, 2023 00:06