-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
blockchain, integration, mining, main: Rolling merkle root calculation #1979
Conversation
Pull Request Test Coverage Report for Build 5612329239
💛 - Coveralls |
fd2f936
to
a83bd33
Compare
There was still some Utreexo specific stuff that made it calculate wrong roots. I fixed it in the latest force push. Still the same results from the benchmarks so the benchstat results are valid. EDIT: Later push was for witness merkle root calculation. |
f6c011a
to
99e6b94
Compare
Concept ACK, great work building on the other optimization with some of the utreexo derived fine tuning! |
99e6b94
to
dd13f63
Compare
Ready for reviews now! |
dd13f63
to
80a45b0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dope🤙 Excited to see this performance gain! Left some comments, mostly about the new algo.
blockchain/rolling_merkle.go
Outdated
) | ||
|
||
// parentHash returns the hash of the left and right hashes passed in. | ||
func parentHash(l, r chainhash.Hash) chainhash.Hash { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is the same as HashMerkleBranches
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah you're right, it serves the same functionality.
Though the parentHash
function is faster because there's no allocation being done vs HashMerkleBranches
which does a single allocation. I've tested and verified that this is because HashMerkleBranches
is returning a pointer.
Would it be ok if I were to change the HashMerkleBranches
function to return a chainhash.Hash
instead of a pointer? Then parentHash
and HashMerkleBranches
would be equal.
BenchmarkHashMerkleBranches-10 2279146 526.5 ns/op 32 B/op 1 allocs/op
BenchmarkParentHash-10 2392340 497.1 ns/op 0 B/op 0 allocs/op
I'll modify the
Code used for the above benchmark:
func BenchmarkHashMerkleBranches(b *testing.B) {
var aHash, bHash chainhash.Hash
for i := 0; i < b.N; i++ {
HashMerkleBranches(&aHash, &bHash)
}
}
func BenchmarkParentHash(b *testing.B) {
var aHash, bHash chainhash.Hash
for i := 0; i < b.N; i++ {
parentHash(aHash, bHash)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah why not🤓 Guess pointer stuff is tricky as it puts data on stack, and the performance gain is better in all aspects so I don't see why we shouldn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a commit to change the HashMerkleBranches
to return a chainhash.Hash
. Removed parentHash
.
// allocated based on the passed in size. | ||
func newRollingMerkleTreeStore(size int) rollingMerkleTreeStore { | ||
var alloc int | ||
if size != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the size be 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can't be 0 since this is only called in CalcMerkleRoot
and the passed in size is len(transactions)
in CalcMerkleRoot
.
I put this check here since if the code were to be called elsewhere with 0, there'd be an overflow. It's not too bad since you'd be allocating 64
for []chainhash.Hash
but I thought it'd be better not to allow for that.
I can remove the branch if need be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I like this check. Asked that question to gain more understanding and possibly seek ways to make this more defensive. Since this won't result in any error state I think it's ok. But maybe we could add an info
or warning
log here since the performance won't be as good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a comment on the function in the latest push.
I think it'd be sorta weird to tell the end user why an unexported function is throwing a warning or an info in the logs.
blockchain/rolling_merkle.go
Outdated
} | ||
|
||
// Add on the last tx again if there's an odd number of txs. | ||
if len(adds) > 0 && len(adds)&1 == 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder what's the difference between 2&2 == 1
and 2%2 != 0
performance-wise🧐
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like bitcompare is slightly faster. It's almost the same. Is the modulo more readable? I could change it here if so.
Benchstat results:
goos: linux
goarch: amd64
pkg: github.com/btcsuite/btcd/blockchain
cpu: AMD Ryzen 5 3600 6-Core Processor
│ modulo.txt │ bitcompare.txt │
│ sec/op │ sec/op vs base │
Odd-10 2.478µ ± 3% 2.464µ ± 2% -0.59% (p=0.041 n=10)
Code used:
+var varbool bool
+
+func BenchmarkOdd(b *testing.B) {
+ for i := 0; i < b.N; i++ {
+ benchmarkOdd(10_000)
+ }
+}
+
+func benchmarkOdd(n int) {
+ for i := 0; i < n; i++ {
+ // varbool = 2&1 == 1
+ varbool = 2%2 != 0
+ }
+}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool thanks for the test! If the performance is the same I'd say we always prefer the more readable approach. Non-blocking tho.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to %2 == 2
in the latest push
blockchain/rolling_merkle.go
Outdated
|
||
// newRollingMerkleTreeStore returns a rollingMerkleTreeStore with the roots | ||
// allocated based on the passed in size. | ||
func newRollingMerkleTreeStore(size int) rollingMerkleTreeStore { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use uint64
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in the latest push
blockchain/merkle_test.go
Outdated
merkleStoreTree := BuildMerkleTreeStore(block.Transactions(), false) | ||
merkleStoreRoot := merkleStoreTree[len(merkleStoreTree)-1] | ||
|
||
if calcMerkleRoot != *merkleStoreRoot { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use require.Equal
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in the latest push
55da398
to
f4eedd1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I really like this PR. I do think there are certain API changes that are necessary for the RMTS to be broadly useful, rather that solely tailored to the block tree verification process. As I understand it, this PR is ultimately in service of this which is for optimizing sync times, so I am well aware that API ergonomics may not have been an original design goal. I do think that we can take this opportunity to have a broadly useful piece of machinery, though.
The specific changes I am requesting are to clean up the calcMerkleRoot
function so that it doesn't have special provisioning for Bitcoin specific structures and correspondingly allows you to clean up the internal implementation logic and simplifying redundant checks. Everything else I've left comments wise is non-blocking or adds to the documentation to help future readers of this code.
I strongly disagree making this function more broadly useful as the merkle tree function used for calculating the tx commitment is flawed and has a vulnerability (CVE-2012-2459). It's possible to create a collision using duplicate leaves and it's described in detail at: https://github.com/bitcoin/bitcoin/blob/79e8247ddb166f9b980f40249b7372a502402a4d/src/consensus/merkle.cpp#L8-L41 If anyone wants to use the following code for merkle tree stuff, they should just use https://github.com/utreexo/utreexo as it's essentially the same. |
f4eedd1
to
a130abe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed a few things. Left comments on comment changes.
a130abe
to
a450653
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest round looks good🎉 Left some comments re style nits and tests. Think once addressed we are good to go🏖
blockchain/rolling_merkle.go
Outdated
) | ||
|
||
// parentHash returns the hash of the left and right hashes passed in. | ||
func parentHash(l, r chainhash.Hash) chainhash.Hash { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah why not🤓 Guess pointer stuff is tricky as it puts data on stack, and the performance gain is better in all aspects so I don't see why we shouldn't.
blockchain/rolling_merkle.go
Outdated
type rollingMerkleTreeStore struct { | ||
// roots are where the temporary merkle roots get stored while the | ||
// merkle root is being calculated. Every root has 2^n leaves and the | ||
// tallest tree is furthest to the left and the shortest tree is furthest to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: line too long, happens at some other places too. Guess we don't enforce the 80 column-size limits in btcd
yet so this is non-blocking. Will submit a PR to enforce it later.
// allocated based on the passed in size. | ||
func newRollingMerkleTreeStore(size int) rollingMerkleTreeStore { | ||
var alloc int | ||
if size != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I like this check. Asked that question to gain more understanding and possibly seek ways to make this more defensive. Since this won't result in any error state I think it's ok. But maybe we could add an info
or warning
log here since the performance won't be as good?
s.add(leaf) | ||
} | ||
|
||
require.Equal(t, s.roots, test.expectedRoots) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets also check s.numLeaves
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added check for s.numLeaves
in the latest push
{0x00}, | ||
}, | ||
}, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice to add two more cases, one with 2 leaves and one with 4 leaves. It helps with understanding, also shows how the roots are "rolling".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the test cases in the latest push
blockchain/rolling_merkle.go
Outdated
} | ||
|
||
// Add on the last tx again if there's an odd number of txs. | ||
if len(adds) > 0 && len(adds)&1 == 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool thanks for the test! If the performance is the same I'd say we always prefer the more readable approach. Non-blocking tho.
mining/mining.go
Outdated
@@ -804,6 +804,38 @@ mempoolLoop: | |||
var witnessCommitment []byte | |||
if witnessIncluded { | |||
witnessCommitment = AddWitnessCommitment(coinbaseTx, blockTxns) | |||
// The witness of the coinbase transaction MUST be exactly 32-bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we make the whole logical branch into its own method to stop growing the already very long method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that seemed like it was refactored away to a separate function a while ago. It's just bad git rebasing on my end.
Got rid of the entire branch as it's a duplicate of AddWitnessCommitment
BuildMerkleTreeStore used to return a pointer, but it is changed to return a chainhash.Hash directly. This allows the compiler to make optimizations in some cases and avoids a memory allocation.
RollingMerkleTree is a much more memory efficient way of calculating the merkle root of a tx commitment inside the bitcoin block header. The current way of calculating the merkle root allocates 2*N elements. With the RollingMerkleTree, we are able to reduce the memory allocated to log2(N). This results in significant memory savings (99.9% in an average block), allowing for a faster block verification.
CalcMerkleRoot uses the rolling merkle root algorithm to calculate the merkle root commitment inside the Bitcoin block header. It allocates significantly less memory than the BuildMerkleTreeStore function that's currently in use (99.9% in an average block with 2000 txs).
…ore with CalcMerkleRoot
a450653
to
ecfbb7e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM⛵️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🌭
Completed sync tests on mainnet+testnet as a final sanity check! |
I saw #1421 and noticed that it was slower than the current function for calculating merkle roots. Just looking to see if the direction I took here is ok.
I replaced everything in
rolling_merkle.go
with Utreexo structs/functions. The comments and struct names should probably change. FWIW, the main functionadd
is using the same algorithm as the one found in blake3 hash.Here's the benchstat results. It's about the same/slightly faster than the current merkle root function and allocates even less memory than the function in #1421. The added code is a lot smaller as well.