[blockchain-v4]: Harden and optimize utxo cache. #3006

davecgh · 2022-10-10T17:27:46Z

This requires #3005.

This is a backport of #2995 to the v4 blockchain release branch. It has been modified slightly to avoid modifying the public API of the blockchain module so a major module version bump is not required.

This makes some minor and non-functional misc changes to improve code consistency.

This modifies the code that tracks the in-flight transactions when determining the input utxos that need to be loaded to pre-allocate the map to avoid some additional overhead associated with growing the map multiple times when a lot of transactions are involved.

This removes the error return from the UtxoCache.addEntry method since it does not return any errors.

This modifies the utxo cache tests to ensure the overridden MaybeFlush method is called for all utxo cache methods as opposed to only when accessed via the UtxoCacher interface. Perhaps most notably, the overridden method is now called when the cache is being initialized which allows its logic to be better tested. It accomplishes this by adding a new closure to the UtxoCache struct that defaults to the exported UtxoCache.MaybeAccept, updates all internal calls to make use of it, and then overrides it when creating the test harness.

Currently, the utxo cache initialize tests don't actually test that the initialize code itself flushes the expected data to the backend rather they do a flush from the tests themselves using the best chain tip which may or may not match whatever initialize actually ordinarily conditionally flushed. This modifies the utxo cache initialize tests force flushing internally when the cache is being initialized accordingly to better ensure it is flushing the expected data.

This corrects an extremely hard to hit corner case involving a mix of manual block invalidation, conditional flushing, and successive unclean shutdowns and adds a test to ensure proper behavior. Specifically, the recovery code from an unclean shutdown that unwinds the cache potentially periodically flushes the state to the backend was incorrectly using the block being disconnected as opposed to its parent which is now the best tip. In practice, this is exceedingly difficult to hit without intentionally trying because it requires a mix of highly unlikely events and manual invalidation: - Two unclean shutdowns at precise inopportune times - Manual invalidation of blocks by the operator such that blocks are only disconnected without reconnecting a side chain - The recovery taking long enough or involving enough entries to trigger a conditional flush, again, at exactly the right time prior to a second unclean shutdown

This modifies the code that deal with fetching input utxos to avoid needlessly looking up inputs for treasurybase and treasury spend transactions since they are both similar to coinbase transactions in that they do not reference any real previous outputs and therefore are guaranteed to result in a cache miss.

This updates the code when connecting transactions in a utxo view to check for treasurybases prior to the coinbase since a treasurybase is also identified as a coinbase, but unlike coinbases, treasurybase outputs are never spendable and thus should not be added to the utxo view.

This simplifies the various utxo cache tests and improves their legibility by consolidating the utxo test entry state creation logic and updating the tests to make use of it accordingly. In particular, it introduces a struct to house entries with the various states along with a method to create them.

This reworks the utxo cache spend entry tests to simplify them and make them more flexible by moving the primary logic for all of the expected results into the test definitions themselves as oppossed to special casing depending on flags in the test body and also adds an additional test case. It also updates the test cache creation method to allow nil entries in the initial cache.

This reworks the utxo cache commit tests to simplify them a bit, make them more flexible, and test a lot more conditions. It accomplishes this by introducing an expected error so the tests themselves can specify the expected result as opposed to assuming it should be nil. It also clones all of the view entries to ensure none of the shared entries are modified when the cache takes ownership of them during the tests. Finally, it improves the robustness of Commit by correcting the logic regarding unmodified spent and unspent fresh view entries discovered while adding the tests since it was technically incorrect even though the relevant cases are never hit in practice in the current implementation. This helps defend against code changes that could easily inadvertently violate the non-obvious, and heretofore undocumented, assumptions that avoided hitting the incorrect behavior.

This reworks the utxo cache add entry tests to simplify them and make them more flexible by moving the primary logic for all of the expected results into the test definitions themselves as oppossed to special casing depending on flags in the test body.

The code prior to this change clones entries from the cache directly into a view which means any cache entries that are marked modified and/or fresh end up in the view with the flags set. This leads to slightly less efficient code and several non-obvious assumptions that are easy to violate when making updates to views or the cache. For example, modified cache entries end up in the view marked as modified which means that even if they aren't actually modified, once the view is committed, those entries end up needlessly updating the cache with the new entry that is actually the exact same as the existing one. Another example is that cache entries that are marked fresh have to retain that flag in the view in order to ensure they remain fresh in the cache once that entry is replaced per the previous point. It is perhaps worth noting that, in practice, there is not typically a lot extra work due to the modified entries since almost all utxos loaded into a view from the cache are ultimately modified in some way which necessarily marks them modified anyway. However, it is still technically incorrect for them to be marked modified before they are actually modified and does lead to some extra work that is not necessary. To address the aforementioned, this modifies the utxo cache to ensure the cache state and view states are separate by updating the addEntry and fetchEntry methods to ensure the modified and spent flags are cleared on cloned entries loaded from the cache and and always set correctly when adding entries to the cache. The result is slightly more efficient code and removal of potential footguns due to non-obvious assumptions about the internals of the view and cache logic.

This modifies the utxo cache commit logic to ensure more robust handling of spent outputs including the addition of a double spend assertion and adds tests to ensure the new behavior works as intended. In particular, it updates the spendEntry method to attempt to load any entries that are not already in the cache from backend which is necessary since lack of an entry does not always guarantee the entry is not in the backend in the case of a hard to hit corner case involving a mix of reorganizations and unclean shutdowns. It also adds a new double spend assertion while here since it's good practice to be paranoid and double check assumptions when it comes to potential double spends. While it is certainly possible to tackle the specific aforementioned case in other ways, this approach was chosen because it is an overall robust solution that ensures proper behavior regardless of other independent assumptions that might be introduced in future code updates.

This splits the logic for handling the connection of transactions from the regular tree and stake tree to a utxo view by introducing individual methods for them and updating the callers accordingly. The primary motivation for this change is that the regular and stake transaction trees do not have identical logic in terms of the ability to spend outputs earlier in the block and it is more efficient to avoid continuously checking for special conditions across all transactions when only certain types can possibly be in each respective tree. Splitting this logic also paves the way for future commits to take advantage of the aforementioned spending semantics to optimize the utxo cache.

This updates the utxo view and cache handling to detect zero confirmation spends in the regular transaction tree (outputs that are spent by the inputs of transactions later in the same tree) and bypass the cache when committing the view. This is desirable because such outputs never exist beyond the scope of the view and therefore are guaranteed to never have any cache entries. In other words, any attempts to spend them in cache will necessarily result in a cache miss. In order to implement the logic, a new state bit flag to track zero confirmation spends is introduced along with code to appropriately set and clear the flag when connecting and disconnection transactions. The bit flag is then used to bypass the cache spend when set. Finally, cache commit tests are included to ensure proper functionality.

davecgh added 17 commits October 10, 2022 11:12

[blockchain-v4] blockchain: Address some linter complaints.

93fb5da

[blockchain-v4] blockchain: Misc consistency cleanup pass.

f031538

This makes some minor and non-functional misc changes to improve code consistency.

[blockchain-v4] blockchain: Remove unused utxo cache add entry err.

fd8c445

This removes the error return from the UtxoCache.addEntry method since it does not return any errors.

davecgh added this to the 1.7.5 milestone Oct 10, 2022

davecgh changed the title ~~Blockchainv4 utxocache coherence~~ [blockchain-v4]: Harden and optimize utxo cache. Oct 10, 2022

jrick approved these changes Oct 10, 2022

View reviewed changes

davecgh merged commit 8208daf into decred:blockchain_v4 Oct 10, 2022

davecgh deleted the blockchainv4_utxocache_coherence branch October 10, 2022 17:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[blockchain-v4]: Harden and optimize utxo cache. #3006

[blockchain-v4]: Harden and optimize utxo cache. #3006

davecgh commented Oct 10, 2022

[blockchain-v4]: Harden and optimize utxo cache. #3006

[blockchain-v4]: Harden and optimize utxo cache. #3006

Conversation

davecgh commented Oct 10, 2022