Skip to content
This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

fixed broken logs #7934

Merged
merged 6 commits into from
Feb 22, 2018
Merged

fixed broken logs #7934

merged 6 commits into from
Feb 22, 2018

Conversation

debris
Copy link
Collaborator

@debris debris commented Feb 18, 2018

fixes #6637

what caused the bug?

  1. The only place where blockchain's blocks_blooms is updated is in a function blooms_at, line 173. This function updates bloom fetched from db.

https://github.com/paritytech/parity/blob/ff0c44c060e111c6234a47cd1aa8f6411533a202/ethcore/src/blockchain/blockchain.rs#L170-L177

  1. That means, that if there are two consecutive blocks inserted into blockchain, but there was no database transaction flush in the meantime, blooms for new blocks will be created incorrectly. This is caused, by the fact, that our blooms are stored in multiple layers and each layer contain information about blooms from 16 consecutive blocks

  2. Initially I suspected unordered insert to cause invalid log problems, that's why test function is using insert_unordered_block. However, unordered insert is unrelated and log problems occur always when there is no database transaction flush between two consecutive insertions (no matter if ordered or not).

update

I decided to completely remove bloom groups from blockchain database. Block blooms are almost full and merging them is completely redundant.

@debris debris added A0-pleasereview 🤓 Pull request needs code review. B0-patch M4-core ⛓ Core client code / Rust. B7-releasenotes 📜 Changes should be mentioned in the release notes of the next minor version release. labels Feb 18, 2018
@debris debris added this to the 1.10 milestone Feb 18, 2018
Copy link
Collaborator

@tomusdrw tomusdrw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need some help with understanding the entire context.

// These cached values must be updated last with all four locks taken to avoid
// cache decoherence
{
let mut best_block = self.pending_best_block.write();
let mut write_blocks_blooms = self.blocks_blooms.write();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lock order is changed and does not match the order of fields initialization.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like in many other places in this file (although I check and locks never interfere) :) But you are right, I will bring back the old lock order, cause this change is not justified.

BlockLocation::BranchBecomingCanonChain(_) => {
// clear all existing blooms, cause they may be created for block
// number higher than current best block
*write_blocks_blooms = update.blocks_blooms;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we override write_blocks_blooms, but we don't really clear anything from the batch. So we may end up writing some additional data to the database, right? I guess it's not harmful?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a second thought:

Why do we actually clear everything from write_blocks_blooms? It's actually a cache of whatever is in the database, no?

So do I understand correctly that previously:

  1. The cache was reflecting what was in the db (since we always did extend_with_cache)
  2. The problem was that some keys were overwritten instead of accrued (that's what the CanonChain branch is doing)
  3. Currently we may end up clearing the cache, but at least the database should contain correct values?

Could you also elaborate on what is actually stored in update.blocks_blooms (what is used as a key and value)?

Copy link
Collaborator Author

@debris debris Feb 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we override write_blocks_blooms, but we don't really clear anything from the batch. So we may end up writing some additional data to the database, right? I guess it's not harmful?

There is nothing in the batch to be cleared.

Why do we actually clear everything from write_blocks_blooms? It's actually a cache of whatever is in the database, no?

cause new BlockLocation is BranchBecomingCanonChain. Database entries need to be updated and so is cache.

  1. The cache was reflecting what was in the db (since we always did extend_with_cache)

No, it was always empty after executing this function, cause UpdatePolicy was Remove.

https://github.com/paritytech/parity/blob/5b4abec2dbc3f5653b2caf2cc481411a29c7d00b/ethcore/src/db.rs#L127-L131

  1. The problem was that some keys were overwritten instead of accrued (that's what the CanonChain branch is doing)

No. The problem was that there were never in cache and not yet in database, cause noone has written transaction batch to it.

  1. Currently we may end up clearing the cache, but at least the database should contain correct values?

We do not care about database, cause everything is in the cache (as it should be).

@tomusdrw tomusdrw added A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. and removed A0-pleasereview 🤓 Pull request needs code review. labels Feb 19, 2018
@debris debris added A0-pleasereview 🤓 Pull request needs code review. and removed A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. labels Feb 19, 2018
Copy link
Contributor

@andresilva andresilva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's my understanding of this:

  • When there's a reorg we need to rewrite the existing blooms since they might already have data from the forked branch
  • When we're already on the canon chain, we need to first check if there's already a bloom for this key and if there is we update the existing bloom. Otherwise, we create it.

Previously we removed the keys from the cache, now we don't touch the cache at all, is this correct?

@andresilva
Copy link
Contributor

This only fixes the issue going forward, so I think people that already have a borked database will still have this issue. We should remember this PR if we see issues related to missing log data in the future.

@debris debris added A4-gotissues 💥 Pull request is reviewed and has significant issues which must be addressed. and removed A0-pleasereview 🤓 Pull request needs code review. labels Feb 19, 2018
@codecov-io
Copy link

codecov-io commented Feb 19, 2018

Codecov Report

Merging #7934 into master will decrease coverage by 0.05%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7934      +/-   ##
==========================================
- Coverage    76.6%   76.55%   -0.06%     
==========================================
  Files         658      658              
  Lines       93388    92992     -396     
==========================================
- Hits        71542    71186     -356     
+ Misses      21846    21806      -40
Impacted Files Coverage Δ
whisper/src/net/mod.rs 49.25% <0%> (-9.63%) ⬇️
util/rlp_derive/src/de.rs 52.39% <0%> (-6.49%) ⬇️
secret_store/src/key_server_cluster/cluster.rs 70.99% <0%> (-5.98%) ⬇️
whisper/src/net/tests.rs 78.02% <0%> (-5.5%) ⬇️
util/network/src/host.rs 60.57% <0%> (-3.71%) ⬇️
..._server_cluster/client_sessions/signing_session.rs 87.68% <0%> (-2.85%) ⬇️
...tore/src/key_server_cluster/jobs/key_access_job.rs 93.02% <0%> (-2.33%) ⬇️
ethcore/light/src/net/error.rs 15.21% <0%> (-2.18%) ⬇️
ethcore/evm/src/interpreter/memory.rs 55.71% <0%> (-2.15%) ⬇️
util/network/src/session.rs 72.8% <0%> (-2.12%) ⬇️
... and 21 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 01d9bff...b59073b. Read the comment docs.

@debris debris added A0-pleasereview 🤓 Pull request needs code review. and removed A4-gotissues 💥 Pull request is reviewed and has significant issues which must be addressed. labels Feb 21, 2018
@debris
Copy link
Collaborator Author

debris commented Feb 21, 2018

This only fixes the issue going forward, so I think people that already have a borked database will still have this issue. We should remember this PR if we see issues related to missing log data in the future.

Borked database will no longer be a problem

Copy link
Collaborator

@tomusdrw tomusdrw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

.into_iter()
.map(|b| b as BlockNumber)
.filter_map(|number| self.block_hash(number).map(|hash| (number, hash)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could possibly be done concurrently, no?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes!

@@ -351,11 +333,13 @@ impl BlockProvider for BlockChain {

/// Returns numbers of blocks containing given bloom.
fn blocks_with_bloom(&self, bloom: &Bloom, from_block: BlockNumber, to_block: BlockNumber) -> Vec<BlockNumber> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imho would be best to return hashes here, we convert to numbers, then in client we pass numbers to blockchain::logs method that converts to hashes again. Seems pretty wasteful

@tomusdrw tomusdrw added A6-mustntgrumble 💦 Pull request has areas for improvement. The author need not address them before merging. and removed A0-pleasereview 🤓 Pull request needs code review. labels Feb 21, 2018
@tomusdrw
Copy link
Collaborator

On a second thought. Should we have a (optional?) migration that removes block groups from DB? Or maybe even a separate binary that you can run to clean up the db?

Copy link
Contributor

@andresilva andresilva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. We're also using BloomGroup in tracedb so we should probably make sure that everything's correct there. I agree with @tomusdrw that it would be nice to have a migration clean up the database (not sure how hard that is).

@debris
Copy link
Collaborator Author

debris commented Feb 21, 2018

@tomusdrw

On a second thought. Should we have a (optional?) migration that removes block groups from DB? Or maybe even a separate binary that you can run to clean up the db?

Yes, let's clean it up!

@andresilva

LGTM. We're also using BloomGroup in tracedb so we should probably make sure that everything's correct there. I agree with @tomusdrw that it would be nice to have a migration clean up the database (not sure how hard that is).

These are different blooms, but they have the same problem. I'll take care of them after fixing these blooms :)

@5chdn 5chdn merged commit f8a2e53 into master Feb 22, 2018
@5chdn 5chdn deleted the fixed_broken_logs branch February 22, 2018 10:23
@debris debris mentioned this pull request Feb 22, 2018
tomusdrw pushed a commit that referenced this pull request Feb 27, 2018
* fixed broken logs

* bring back old lock order

* removed bloom groups from blockchain

* revert unrelated changes

* simplify blockchain_block_blooms
@tomusdrw tomusdrw mentioned this pull request Feb 27, 2018
10 tasks
5chdn pushed a commit that referenced this pull request Feb 28, 2018
* Hardware-wallet/usb-subscribe-refactor (#7860)

* Hardware-wallet fix

* More fine-grained initilization of callbacks by vendorID, productID and usb class
* Each device manufacturer gets a seperate handle thread each
* Replaced "dummy for loop" with a delay to wait for the device to boot-up properly
* Haven't been very carefully with checking dependencies cycles etc
* Inline comments explaining where shortcuts have been taken
* Need to test this on Windows machine and with Ledger (both models)

Signed-off-by: niklasad1 <niklasadolfsson1@gmail.com>

* Validate product_id of detected ledger devices

* closed_device => unlocked_device

* address comments

* add target in debug

* Address feedback

* Remove thread joining in HardwareWalletManager
* Remove thread handlers in HardwareWalletManager because this makes them unused

* fixed broken logs (#7934)

* fixed broken logs

* bring back old lock order

* removed bloom groups from blockchain

* revert unrelated changes

* simplify blockchain_block_blooms

* Bump WS (#7952)

* Calculate proper keccak256/sha3 using parity. (#7953)

* Increase max download limit to 128MB (#7965)

* fetch: increase max download limit to 64MB

* parity: increase download size limit for updater service

* Detect too large packets in snapshot sync. (#7977)

* fix traces, removed bloomchain crate, closes #7228, closes #7167 (#7979)

* Remvoe generator.rs

* Make block generator easier to use (#7888)

* Make block generator easier to use

* applied review suggestions

* rename BlockMetadata -> BlockOptions

* removed redundant uses of blockchain generator and genereator.next().unwrap() calls
debris added a commit that referenced this pull request Mar 6, 2018
@debris debris mentioned this pull request Mar 6, 2018
andresilva pushed a commit that referenced this pull request Mar 12, 2018
* Revert "fix traces, removed bloomchain crate, closes #7228, closes #7167"

This reverts commit 1bf6203.

* Revert "fixed broken logs (#7934)"

This reverts commit f8a2e53.

* fixed broken logs

* bring back old lock order

* remove migration v13

* revert CURRENT_VERSION to 12 in migration.rs
andresilva pushed a commit that referenced this pull request Mar 16, 2018
* Revert "fix traces, removed bloomchain crate, closes #7228, closes #7167"

This reverts commit 1bf6203.

* Revert "fixed broken logs (#7934)"

This reverts commit f8a2e53.

* fixed broken logs

* bring back old lock order

* remove migration v13

* revert CURRENT_VERSION to 12 in migration.rs
tomusdrw pushed a commit that referenced this pull request Mar 16, 2018
* Revert "fix traces, removed bloomchain crate, closes #7228, closes #7167"

This reverts commit 1bf6203.

* Revert "fixed broken logs (#7934)"

This reverts commit f8a2e53.

* fixed broken logs

* bring back old lock order

* remove migration v13

* revert CURRENT_VERSION to 12 in migration.rs
debris pushed a commit that referenced this pull request Mar 19, 2018
* updater: apply exponential backoff after download failure (#8059)

* updater: apply exponential backoff after download failure

* updater: reset backoff on new release

* Limit incoming connections.  (#8060)

* Limit ingress connections
* Optimized handshakes logging

* Max code size on Kovan (#8067)

* Enable code size limit on kovan

* Fix formatting.

* add some dos protection (#8084)

* more dos protection (#8104)

* Const time comparison (#8113)

* Use `subtle::slices_equal` for constant time comparison.

Also update the existing version of subtle in `ethcrypto` from
0.1 to 0.5

* Test specifically for InvalidPassword error.

* revert removing blooms (#8066)

* Revert "fix traces, removed bloomchain crate, closes #7228, closes #7167"

This reverts commit 1bf6203.

* Revert "fixed broken logs (#7934)"

This reverts commit f8a2e53.

* fixed broken logs

* bring back old lock order

* remove migration v13

* revert CURRENT_VERSION to 12 in migration.rs

* Fix compilation.

* Check one step deeper if we're on release track branches

* add missing pr

* Fix blooms?

* Fix tests compiilation.

* Fix size.
andresilva pushed a commit that referenced this pull request Mar 19, 2018
* Revert "fix traces, removed bloomchain crate, closes #7228, closes #7167"

This reverts commit 1bf6203.

* Revert "fixed broken logs (#7934)"

This reverts commit f8a2e53.

* fixed broken logs

* bring back old lock order

* remove migration v13

* revert CURRENT_VERSION to 12 in migration.rs
tomusdrw pushed a commit that referenced this pull request Mar 19, 2018
* Support parity protocol. (#8035)

* updater: apply exponential backoff after download failure (#8059)

* updater: apply exponential backoff after download failure

* updater: reset backoff on new release

* Max code size on Kovan (#8067)

* Enable code size limit on kovan

* Fix formatting.

* Limit incoming connections.  (#8060)

* Limit ingress connections
* Optimized handshakes logging

* WASM libraries bump (#7970)

* update wasmi, parity-wasm, wasm-utils to latest version

* Update to new wasmi & error handling

* also utilize new stack limiter

* fix typo

* replace dependency url

* Cargo.lock update

* add some dos protection (#8084)

* revert removing blooms (#8066)

* Revert "fix traces, removed bloomchain crate, closes #7228, closes #7167"

This reverts commit 1bf6203.

* Revert "fixed broken logs (#7934)"

This reverts commit f8a2e53.

* fixed broken logs

* bring back old lock order

* remove migration v13

* revert CURRENT_VERSION to 12 in migration.rs

* more dos protection (#8104)

* Const time comparison (#8113)

* Use `subtle::slices_equal` for constant time comparison.

Also update the existing version of subtle in `ethcrypto` from
0.1 to 0.5

* Test specifically for InvalidPassword error.

* fix trace filter returning returning unrelated reward calls, closes #8070 (#8098)

* network: init discovery using healthy nodes (#8061)

* network: init discovery using healthy nodes

* network: fix style grumble

* network: fix typo

* Postpone Kovan hard fork (#8137)

* ethcore: postpone Kovan hard fork

* util: update version fork metadata

* Disable UI by default. (#8105)

* dapps: update parity-ui dependencies (#8160)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A6-mustntgrumble 💦 Pull request has areas for improvement. The author need not address them before merging. B7-releasenotes 📜 Changes should be mentioned in the release notes of the next minor version release. M4-core ⛓ Core client code / Rust.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing entries in eth_getLogs response
5 participants