Make BlockChain<T>.GetStates() returns early for nonexistent addresses #197

Merged

Conversation

dahlia
Contributor

@dahlia dahlia commented Apr 12, 2019

This patch completely fixes bug #189. In addition to patch #192, it handles nonexistent addresses.

This involves a mask-based index that approximates a set of Addresses. The key idea is that once an address has been used and included in the mask, the mask is guaranteed to have non-zero bits (i.e., trues) at every position where the address has non-zero bits. You can think of it as a kind of Bloom filter without hash functions. The point is that although a mask cannot confirm that an address is in it, it can confirm that an address is definitely not in it.
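The idea above can be sketched roughly as follows. This is a minimal illustration, not the code in this patch: the class name `AddressMaskSketch`, the 20-byte `Size`, and the use of plain `byte[]` in place of `Address` are all assumptions made for the example.

```csharp
using System;

// Hypothetical helper for illustration only; not Libplanet's API.
// Addresses and masks are modeled as plain byte arrays of equal length.
public static class AddressMaskSketch
{
    // Assumed address width in bytes for this sketch.
    public const int Size = 20;

    // Returns a new mask that also covers the given address,
    // by OR-ing the address into the existing mask byte by byte.
    public static byte[] Include(byte[] mask, byte[] address)
    {
        var result = new byte[Size];
        for (int i = 0; i < Size; i++)
        {
            result[i] = (byte)(mask[i] | address[i]);
        }

        return result;
    }

    // false: the address was definitely never included in the mask.
    // true: the address was possibly included (false positives happen).
    public static bool IsPossiblyIn(byte[] address, byte[] mask)
    {
        for (int i = 0; i < Size; i++)
        {
            // A bit set in the address but clear in the mask rules it out.
            if ((address[i] & ~mask[i]) != 0)
            {
                return false;
            }
        }

        return true;
    }
}
```

For instance, including one address whose first byte is 0x01 and another whose first byte is 0x02 yields a mask whose first byte is 0x03. An address starting with 0x04 is then rejected outright, while a never-seen address starting with 0x03 still passes, which is the false-positive case that a full lookup has to resolve.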

To store a mask for each block, I added a new parameter, Address addressesMask, to the IStore.PutBlock<T>() method, and a new method named IStore.GetAddressesMask() to query it. I wrote XML comments on these methods; please read them for API details.

Although I added the mask as an ad-hoc field of IStore at this stage, I suggest putting this information into the Block<T> type in the future so that the mask is also validated by the block's cryptographic hash. (A more detailed proposal is in the commit message of e01da78.)

Note: Because I added the mask as an ad-hoc field separate from the
Block type, blocks must be stored in order unless the addresses masks
corresponding to these blocks are available to put in along with them.
(A mask for a block needs to be built on top of the mask for its
previous block.)

For example, if we want to retrieve a large number of blocks
out of order, we need to either serialize the block insertions,
from the genesis block to the topmost block, or retrieve each
block paired with its mask.  Even if we retrieve blocks along with
their masks, as masks are neither signed nor hashed with a nonce,
we need to implement extra validations for masks.  This not only
makes dealing with blocks complex and painful, but also makes
these jobs bug-prone.

So I suggest putting the mask field into the Block type,
so that it is fully integrated into the blockchain and
everything is validated along with it, in a consistent way.
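
To make the ordering constraint in the note above concrete, here is a small in-memory sketch. It is not how IStore or BlockSet<T> is implemented: `MaskChainSketch`, the string block ids, and the 20-byte width are hypothetical stand-ins chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory sketch; Libplanet's IStore has a different shape.
public sealed class MaskChainSketch
{
    private const int Size = 20; // assumed address width in bytes

    // Cumulative mask per block, keyed by an illustrative block id.
    private readonly Dictionary<string, byte[]> _masks =
        new Dictionary<string, byte[]>();

    // Stores and returns the mask of a block, built on top of its
    // parent's mask. Pass null as parentId for the genesis block.
    public byte[] PutBlock(
        string blockId,
        string parentId,
        IEnumerable<byte[]> updatedAddresses)
    {
        var mask = new byte[Size];
        if (parentId != null)
        {
            // Without the parent's mask there is nothing to build on,
            // which is why blocks cannot be inserted out of order.
            if (!_masks.TryGetValue(parentId, out byte[] parentMask))
            {
                throw new InvalidOperationException(
                    "The parent block's mask must be stored first.");
            }

            Array.Copy(parentMask, mask, Size);
        }

        foreach (byte[] address in updatedAddresses)
        {
            for (int i = 0; i < Size; i++)
            {
                mask[i] |= address[i];
            }
        }

        _masks[blockId] = mask;
        return mask;
    }
}
```

Inserting a child before its parent fails here, whereas a mask carried inside the block itself would arrive together with the block and be covered by the block hash.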
@dahlia dahlia added the bug ("Something isn't working") label Apr 12, 2019
@dahlia dahlia self-assigned this Apr 12, 2019
@codecov

codecov bot commented Apr 12, 2019

Codecov Report

Merging #197 into master will decrease coverage by 2.97%.
The diff coverage is 93.75%.

@@            Coverage Diff             @@
##           master     #197      +/-   ##
==========================================
- Coverage   87.42%   84.44%   -2.98%     
==========================================
  Files          72       72              
  Lines        3292     3350      +58     
==========================================
- Hits         2878     2829      -49     
- Misses        414      521     +107
Impacted Files                         Coverage Δ
Libplanet/Store/BaseStore.cs           100% <ø> (ø) ⬆️
Libplanet/Store/FileStore.cs           94.38% <100%> (+0.41%) ⬆️
Libplanet/Blockchain/BlockChain.cs     99.21% <100%> (+0.04%) ⬆️
Libplanet/Store/BlockSet.cs            90.74% <85.18%> (-5.93%) ⬇️
Libplanet/Net/IceServer.cs             0% <0%> (-100%) ⬇️
Libplanet/Net/IceServerException.cs    0% <0%> (-100%) ⬇️
Libplanet/Net/NetworkStreamProxy.cs    0% <0%> (-80.77%) ⬇️
Libplanet/Net/Swarm.cs                 82.55% <0%> (-5.48%) ⬇️

@dahlia dahlia changed the title from "Get states determine nonexistece early" to "Make BlockChain<T>.GetStates() returns early for nonexistent addresses" Apr 12, 2019
: Enumerable.Repeat((byte)0xff, Address.Size).ToArray()
);

Func<BitArray, BitArray, bool> isPossibilyIn = (addr, maskPat) =>
Contributor

Suggested change
- Func<BitArray, BitArray, bool> isPossibilyIn = (addr, maskPat) =>
+ Func<BitArray, BitArray, bool> isPossiblyIn = (addr, maskPat) =>

Contributor Author

Addressed.

if (requestedAddresses.SetEquals(states.Keys))
// A set of addresses that are definitely not existent.
ImmutableHashSet<Address> nonexistents = requestedAddresses
.Where(pair => !isPossibilyIn(pair.Value, mask))
Contributor

Suggested change
- .Where(pair => !isPossibilyIn(pair.Value, mask))
+ .Where(pair => !isPossiblyIn(pair.Value, mask))

Contributor Author

Addressed.

offset is HashDigest<SHA256> hash &&
Store.GetAddressesMask(hash) is Address a
? a.ToByteArray()
: Enumerable.Repeat((byte)0xff, Address.Size).ToArray()
Member

I think this code can be precomputed, right?

Contributor Author

Oh no, it definitely is a bug. I'm going to address this.

Contributor Author

Fixed this and amended the commit.

: Enumerable.Repeat((byte)0xff, Address.Size).ToArray()
);

Func<BitArray, BitArray, bool> isPossibilyIn = (addr, maskPat) =>
Member

@longfin longfin Apr 15, 2019

I think we can use a local function.

Suggested change
- Func<BitArray, BitArray, bool> isPossibilyIn = (addr, maskPat) =>
+ bool IsPossiblyIn(BitArray addr, BitArray maskPat)

Contributor Author

Addressed.
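
As a rough illustration of the suggestion in this thread, the check could be written as a local function along these lines. The enclosing `LocalFunctionSketch.Check` method and its `byte[]` parameters are assumptions for the example; only the `IsPossiblyIn(BitArray addr, BitArray maskPat)` signature comes from the suggestion, and the body is a guess at the intended logic, assuming both arrays have the same length.

```csharp
using System.Collections;

// Hypothetical wrapper for illustration; not the code in this patch.
public static class LocalFunctionSketch
{
    // Assumes address and mask have the same length.
    public static bool Check(byte[] address, byte[] mask)
    {
        var addrBits = new BitArray(address);
        var maskBits = new BitArray(mask);
        return IsPossiblyIn(addrBits, maskBits);

        // A local function may be declared after its use, and calling it
        // directly does not require creating a Func<> delegate instance.
        bool IsPossiblyIn(BitArray addr, BitArray maskPat)
        {
            for (int i = 0; i < addr.Length; i++)
            {
                // A bit set in the address but not in the mask rules it out.
                if (addr[i] && !maskPat[i])
                {
                    return false;
                }
            }

            return true;
        }
    }
}
```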

@@ -49,21 +51,57 @@ public override ICollection<Block<T>> Values
throw new KeyNotFoundException();
}

- Trace.Assert(block.Hash == key);
+ Trace.Assert(block.Hash.Equals(key));
Member

Should we prefer Equals() instead of ==?

Contributor Author

Rider warned me that HashDigest<SHA256> hadn't implemented this operator, so I made this change. AFAIK this operator is implemented through Uno.CodeGen though.
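
For background on why an IDE might raise this warning: C# does not provide `==` for an ordinary user-defined struct unless the operator is declared explicitly (or generated by a tool), whereas `Equals()` is always available. The sketch below uses a made-up `DigestSketch` type, not `HashDigest<SHA256>`, and does not reflect how Uno.CodeGen generates code.

```csharp
using System;

// Hypothetical value type for illustration; not HashDigest<SHA256>.
public readonly struct DigestSketch : IEquatable<DigestSketch>
{
    private readonly int _value;

    public DigestSketch(int value) => _value = value;

    // Typed equality, usable without any operator overloads.
    public bool Equals(DigestSketch other) => _value == other._value;

    public override bool Equals(object obj) =>
        obj is DigestSketch other && Equals(other);

    public override int GetHashCode() => _value;

    // No operator == is declared here, so `a == b` on two DigestSketch
    // values would not compile unless the operator is added by hand
    // or by a code generator.
}
```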

@dahlia dahlia force-pushed the get-states-determine-nonexistece-early branch 2 times, most recently from f71d987 to f928bfd April 15, 2019 06:44
@dahlia
Contributor Author

dahlia commented Apr 15, 2019

@longfin I turned that byte[] entirely filled with 0xff into a constant (kinda, it's static readonly). f928bfd

// (the store made in the older versions of Libplanet
// may lack mask files), make the mask to match to
// all possible addresses.
prevMask = Enumerable.Repeat((byte)0xff, Address.Size)
Member

We can also use WildcardMask here, right?

Contributor Author

WildcardMask is defined in the BlockChain<T> class and it's private. Making BlockSet<T> depend on BlockChain<T> also seems unnatural. So I'm going to define the same constant here too.

It seems we should define a distinct type such as AddressMask in the future.

Contributor Author

Hmm, as BlockChain<T> already depends on BlockSet<T> (though the inverse is not true), I'm going to move the WildcardMask constant to BlockSet<T> and make it internal instead of private.

Contributor Author

Addressed and amended the commit: 2317044.
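
The outcome of this thread can be pictured with the sketch below: an all-0xff mask computed once and shared, used as a fallback when a block has no stored mask. The class name `WildcardMaskSketch`, the `MaskOrWildcard` helper, and the 20-byte `AddressSize` are assumptions for illustration; only the `WildcardMask` name, its `internal` visibility, and the `Enumerable.Repeat` expression come from the discussion.

```csharp
using System.Linq;

// Hypothetical holder class for illustration; the thread places the
// constant in BlockSet<T>.
public static class WildcardMaskSketch
{
    private const int AddressSize = 20; // assumed address width in bytes

    // Computed once instead of on every call. The array itself is still
    // mutable, so callers must treat it as read-only by convention.
    internal static readonly byte[] WildcardMask =
        Enumerable.Repeat((byte)0xff, AddressSize).ToArray();

    // Falls back to the wildcard when no mask was stored for a block
    // (e.g. a store created by an older version), so that every
    // address is treated as possibly present.
    public static byte[] MaskOrWildcard(byte[] storedMask) =>
        storedMask ?? WildcardMask;
}
```

Because every bit of the wildcard is set, no address can be ruled out against it, so lookups simply fall back to the full search.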

@dahlia dahlia force-pushed the get-states-determine-nonexistece-early branch from f928bfd to 2317044 April 15, 2019 06:59
@dahlia
Contributor Author

dahlia commented Apr 15, 2019

@earlbread @longfin Could you review this again?

@dahlia dahlia merged commit d138388 into planetarium:master Apr 15, 2019
limebell pushed a commit to limebell/libplanet that referenced this pull request Jul 7, 2021
…ntiation

Avoid needless PublicKey instantiation (which is time expensive)
OnedgeLee pushed a commit to OnedgeLee/libplanet that referenced this pull request Jan 31, 2023
…netarium#197)

* INTERNAL: update configmap-versions.yaml

* INTERNAL: update kustomization.yaml
Labels
bug Something isn't working

3 participants