Skip to content
This repository has been archived by the owner on Sep 26, 2019. It is now read-only.

[NC-1344] Create a simple WorldStateDownloader #657

Merged
merged 25 commits into from
Jan 26, 2019

Conversation

mbaxter
Copy link
Contributor

@mbaxter mbaxter commented Jan 24, 2019

PR description

This PR contains a basic algorithm for downloading world state from the network along with supporting changes. This is just the first step in implementing the WorldStateDownloader. Additional functionality will need to be implemented in subsequent PRs including:

  • A real queue implementation that handles serialization / deserialization of data persisted to disk
  • Cancellation functionality
  • Handling stalled downloads
  • Periodic commits of downloaded world state

@mbaxter mbaxter requested a review from ajsutton January 24, 2019 23:45
Copy link
Contributor

@ajsutton ajsutton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really promising. I suspect most of the things I've noted could be pushed to follow up PRs if you prefer. The only thing really worrying me is exposing StoredNode and using instanceof with it.

nodeData.add(BytesValue.EMPTY);
} else {
worldStateArchive.getNodeData(hash).ifPresent(nodeData::add);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check would probably be better inside WorldStateArchive so any future callers automatically benefit. Sorry, I should have done that in the first place...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this makes more sense in WorldStateArchive. I also added the optimization to the KeyValueStorageWorldStateStorage. Added some tests and cleaned up a few other things along the way, so probably worth looking through this commit: 03018c5

future.complete(null);
} else {
// Send out additional requests
requestNodeData();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like we're sending a single request for data at a time, processing it and then sending another one. I think we probably should be sending multiple requests at a time to speed up the download (with requests spread across multiple peers). Not required for this first PR but probably something to follow up on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requestNodeData has a while loop, so it should send out as many requests as it can whenever it runs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it does. I wonder if it's worth extracting a method for the send single request function to make that while a bit easier to spot (though sometimes there's just no helping fools like me...).

Line 128 above:
if (outstandingRequests.decrementAndGet() == 0 && pendingRequests.isEmpty()) {
will then depend on isEmpty() being thread safe and completely accurate. For the in memory implementation it would be enough to delegate to ConccurrentLinkedQueue.isEmpty() rather than using the separately tracked size variable I think. We know we're the last request to finish so there's nothing else coming in at the same time, but we need to be sure that any pending requests added by the request prior to us have actually been reflected in the pendingRequests properly.

(res, error) -> {
if (outstandingRequests.decrementAndGet() == 0 && pendingRequests.isEmpty()) {
// We're done
worldStateStorageUpdater.commit();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something to keep an eye on - if we want to be able to resume download after being interrupted we'll need to commit to storage more often. Otherwise everything would be lost if Pantheon was terminated while still downloading. Also worth testing how long RocksDB takes to commit if you happen to do it for the entire MainNet world state in one go...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep - i've got a subtask for that

import java.util.Optional;

class StoredNode<V> implements Node<V> {
public class StoredNode<V> implements Node<V> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like an implementation detail that just leaked and really shouldn't have. I can see we wind up doing an instanceof in TrieNodeDataRequest but it's unclear exactly why it's specifically checking for this class or why isLoaded is a meaningful method to call. Maybe Node needs a new method specifically to address this situation like isReferencedByHash?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, meant to clean this up. Added the method you suggested which seems to be the thing we really need.

public static StoredNodeFactory<BytesValue> create() {
return new StoredNodeFactory<>(
(h) -> Optional.empty(), Function.identity(), Function.identity());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fairly misleading - it doesn't really create a very useful StoredNodeFactory. Looks like we really just wanted to split out the decoding functionality to be separate from the retrieve and create functionality.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - this is kind of messy. Created a TrieNodeDecoder helper class instead.


public class InMemoryBigQueue<T> implements BigQueue<T> {
private final AtomicLong size = new AtomicLong(0);
private final Queue<T> internalQueue = new ConcurrentLinkedQueue<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracking these separately means this implementation is not really thread-safe (size and actual requests may be out of sync). size and isEmpty are always somewhat dangerous methods for a queue used concurrently because the state may change before you actually try to act on the result.

I think it's ok at the moment, but if we move to having multiple outstanding requests then we probably need to rethink how we use this and outstandingRequests back in WorldStateDownloader.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the separate size tracking. I think that the way isEmpty is being used in the downloader should be okay because the flow is:

  • send request and increment outstandingRequest count
  • receive response and process it
    • queue additional requests
  • decrement outstandingRequests and check if the queue is empty

Assuming that once requests are queued (enqueue returns), an updated count should be visible to other threads, I think that flow works. Does that seem right to you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as the outstanding request count is incremented before the request is actually sent (which it was from memory) then yes I agree that works.

@@ -29,23 +32,46 @@ public KeyValueStorageWorldStateStorage(final KeyValueStorage keyValueStorage) {
}

@Override
public Optional<BytesValue> getCode(final Hash codeHash) {
return keyValueStorage.get(codeHash);
public Optional<BytesValue> getCode(final Bytes32 codeHash) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these methods use Hash and some use Bytes32 - just changed them all to Bytes32 since that is more generic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also kind of annoying to have to wrap Bytes32 in Hash all over. Kind of wondering if we really need the Hash type ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like Hash providing a very clear type to say this is a hash, not just an arbitrary 32 bytes but I don't think it has to be a hard and fast "every hash must use the Hash type" kind of thing. Particularly in this low level kind of place where the meaning is unambiguous using Bytes32 makes sense to me.

@@ -49,7 +49,7 @@
private final WorldStateStorage worldStateStorage;

public DefaultMutableWorldState(final WorldStateStorage storage) {
this(MerklePatriciaTrie.EMPTY_TRIE_ROOT_HASH, storage);
this(MerklePatriciaTrie.EMPTY_TRIE_NODE_HASH, storage);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's nothing special about a root node, so updated this constant name so that it can be used more generally.

Copy link
Contributor

@ajsutton ajsutton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.


private Optional<BytesValue> getValue(
final Bytes32 hash, final Function<Bytes32, Optional<BytesValue>> getter) {
return getTrieValue(hash, (h) -> getCodeValue(h, getter));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we've wound up applying the check for empty twice in the getNodeData path - once in WorldStateArchive and once here. This is probably the better place, but I actually prefer the more explicit code from WorldStateArchive - I know it essentially duplicates some code but explicitly checking for EMPTY and EMPTY_TRIE_NODE_HASH is just a bit more readable than delegating through optionals.

I'm not sure it matters, but it also avoids creating an extra object instances that this lambda version requires (to create a closure over the getter param).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough - updated

@@ -29,23 +32,46 @@ public KeyValueStorageWorldStateStorage(final KeyValueStorage keyValueStorage) {
}

@Override
public Optional<BytesValue> getCode(final Hash codeHash) {
return keyValueStorage.get(codeHash);
public Optional<BytesValue> getCode(final Bytes32 codeHash) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like Hash providing a very clear type to say this is a hash, not just an arbitrary 32 bytes but I don't think it has to be a hard and fast "every hash must use the Hash type" kind of thing. Particularly in this low level kind of place where the meaning is unambiguous using Bytes32 makes sense to me.

future.complete(null);
} else {
// Send out additional requests
requestNodeData();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it does. I wonder if it's worth extracting a method for the send single request function to make that while a bit easier to spot (though sometimes there's just no helping fools like me...).

Line 128 above:
if (outstandingRequests.decrementAndGet() == 0 && pendingRequests.isEmpty()) {
will then depend on isEmpty() being thread safe and completely accurate. For the in memory implementation it would be enough to delegate to ConccurrentLinkedQueue.isEmpty() rather than using the separately tracked size variable I think. We know we're the last request to finish so there's nothing else coming in at the same time, but we need to be sure that any pending requests added by the request prior to us have actually been reflected in the pendingRequests properly.

@mbaxter mbaxter changed the title Nc 1344/download world state [NC-344] Create a simple WorldStateDownloader Jan 26, 2019
@mbaxter mbaxter changed the title [NC-344] Create a simple WorldStateDownloader [NC-1344] Create a simple WorldStateDownloader Jan 26, 2019
@mbaxter mbaxter merged commit dff436b into PegaSysEng:master Jan 26, 2019
rain-on pushed a commit to rain-on/pantheon that referenced this pull request Jan 29, 2019
vinistevam pushed a commit to vinistevam/pantheon that referenced this pull request Jan 29, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants