[IBD patch 4] Deprioritise slower downloaders (bitcoin#970)
* Update the block availability for all nodes during IBD

One of the problems with IBD is that we end up downloading
blocks from a single peer. This is because, after downloading
all the headers at startup, block availability is only being
updated for that one peer. We need to assume that all connected
or newly connected peers have the blocks that we need, but we
only do this during IBD, i.e. while IsInitialBlockDownload()
is true.
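A minimal sketch of that policy (names and shapes are illustrative, not the BU API): during IBD any full peer is treated as a candidate block source, while outside IBD only peers with demonstrated availability qualify.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// NODE_NETWORK is bit 0 of nServices in the Bitcoin P2P protocol.
static const uint64_t NODE_NETWORK = (1 << 0);

struct Peer
{
    int id;
    uint64_t nServices;
    bool fBlockAvailabilityKnown; // set once this peer has shown us a header/tip
};

// Return the peers we may request blocks from.
std::vector<int> CandidateSources(const std::vector<Peer> &peers, bool fIsInitialBlockDownload)
{
    std::vector<int> result;
    for (const Peer &p : peers)
    {
        // Outside IBD, only peers with demonstrated availability qualify.
        // During IBD, assume any full (NODE_NETWORK) peer can serve us.
        if (p.fBlockAvailabilityKnown || (fIsInitialBlockDownload && (p.nServices & NODE_NETWORK)))
            result.push_back(p.id);
    }
    return result;
}
```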

* Increase the block download window to 1024

This helps to keep IBD moving forward when we have a peer
that may be serving us blocks more slowly. And now that
block availability is being updated correctly we are getting
blocks from all connected peers, so the chance of having a
slow peer is much higher.
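The window's effect can be shown with a tiny sketch (the constant value is from this patch; the helper name is assumed): it caps how far past our validated tip we will request blocks, so one slow peer holding an early block doesn't stall requests for everything behind it.

```cpp
#include <algorithm>
#include <cassert>

// "Block download window": how far ahead of our current height we fetch.
static const int BLOCK_DOWNLOAD_WINDOW = 1024;

// Highest block height we may request right now, given our tip height and
// the best header height we know about.
int MaxFetchHeight(int nTipHeight, int nBestHeaderHeight)
{
    return std::min(nTipHeight + BLOCK_DOWNLOAD_WINDOW, nBestHeaderHeight);
}
```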

* Reset the single peer request mode age to 24 hours

One week isn't needed anymore because the new DAA is working
well and we're not falling behind in terms of blocks mined
per day.

* Remove the check for node ping in the request manager

This feature is causing occasional hangups during IBD and
is not really effective in selecting peers, especially during
IBD. Since it is not used once the chain is synced, there is
no need for this feature to remain; it only adds another
level of complexity to IBD. KISS can be applied here for a
better IBD experience.

* Only update xthin stats when IsChainNearlySyncd()

There is no need to lock vNodes and check all peers for
thinblocks in flight unless we are IsChainNearlySyncd() because
we would not have asked for any xthins if !IsChainNearlySyncd().

This is an expensive operation which causes a performance hit
during IBD.

* Disconnect non-NETWORK nodes during initial sync

We only want to have nodes we can download the full
blockchain from, connected to us while we do our initial sync.
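A sketch of the service-bit test (the constant follows the Bitcoin P2P protocol; the predicate name is illustrative): during initial sync, a peer that can't serve the whole chain is not useful to us.

```cpp
#include <cassert>
#include <cstdint>

// NODE_NETWORK is bit 0 of nServices and advertises a full, unpruned chain.
static const uint64_t NODE_NETWORK = (1 << 0);

// During initial sync we drop peers that cannot serve the full blockchain.
bool ShouldDisconnectDuringIBD(uint64_t nServices, bool fIsInitialBlockDownload)
{
    return fIsInitialBlockDownload && !(nServices & NODE_NETWORK);
}
```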

* Increase the default max outbound connections to 16

Having a few more outbound connections can help especially
during the process of IBD.

* During IBD, batch the block requests if possible.

By batching the blocks together we save a little bandwidth
and time in requesting blocks.

This method can then be applied to batching transactions,
which should yield a significant improvement during periods
where transaction rates are high.
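A hypothetical sketch of the batching idea: instead of one "getdata" message per block, collect the hashes destined for the same peer into a single inventory list and send one message. CInv/getdata naming follows the P2P protocol; the helper itself is illustrative.

```cpp
#include <cassert>
#include <string>
#include <vector>

// MSG_BLOCK is the inventory type for blocks in the P2P protocol.
static const int MSG_BLOCK = 2;

struct CInv
{
    int type;         // e.g. MSG_BLOCK
    std::string hash; // block hash (string stands in for uint256)
};

// Build one getdata payload instead of vHashes.size() separate messages.
std::vector<CInv> BuildBatchedGetData(const std::vector<std::string> &vHashes)
{
    std::vector<CInv> vGetData;
    vGetData.reserve(vHashes.size());
    for (const std::string &hash : vHashes)
        vGetData.push_back({MSG_BLOCK, hash});
    return vGetData; // caller sends the whole vector in one "getdata" message
}
```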

* Replace missing cs_main lock

* Fix formatting

* Fix locking order issue with cs_objectDownloader and cs_vNodes

cs_objectDownloader must be locked first to prevent a possible
deadlock.
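The rule being enforced can be sketched with std::mutex standing in for CCriticalSection: every code path takes cs_objectDownloader before cs_vNodes, so two threads can never hold the pair in opposite orders.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

std::mutex cs_objectDownloader;
std::mutex cs_vNodes;

// Records the acquisition order so the lock-order rule is visible/testable.
std::vector<std::string> LockInCorrectOrder()
{
    std::vector<std::string> order;
    std::lock_guard<std::mutex> lock1(cs_objectDownloader); // always first
    order.push_back("cs_objectDownloader");
    std::lock_guard<std::mutex> lock2(cs_vNodes); // always second
    order.push_back("cs_vNodes");
    return order;
}
```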

* Fix formatting

* Use Logging namespace to access LogAcceptCategory

LogAcceptCategory and the enum used to define the available
logging categories are defined inside the Logging namespace.
If used directly outside of util.{h,cpp}, a reference to that
namespace is needed.

This commit fixes issue bitcoin#950

* Don't favor XTHIN nodes when sending messages

This has a negative effect on IBD

* Use more descriptive variable names for nodes we are disconnecting

ptemp1 and ptemp2 are getting hard to follow. We'll say what they are:

pNonXthinNode and pNonNodeNetwork

* Take the cs_objDownloader lock earlier when requesting multiple objects

This prevents anyone else from asking for these same objects
before we've notified the request manager of their existence.
This could happen, for instance, during IBD where we look for
the next blocks to download but, before we've notified the
request manager of them all, we go back and possibly request
them again.

Also keeping the lock here allows the request manager to prepare
a better batch request of blocks and thereby keep a better order
of block requests.

* Call MarkBlocksAsInFlight() before asking for blocks

This ensures you don't receive any blocks back before you mark the block
in flight as could happen on regtest.

* Add/release node refs when making batch requests

We need to track the node refs correctly so we don't get
disconnected before we're done with our work.

* Don't ask for the same blocks twice when doing IBD

If we've already asked for a block, we don't have to ask for
it again in FindNextBlocksToDownload(); instead we can rely on
the request manager to handle potential re-requests.
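The dedup can be sketched with a set of already-scheduled hashes (illustrative names; a string stands in for uint256): a block is handed to the request manager at most once, and timeouts/re-requests stay the manager's job.

```cpp
#include <cassert>
#include <set>
#include <string>

struct Downloader
{
    std::set<std::string> setRequested; // blocks already handed to the request manager

    // Returns true only the first time a block hash is scheduled;
    // repeat calls are no-ops.
    bool ScheduleBlock(const std::string &hash)
    {
        return setRequested.insert(hash).second;
    }
};
```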

* Calculate MaxBlocksInTransitPerPeer on an individual node basis

Before, we were using the overall response times regardless of
which peer they came from. This made the selection of how many
blocks to download at one time unresponsive to faster or
slower nodes. By tracking response times on a per node basis
we can deprioritize slower downloaders and get more blocks
from the faster nodes.

In HEADERS net processing, remove MAX_BLOCKS_IN_TRANSIT_PER_PEER

  This has been replaced with pnode->nMaxBlocksInTransitPerPeer

Use atomic store, load, and fetch_add.
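A sketch of the per-node sizing (the field names follow the diff; the moving average and the thresholds are assumptions, not the actual formula): keep a running average block response time per node and let fast nodes carry more blocks in flight than slow ones.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct NodeTimings
{
    double nAvgBlkResponseTime = -1.0; // seconds; -1 means "no sample yet"
    std::atomic<int64_t> nMaxBlocksInTransit{16};

    void AddSample(double seconds)
    {
        if (nAvgBlkResponseTime < 0)
            nAvgBlkResponseTime = seconds; // first sample
        else // exponential moving average: 90% history, 10% new sample
            nAvgBlkResponseTime = 0.9 * nAvgBlkResponseTime + 0.1 * seconds;

        // Faster average response => larger in-flight window.
        // (Bounds here are illustrative only.)
        if (nAvgBlkResponseTime < 0.5)
            nMaxBlocksInTransit.store(32);
        else if (nAvgBlkResponseTime < 2.0)
            nMaxBlocksInTransit.store(16);
        else
            nMaxBlocksInTransit.store(4);
    }
};
```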

* Fixes bitcoin#941: Rotate vNodes by one peer every 60 seconds when sending messages

We do this only during IBD and it has the effect of
distributing the load more evenly between the peers.
Previously, because we use PV, the very first peer to connect
would always be favored and we would end up downloading a
disproportionate amount of blocks from that peer during IBD.
However, we will still download more from some peers based on
their download response time performance.
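The rotation itself is a one-liner with std::rotate, as in the diff's ThreadMessageHandler() change; here ints stand in for CNode pointers. Moving the last element to the front once a minute changes which peer is serviced first.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Rotate the node list by one: the last node becomes the first.
void RotateNodes(std::vector<int> &vNodes)
{
    if (vNodes.size() > 1)
        std::rotate(vNodes.begin(), vNodes.end() - 1, vNodes.end());
}
```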

* Small net optimization when sending messages

Just copy the entire vector rather than bothering to allocate
memory and then push_back each entry as we iterate through
vNodes.

Make sure to LOCK cs_vNodes first, then add the ref

* Fix getchaintips.py spurious failures

The hang results from a PV bug which is caused by a condition
where threads that are waiting for validation are forced to
quit during chain tip rollback. We don't in fact have to force
any threads to quit, since any validation threads on the
current chain will fail automatically once we disconnect the
tip. And then the waiting block validation thread for the new
chain can still continue without being inadvertently forced
to quit.

* Do not use an incorrect or invalid state

Use the node id from the iterator rather than looking up state
by the node we think we just passed in. Just in case they don't match
for some reason.

Also DbgAssert if the state is NULL and return false if we assert.

* Small edit to logprint

Add category BLK to logprint

remove printf

* During IBD only request blocks from peers that have block availability

Request from peers that are NODE_NETWORK and have demonstrated that
they have the blocks that we need.

Also move ProcessBlockAvailability() and
UpdateBlockAvailability() to requestManager.cpp

* Move FindNextBlocksToDownload() to requestManager.cpp

Also change all NULLs to nullptr.

* Move MarkBlockAsInFlight() and MarkBlockAsReceived() to requestManager.cpp

This is a move only. Although there are edits in order to
bring these into the requestManager class structure, there
are no logic changes.

C++ nullptr replaces NULLs

* Update block availability during the process of downloading headers

During initial sync we have to download the set of all
headers, but we also need to update the block availability for
each connected peer so that when the request manager starts
downloading blocks it has the correct list of peers that have
blocks available to download, rather than just assuming every
peer has all blocks.

* EXIT cs_objDownloader earlier.

We don't have to repeatedly take and release the locks.
Just do it once for all items in the entire batch request.

* Only ask for headers to update block availability if chain work is behind

When doing the initial request for headers, and we're in the
process of also updating the block availability for each
connected peer, we only need to ask for a header if our chain
work is actually behind that of a peer. This prevents, first,
requesting headers we don't need, and second, a possible
attack where we're being fed an invalid group of headers which
then causes us to request a large number of additional
headers.
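A sketch combining the new nodestate flag with the chain-work guard (the flag name matches the diff; the rest is illustrative, with uint64_t standing in for uint256 chain-work arithmetic): probe a peer's block availability with one getheaders, but only when we are actually behind.

```cpp
#include <cassert>
#include <cstdint>

struct AvailabilityState
{
    bool fRequestedInitialBlockAvailability = false;
};

// Decide whether to send the availability-probing getheaders to this peer.
bool ShouldRequestAvailabilityHeaders(AvailabilityState &state, uint64_t nOurWork, uint64_t nPeerTipWork)
{
    if (state.fRequestedInitialBlockAvailability)
        return false; // already probed this peer
    if (nPeerTipWork <= nOurWork)
        return false; // nothing to learn: we have at least as much work
    state.fRequestedInitialBlockAvailability = true;
    return true;
}
```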
ptschip authored and gandrewstone committed Mar 2, 2018
1 parent dbe1288 commit f7cb6ff
Showing 10 changed files with 632 additions and 447 deletions.
434 changes: 105 additions & 329 deletions src/main.cpp

Large diffs are not rendered by default.

9 changes: 0 additions & 9 deletions src/main.h
@@ -105,12 +105,6 @@ static const unsigned int VERACK_TIMEOUT = 60;
/** Number of headers sent in one getheaders result. We rely on the assumption that if a peer sends
* less than this number, we reached its tip. Changing this value is a protocol upgrade. */
static const unsigned int MAX_HEADERS_RESULTS = 2000;
/** Size of the "block download window": how far ahead of our current height do we fetch?
* Larger windows tolerate larger download speed differences between peer, but increase the potential
* degree of disordering of blocks on disk (which make reindexing and in the future perhaps pruning
* harder). We'll probably want to make this a per-peer adaptive value at some point. */
// static const unsigned int BLOCK_DOWNLOAD_WINDOW = 1024;
/** Time to wait (in seconds) between writing blocks/block index to disk. */
static const unsigned int DATABASE_WRITE_INTERVAL = 60 * 60;
/** Time to wait (in seconds) between flushing chainstate to disk. */
static const unsigned int DATABASE_FLUSH_INTERVAL = 24 * 60 * 60;
@@ -435,9 +429,6 @@ bool TestLockPointValidity(const LockPoints *lp);
*/
bool CheckSequenceLocks(const CTransaction &tx, int flags, LockPoints *lp = NULL, bool useExistingLockPoints = false);

/** Update tracking information about which blocks a peer is assumed to have. */
void UpdateBlockAvailability(NodeId nodeid, const uint256 &hash);

/**
* Class that keeps track of number of signature operations
* and bytes hashed to compute signature hashes.
55 changes: 37 additions & 18 deletions src/net.cpp
@@ -1745,7 +1745,8 @@ void ThreadOpenConnections()
int nOutbound = 0;
int nThinBlockCapable = 0;
set<vector<unsigned char> > setConnected;
CNode *ptemp = nullptr;
CNode *pNonXthinNode = nullptr;
CNode *pNonNodeNetwork = nullptr;
bool fDisconnected = false;
{
LOCK(cs_vNodes);
@@ -1759,7 +1760,11 @@
if (pnode->ThinBlockCapable())
nThinBlockCapable++;
else
ptemp = pnode;
pNonXthinNode = pnode;

// If sync is not yet complete then disconnect any pruned outbound connections
if (IsInitialBlockDownload() && !(pnode->nServices & NODE_NETWORK))
pNonNodeNetwork = pnode;
}
}
// Disconnect a node that is not XTHIN capable if all outbound slots are full and we
@@ -1768,9 +1773,18 @@
if (nOutbound >= nMaxOutConnections && nThinBlockCapable <= min(nMinXthinNodes, nMaxOutConnections) &&
nDisconnects < MAX_DISCONNECTS && IsThinBlocksEnabled() && IsChainNearlySyncd())
{
if (ptemp != nullptr)
if (pNonXthinNode != nullptr)
{
ptemp->fDisconnect = true;
pNonXthinNode->fDisconnect = true;
fDisconnected = true;
nDisconnects++;
}
}
else if (IsInitialBlockDownload())
{
if (pNonNodeNetwork != nullptr)
{
pNonNodeNetwork->fDisconnect = true;
fDisconnected = true;
nDisconnects++;
}
@@ -1793,8 +1807,12 @@
MilliSleep(500);
{
LOCK(cs_vNodes);
if (find(vNodes.begin(), vNodes.end(), ptemp) == vNodes.end())
if (find(vNodes.begin(), vNodes.end(), pNonXthinNode) == vNodes.end() ||
find(vNodes.begin(), vNodes.end(), pNonNodeNetwork) == vNodes.end())
{
nOutbound--;
break;
}
}
}
}
@@ -2081,23 +2099,22 @@ void ThreadMessageHandler()
vector<CNode *> vNodesCopy;
{
LOCK(cs_vNodes);
vNodesCopy.reserve(vNodes.size());
// Prefer thinBlockCapable nodes when doing communications.
for (CNode *pnode : vNodes)

// During IBD and because of the multithreading of PV we end up favoring the first peer that
// connected and end up downloading a disproportionate amount of data from that first peer.
// By rotating vNodes every time we send messages we can alleviate this problem.
// Rotate every 60 seconds so we don't do this too often.
static int64_t nLastRotation = GetTime();
if (IsInitialBlockDownload() && vNodes.size() > 0 && GetTime() - nLastRotation > 60)
{
if (pnode->ThinBlockCapable())
{
vNodesCopy.push_back(pnode);
pnode->AddRef();
}
std::rotate(vNodes.begin(), vNodes.end() - 1, vNodes.end());
nLastRotation = GetTime();
}

vNodesCopy = vNodes;
for (CNode *pnode : vNodes)
{
if (!pnode->ThinBlockCapable())
{
vNodesCopy.push_back(pnode);
pnode->AddRef();
}
pnode->AddRef();
}
}

@@ -2807,6 +2824,8 @@ CNode::CNode(SOCKET hSocketIn, const CAddress &addrIn, const std::string &addrNa
nXthinBloomfilterSize = 0;
addrFromPort = 0; // BU
nLocalThinBlockBytes = 0;
nAvgBlkResponseTime = -1.0;
nMaxBlocksInTransit = 16;

nMisbehavior = 0;
fShouldBan = false;
6 changes: 5 additions & 1 deletion src/net.h
@@ -79,7 +79,7 @@ static const size_t SETASKFOR_MAX_SZ = 2 * MAX_INV_SZ;
/** The maximum number of peer connections to maintain. */
static const unsigned int DEFAULT_MAX_PEER_CONNECTIONS = 125;
/** BU: The maximum numer of outbound peer connections */
static const unsigned int DEFAULT_MAX_OUTBOUND_CONNECTIONS = 12;
static const unsigned int DEFAULT_MAX_OUTBOUND_CONNECTIONS = 16;
/** BU: The minimum number of xthin nodes to connect */
static const uint8_t MIN_XTHIN_NODES = 8;
/** BU: The daily maximum disconnects while searching for xthin nodes to connect */
@@ -392,6 +392,10 @@ class CNode
uint32_t nXthinBloomfilterSize; // The maximum xthin bloom filter size (in bytes) that our peer will accept.
// BUIP010 Xtreme Thinblocks: end section

CCriticalSection cs_nAvgBlkResponseTime;
double nAvgBlkResponseTime;
std::atomic<int64_t> nMaxBlocksInTransit;

unsigned short addrFromPort;

protected:
9 changes: 5 additions & 4 deletions src/nodestate.cpp
@@ -11,16 +11,17 @@
*/
CNodeState::CNodeState()
{
pindexBestKnownBlock = NULL;
pindexBestKnownBlock = nullptr;
hashLastUnknownBlock.SetNull();
pindexLastCommonBlock = NULL;
pindexBestHeaderSent = NULL;
pindexLastCommonBlock = nullptr;
pindexBestHeaderSent = nullptr;
fSyncStarted = false;
nDownloadingSince = 0;
nBlocksInFlight = 0;
nBlocksInFlightValidHeaders = 0;
fPreferredDownload = false;
fPreferHeaders = false;
fRequestedInitialBlockAvailability = false;
}

/**
@@ -33,6 +34,6 @@ CNodeState *State(NodeId pnode)
{
std::map<NodeId, CNodeState>::iterator it = mapNodeState.find(pnode);
if (it == mapNodeState.end())
return NULL;
return nullptr;
return &it->second;
}
3 changes: 3 additions & 0 deletions src/nodestate.h
@@ -48,6 +48,9 @@ struct CNodeState
bool fFirstHeadersReceived;
//! Our current block height at the time we requested GETHEADERS
int nFirstHeadersExpectedHeight;
//! During IBD we need to update the block availability for each peer. We do this by requesting a header
//! when a peer connects and also when we ask for the initial set of all headers.
bool fRequestedInitialBlockAvailability;

std::list<QueuedBlock> vBlocksInFlight;
//! When the first entry in vBlocksInFlight started downloading. Don't care when vBlocksInFlight is empty.
