Private Blockchain Stops Syncing after upgrade to 2.4.x #10617
Comments
Could you share any |
@joshua-mir I will run these tests when I can. We had to downgrade to 2.3.9 to keep the chain running but I can definitely create a test box and run this. I will re-post when I do. |
@joshua-mir Okay here is some output. How much do you want?
This is a Parity 2.4.5 node on a chain with 6 peers running 2.3.9. Interesting fact: I accidentally upgraded the miner on this chain and it continued to mine new blocks, but all other nodes were stuck on the same block, 1217282 (the same block the output above shows this node is stuck on). I downgraded that miner. It is now on block 1217303, but the upgraded node whose output you see is still on 1217282. More output from that node without sync-trace:
|
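The symptom described above (one node stuck at 1217282 while the miner is at 1217303) can be checked across all nodes at once. The following is a minimal sketch, not part of the original report: the node names and the `tolerance` parameter are hypothetical, and the heights are assumed to have been collected beforehand, for example with the standard `eth_blockNumber` JSON-RPC call on each node.

```python
from typing import Dict


def lagging_nodes(heights: Dict[str, int], tolerance: int = 5) -> Dict[str, int]:
    """Return {node: blocks_behind} for nodes more than `tolerance` blocks
    behind the highest block reported by any node.

    `heights` maps a node label (hypothetical names) to its current block number.
    """
    if not heights:
        return {}
    best = max(heights.values())
    return {
        name: best - height
        for name, height in heights.items()
        if best - height > tolerance
    }


# Block numbers taken from the comment above; node names are made up.
report = lagging_nodes({"miner-2.3.9": 1217303, "node-2.4.5": 1217282})
print(report)
```

Running this repeatedly a few minutes apart distinguishes a node that is slowly catching up (the gap shrinks) from one that is stuck (the gap grows as the miner advances).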
@joshua-mir Did the information I posted help? Can I post anything else? |
Yeah, that's definitely helpful, thanks! The only peer that looks like it's one of yours has a genesis block mismatch. We'll need to look at whether we've changed anything in our chainspecs between these versions; to my knowledge we haven't. Can you try resyncing some of these nodes if possible, and perhaps running the validators with a set of reserved peers (each other, and one or two other nodes) and
Can you tell me what part of the log shows you that? I am still learning to make sense of the text.
What about this: #10214 It was on Parity 2.4.0. My private chain spec contains this line:
Could this be a conflict?
What do you mean by "validators"? I can run some nodes by deleting all chain data and using only a few
Looking through the logs again I was completely wrong about this - it's clear that peers: 0, 1, 2, 3 are the ones that you are actually looking for - from these short few seconds, I can't tell what might be wrong: At All that said, my previous guess as to your problem was probably incorrect. You claim:
That may possibly be the source of your issue, #10214 is a new transition that would allow you to disable 1283 on your chain, it would be
I was under the impression you were running a private network using the Aura engine? Miners. Whichever nodes have engine_signer configured and are in your validatorSet - the nodes creating the blocks that are going through your network - the pattern I am describing here is creating "sentry nodes" that insulate these nodes that are creating blocks from the rest of the network so block production is more reliable and less time is spent ignoring requests from other nodes. It's possible that this will end up solving your problem.
It would be useful to be on the latest stable branch, in case we find an issue. It would be more useful if you had sync logs from a moment the problem was actually occurring, but I understand running verbose logs for so long might be an issue - you might want to run |
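The "sentry node" pattern described in this comment relies on Parity's reserved-peers feature: a text file with one enode URL per line, passed with `--reserved-peers`, plus `--reserved-only` so the block-producing node talks to nothing else. Below is a rough sketch of preparing such a file. The helper names, the example enode, and the file name are all hypothetical, and the validation regex is a simplification that only handles IPv4 or hostname addresses.

```python
import re
from typing import Iterable, List

# An enode URL is "enode://<128 hex chars of node ID>@<host>:<port>".
# Simplified pattern: IPv6 hosts are not handled here.
ENODE_PATTERN = re.compile(r"^enode://[0-9a-fA-F]{128}@[^@:\s]+:\d+$")


def reserved_peers_text(enodes: Iterable[str]) -> str:
    """Build the contents of a reserved-peers file, one enode URL per line."""
    lines = []
    for enode in enodes:
        enode = enode.strip()
        if not ENODE_PATTERN.match(enode):
            raise ValueError(f"not a valid enode URL: {enode!r}")
        lines.append(enode)
    return "".join(line + "\n" for line in lines)


def parity_command(peers_file: str) -> List[str]:
    """Command line for a node that connects only to the reserved peers."""
    return ["parity", "--reserved-peers", peers_file, "--reserved-only"]


# Hypothetical sentry node; a real enode is printed by each node at startup.
sentry = "enode://" + "ab" * 64 + "@10.0.0.2:30303"
print(reserved_peers_text([sentry]), end="")
print(" ".join(parity_command("reserved-peers.txt")))
```

The sentry nodes themselves would list the block-producing node as a reserved peer without `--reserved-only`, so they keep serving the rest of the network.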
No. I don't know what Aura engine is?
Yes. I confused a release note about |
I am re-opening this because I was wrong and it is not resolved.
EDIT I did not do this, I don't think. New nodes stop at a block that is higher than the block for I have turned off mining (months ago) to prevent a fork. What information can I provide to de-bug? |
@joshua-mir it seems that I did not cause the problem, but that moving workers to Parity 2.4.9 caused a fork in the chain with no errors or changes to the chain spec. What information can I provide to de-bug? |
@stone212 do you happen to know about the on-chain activity during the period between the time you disabled eip1283transition in your chainspec and the time you added it back in? The problems caused by a bad hardfork won't happen at the fork number that enabled/disabled an eip, but at the first transaction that behaves differently than expected that is part of the chain. I'm also assuming that you are using the ethash engine and actually mining with PoW on your private network here, as per this previous conversation, in which case hardforking your entire network (just make sure all of your nodes are on the same version and have the same chainspec and re-enable mining) may actually be the solution here, especially if transactions end up being replayed. |
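Since the trouble appears at the first block that a node evaluates differently, not at the configured fork number, it can help to locate the first block on which two nodes disagree. The sketch below is one possible way to do that, under the assumption that both nodes expose the standard `eth_getBlockByNumber` JSON-RPC method over HTTP. The function names and the URLs in the usage comment are hypothetical.

```python
import json
import urllib.request
from typing import Callable, Optional


def rpc_block_hash(url: str, number: int) -> Optional[str]:
    """Fetch the hash of block `number` from a node, or None if the node lacks it."""
    payload = json.dumps({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_getBlockByNumber",
        "params": [hex(number), False],
    }).encode()
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        result = json.load(response).get("result")
    return result["hash"] if result else None


def first_divergent_block(
    hash_a: Callable[[int], Optional[str]],
    hash_b: Callable[[int], Optional[str]],
    low: int,
    high: int,
) -> Optional[int]:
    """Binary search for the lowest block in [low, high] where the nodes disagree.

    Relies on the fact that each block commits to its parent, so once two
    chains differ at one height they differ at every later height.
    A block that one node has and the other lacks counts as a disagreement.
    Returns None when both nodes agree at `high`.
    """
    if hash_a(high) == hash_b(high):
        return None
    while low < high:
        mid = (low + high) // 2
        if hash_a(mid) == hash_b(mid):
            low = mid + 1   # still identical here, divergence is later
        else:
            high = mid      # already different, divergence is here or earlier
    return low


# Usage (hypothetical endpoints, not run here):
# a = lambda n: rpc_block_hash("http://node-a:8545", n)
# b = lambda n: rpc_block_hash("http://node-b:8545", n)
# print(first_divergent_block(a, b, 0, 1217303))
```

Calling it with `low=0` also covers the genesis-mismatch case: a result of 0 means the two nodes do not even share a genesis block.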
Thank you for the reply.
First I want to be clear that this did not happen. The problem was with my reporting: we have two private blockchains, one of them had this in the spec and the other did not, and so much time passed between my messages to this thread that I was confused when I replied to you. So I do not think at this time that I removed and then re-added anything to the chain spec (but I am not certain even of that because I was pulled away on other things). But I know the chain spec now has

All nodes are using the same chain spec. Yes, PoW with ethash. Transactions? The network has had only two transactions on it ever and I do not remember when they happened.

I can give you some information though. We had no workers for a long time (the reason I did not come back to this thread sooner) and then we got one. I put Parity 2.3.9 on it and it started mining blocks from the top block. Good! Then I downgraded one node at a time to 2.3.9, and if the node got stuck then I deleted

One of the two transactions ever on this chain (a deployed token) seems to be working correctly. The other I do not know about, but I have asked a developer to verify, and if he is as fast as usual he will reply in 2021.

Then I downloaded Parity 2.4.9 onto one of these nodes and started it. It showed connected peers and then showed that it was syncing block X, where block X is the next block in the chain after the upgrade, and it just stays there, stuck. It will not go past this. So basically the original error that caused me to open this Issue is here again. If I downgrade this same node back to 2.3.9 it syncs again.

Parity 2.4 is the one that implements

I actually don't understand what EIP1283 de-activation is really for. Maybe that will help me? Anyway, it is only a guess that this is causing my Parity 2.4.9 node to stop syncing. Maybe it is something else about 2.4.x?
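The claim that all nodes use the same chain spec can be verified mechanically instead of by eye. The following is a minimal sketch that compares the fork-activation entries of two chain spec JSON documents. It assumes the entries of interest are keys ending in `Transition` in the top-level `params` section (as with `eip1283Transition` from this thread) and that values may be written either as hex strings or as plain numbers. Engine-specific parameters nested elsewhere in the spec are not inspected, and the function names are hypothetical.

```python
from typing import Any, Dict, Tuple


def _normalise(value: Any) -> Any:
    """Turn "0x0" / "10" style strings into ints so "0x0" and 0 compare equal."""
    if isinstance(value, str):
        try:
            return int(value, 0)
        except ValueError:
            return value
    return value


def transition_params(spec: Dict[str, Any]) -> Dict[str, Any]:
    """Collect every key ending in 'Transition' from the spec's `params` section."""
    params = spec.get("params", {})
    return {
        key: _normalise(value)
        for key, value in params.items()
        if key.endswith("Transition")
    }


def diff_transitions(
    spec_a: Dict[str, Any], spec_b: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (value_in_a, value_in_b)} for transitions that differ.

    A key missing from one spec shows up with None on that side.
    """
    a, b = transition_params(spec_a), transition_params(spec_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Usage with files (paths are hypothetical):
# import json
# with open("node1-spec.json") as f1, open("node2-spec.json") as f2:
#     print(diff_transitions(json.load(f1), json.load(f2)))
```

An empty result means the two specs agree on every top-level transition; anything else lists exactly which activation differs between the nodes.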
@joshua-mir New issue #11133 answers all the other issues I opened about this chain. I do not think they were directly related to this one. Now this issue is the primary one. I can de-bug any way you want. This is the list of EIPs in the chain spec.
I do think this is a bug in Parity, or maybe a lack of documentation so there is something I do not see. It is always possible that it is something I read incorrectly too. |
@joshua-mir I tried this on a new blockchain and the error is the same. So it is not a problem of changing the chain spec. I guess there must be some incompatibility between Parity 2.4.x and the chain spec, maybe the list of EIPs above. What information can I provide to de-bug?
@joshua-mir This is still a major Issue that is still open. Private blockchain can not upgrade to Parity 2.4.x. Please request whatever de-bug information you think is helpful so we can resolve it. |
A private blockchain with 6 nodes that functioned well stops syncing on nodes with no miners after upgrading these nodes to 2.4.5. Tested down to 2.4.0; the same error happens.
One worker uses Parity 2.1.9 (just today got access to it and will test upgrading it - but I would like to know why this is a problem if you think this is the problem).
After downgrading all parity clients to 2.3.8 the problem stops.
Here are things I tried and questions I have that might lead to answers:
1. Removed "eip1283Transition": from the params section to avoid conflicts with Add EIP-1283 disable transition #10214. No change.
2. Could the problem be that one worker (with the majority of the hash power) is using Parity 2.1.9? Specifically, please tell me what the conflict would be (I am in the process of testing this idea today as I just now got access to this worker; it is my best idea).
3. Could the problem be that some nodes use Parity 2.3.8 and others use 2.4.x?
I did look at the 2.4.0 Changelog (https://github.com/paritytech/parity-ethereum/blob/master/CHANGELOG.md) and I do not see any other problems but if there is a known incompatibility please tell me.
UPDATE: I tried Item 2 (upgraded strong worker to 2.4.5) and now the strong worker does not sync with the rest of the blockchain although all nodes are on 2.4.5 (updating Item 3 also).