feats: the Implementation of Parallel EVM 2.0(v1.1.16 rebased) #39

Commits on Oct 12, 2022

  1. Parallel: Kick off for BEP-130: Parallel Transaction Execution.

    Add a new interface, StateProcessor.ProcessParallel(...); for now it is a
    copy of Process(...).
    This patch is a placeholder on which BEP-130 will be implemented.
    setunapo committed Oct 12, 2022
    c1a1e9b
  2. Parallel: implement modules && workflow

    ** modules: init, slot executor and dispatcher
    BEP-130 parallel transaction execution maintains a pool of tx execution
    routines: a configured number of slots (routines) that execute transactions.

    Init is executed once on startup and creates the routine pool.
    The slot executor is where transactions are executed.
    The dispatcher is the module that dispatches each transaction to the right slot.
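
A minimal sketch of the init / slot / dispatcher split described above, assuming a plain round-robin dispatcher. The names `txReq`, `slot` and `runSlots` are hypothetical and are not taken from the patch, whose dispatch policy is more involved.

```go
package main

import (
	"fmt"
	"sync"
)

// txReq is a hypothetical stand-in for a transaction waiting to be executed.
type txReq struct {
	index int
}

// slot is one execution routine with a bounded pending queue.
type slot struct {
	id      int
	pending chan txReq
}

// runSlots creates numSlots slot goroutines (the "init" step), dispatches
// numTxs transactions round-robin (a placeholder policy) and returns the
// id of the slot that executed each transaction.
func runSlots(numSlots, queueSize, numTxs int) []int {
	ranOn := make([]int, numTxs)
	slots := make([]*slot, numSlots)
	var wg sync.WaitGroup
	for i := range slots {
		s := &slot{id: i, pending: make(chan txReq, queueSize)}
		slots[i] = s
		wg.Add(1)
		go func(s *slot) {
			defer wg.Done()
			for req := range s.pending {
				// "Execute" the transaction; each index is written by one slot only.
				ranOn[req.index] = s.id
			}
		}(s)
	}
	// Dispatcher: the only place that hands transactions to slots.
	for i := 0; i < numTxs; i++ {
		slots[i%numSlots].pending <- txReq{index: i}
	}
	for _, s := range slots {
		close(s.pending)
	}
	wg.Wait()
	return ranOn
}

func main() {
	fmt.Println(runSlots(3, 2, 7)) // [0 1 2 0 1 2 0]
}
```

The bounded channel plays the role of the per-slot pending queue: the dispatcher blocks when a slot's queue is full.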
    
    ** workflow: Stage Apply, Conflict Detector, Slot, Gas...
      > two stages of applyTransaction
      For sequential execution, applyTransaction does both transaction execution and
      result finalization.
      For parallel execution, the execution result may be unreliable (conflict), so a
      try-rerun policy is used: a transaction may be executed more than once to get the correct result.
      Once the result is confirmed, it is finalized to the StateDB.

      > Conflict detector
      The parallel execution result of each transaction is checked.
      If there is a conflict, the result cannot be committed; a redo is
      scheduled to update its StateDB and re-run the transaction.
      Balance, KV, account create & suicide... are checked.
      The conflict window is important for the conflict check.
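
The conflict window can be illustrated with a small sketch. It assumes a simplified model in which each merged transaction records the set of addresses it changed, and a speculative result conflicts if it accessed one of those addresses. The types `changeList` and `execResult` and the function `hasConflict` are hypothetical, not the patch's actual API.

```go
package main

import "fmt"

// changeList records which addresses a merged transaction changed.
// Hypothetical type; the patch tracks balance, KV and create/suicide separately.
type changeList struct {
	txIndex int
	changed map[string]bool
}

// execResult is a hypothetical summary of one speculative execution.
type execResult struct {
	txIndex     int      // position of the transaction in the block
	baseTxIndex int      // last transaction merged when the slot StateDB was created
	accessed    []string // addresses the transaction touched
}

// hasConflict reports whether any transaction inside the conflict window
// (baseTxIndex, txIndex) changed an address that res accessed.
func hasConflict(res execResult, merged []changeList) bool {
	for _, cl := range merged {
		if cl.txIndex <= res.baseTxIndex || cl.txIndex >= res.txIndex {
			continue // outside the conflict window
		}
		for _, addr := range res.accessed {
			if cl.changed[addr] {
				return true
			}
		}
	}
	return false
}

func main() {
	merged := []changeList{
		{txIndex: 0, changed: map[string]bool{"A": true}},
		{txIndex: 1, changed: map[string]bool{"B": true}},
	}
	// Tx 2 ran on a slot StateDB that already contained the changes of tx 0.
	fmt.Println(hasConflict(execResult{2, 0, []string{"A"}}, merged)) // false
	fmt.Println(hasConflict(execResult{2, 0, []string{"B"}}, merged)) // true
}
```

Changes made at or before `baseTxIndex` are already visible in the slot StateDB, which is why only the transactions strictly between the two indexes need to be checked.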
    
      > Slot StateDB
      Each slot has its own StateDB in which it executes transactions.
      World state changes are stored in this StateDB and merged into the main StateDB
      when the transaction result is confirmed. SlotState.slotdbChan holds the slotDB of the TX currently being executed.
      Only dirty state objects are allowed to merge back; otherwise there is a race condition
      in which an outdated state object is merged back.
    
    ** others
    gas pool, transaction gas, gas fee reward to system address
    evm instance, receipt CumulativeGasUsed & Log Index,
    contract creation, slot state,
    parallel routine safety:
      1. only the dispatcher can access the main stateDB
      2. slotDB is created and merged into stateDB in the dispatch goroutine.
    
    ** workflow 2: CopyForSlot, redesign dispatch, slot StateDB reuse & several bugfix
    
      > simplify statedb copy with CopyForSlot
      only copy dirtied state objects
      delete prefetcher
    
    ** redesign dispatch, slot StateDB reuse...
      > dispatch enhance
      remove atomic idle, curExec... replace by pendingExec for slot.
    
      > slot StateDB reuse
      It tries to reuse the latest merged slotDB in the same slot.
      If reuse fails (conflict), it updates to the latest world state and redoes the transaction.
      The reused SlotDB will have the same BaseTxIndex, since its world state was synced when it was created based on that txIndex.
      The conflict check can now skip the current slot.
      SlotDB reuse is more aggressive for idle dispatch:
      not only pending Txs but also idle-dispatched Txs now try to reuse the SlotDB.
    
    ** others
    state changes do not need to store the value
    
    add "--parallel" startup options
    Parallel is not enabled by default.
    To enable it, just add a simple flag to geth: --parallel
    To configure the parallel execution parameters: --parallel.num 20 --parallel.queuesize 30
    "--parallel.num" is the number of parallel slots that execute Txs; by default it is CPUNum-1
    "--parallel.queuesize" is the maximum pending queue size for each slot; by default it is 10
    
    For example:
      ./build/bin/geth --parallel
      ./build/bin/geth --parallel --parallel.num 10
      ./build/bin/geth --parallel --parallel.num 20 --parallel.queuesize 30
    
    ** several BugFix
    1.system address balance conflict
      The system address is treated as a special address, since every transaction
      pays its gas fee to it.
      Parallel execution resets its balance in the slotDB; if a transaction tries to access
      its balance, it receives 0. If the contract needs the real system address
      balance, a redo is scheduled with the real system address balance.

      An example transaction that accessed the system address:
      https://bscscan.com/tx/0xcd69755be1d2f55af259441ff5ee2f312830b8539899e82488a21e85bc121a2a
    
    2.fork caused by address state changed and read in same block
    3.test case error
    4.statedb.Copy should initialize parallel elements
    5.do merge for snapshot
    setunapo committed Oct 12, 2022
    7b9c43b
  3. Parallel: more readable code & dispatch policy & Revert & UT

    ** move .Process() close to .ProcessParallel()
    ** InitParallelOnce & preExec & postExec for code maintenance
    ** MergedTxInfo -> SlotChangeList & debug conflict ratio
    ** use ParallelState to keep all parallel statedb states.
    ** enable queue to same slot
    
    ** discard state change of reverted transaction
       And debug log refine
    
    ** add ut for statedb
    setunapo committed Oct 12, 2022
    5e8cd4f
  4. Parallel: dispatch, queueSize, slot DB prefetch, disable cache prefet…

    …ch for parallel
    
    This patch has 4 changes:
    1. change the default queuesize to 20, since 10 may not be enough and causes more conflicts
    2. enable slot DB trie prefetch, using the prefetch of the main state DB
    3. disable transaction cache prefetch when parallel is enabled,
       since CPU resources can be limited in parallel mode and parallel has its own piped transaction execution

    4. change the dispatch policy
      ** queue based on from address
      ** queue based on to address, try next slot if current is full
      Since the from address is used to make the dispatch decision,
      the pending transactions in a slot can have several different
      To addresses, so the To address of every pending transaction is compared.
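
A sketch of one way to read the dispatch policy above: prefer a slot that already holds a pending transaction from the same sender, then one with the same To address, and otherwise fall back to the least-loaded slot with room. The fallback rule and the names `tx`, `slot`, `pickSlot` and `dispatchAll` are assumptions made for illustration.

```go
package main

import "fmt"

// tx and slot are hypothetical, simplified types.
type tx struct{ from, to string }

type slot struct{ pending []tx }

// has reports whether any pending transaction in the slot satisfies match.
func (s *slot) has(match func(tx) bool) bool {
	for _, p := range s.pending {
		if match(p) {
			return true
		}
	}
	return false
}

// pickSlot returns the index of the slot to queue t on, or -1 if all are full.
func pickSlot(slots []*slot, t tx, queueSize int) int {
	// 1. Same From address: keeps a sender's transactions in order on one slot.
	for i, s := range slots {
		if len(s.pending) < queueSize && s.has(func(p tx) bool { return p.from == t.from }) {
			return i
		}
	}
	// 2. Same To address: compare against every pending transaction of the slot.
	for i, s := range slots {
		if len(s.pending) < queueSize && s.has(func(p tx) bool { return p.to == t.to }) {
			return i
		}
	}
	// 3. Fallback (assumption): the least-loaded slot that still has room.
	best := -1
	for i, s := range slots {
		if len(s.pending) >= queueSize {
			continue
		}
		if best == -1 || len(s.pending) < len(slots[best].pending) {
			best = i
		}
	}
	return best
}

// dispatchAll queues txs in order and returns the slot chosen for each one.
func dispatchAll(numSlots, queueSize int, txs []tx) []int {
	slots := make([]*slot, numSlots)
	for i := range slots {
		slots[i] = &slot{}
	}
	picked := make([]int, 0, len(txs))
	for _, t := range txs {
		i := pickSlot(slots, t, queueSize)
		if i >= 0 {
			slots[i].pending = append(slots[i].pending, t)
		}
		picked = append(picked, i)
	}
	return picked
}

func main() {
	txs := []tx{{"a", "x"}, {"b", "y"}, {"a", "z"}, {"c", "y"}, {"a", "w"}}
	fmt.Println(dispatchAll(3, 2, txs)) // [0 1 0 1 2]
}
```

In the run above the last transaction shares its sender with slot 0, but that queue is already full, so it moves on to the next slot with room.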
    setunapo committed Oct 12, 2022
    fdd73da
  5. Parallel: implement COW(Copy-On-Write)

    ** use sync map for the stateObjects in parallel
    ** others
      fix a SlotDB reuse bug & enable it
      delete unnecessary parallel initialization for non-slot DBs.
    lunarblock authored and setunapo committed Oct 12, 2022
    c8be553
  6. Parallel: several bugfixs: state merge, suicide fixup, conflict detec…

    …t, prefetch, fork
    
    This is a complicated patch containing several fixups.
    
    ** fix MergeSlotDB
    Since copy-on-write is used, a transaction deep-copies a StateObject before it writes the state.
    All dirty state changes are recorded in this copy first, and ownership is
    transferred to the main StateDB on merge.
    This has a potential race condition: a simple ownership transfer may discard state changes
    made by other concurrent transactions.
    When copy-on-write is used, StateObjects should be merged instead.
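
The difference between ownership transfer and merge can be shown with a toy model. This is a sketch under simplifying assumptions: a state object is reduced to a storage map plus a dirty map, and all names (`stateObject`, `deepCopy`, `merge`, `run`) are hypothetical.

```go
package main

import "fmt"

// stateObject is a hypothetical, heavily simplified account object.
type stateObject struct {
	storage map[string]string // full view of the account storage
	dirty   map[string]string // keys written by the owning transaction
}

func newObject(storage map[string]string) *stateObject {
	return &stateObject{storage: storage, dirty: map[string]string{}}
}

// deepCopy is the copy-on-write step taken before a transaction writes.
func (o *stateObject) deepCopy() *stateObject {
	cp := make(map[string]string, len(o.storage))
	for k, v := range o.storage {
		cp[k] = v
	}
	return newObject(cp)
}

func (o *stateObject) set(k, v string) {
	o.storage[k] = v
	o.dirty[k] = v
}

// merge applies only the dirty keys of c, leaving other changes untouched.
func (o *stateObject) merge(c *stateObject) {
	for k, v := range c.dirty {
		o.storage[k] = v
	}
}

// run lets two concurrent transactions copy the same base object, write
// different keys, and then commits them in order.
func run(useMerge bool) map[string]string {
	mainObj := newObject(map[string]string{"k1": "a"})
	c1 := mainObj.deepCopy()
	c2 := mainObj.deepCopy()
	c1.set("k1", "b")
	c2.set("k2", "c")
	for _, c := range []*stateObject{c1, c2} {
		if useMerge {
			mainObj.merge(c)
		} else {
			mainObj = c // ownership transfer: the copy replaces the main object
		}
	}
	return mainObj.storage
}

func main() {
	fmt.Println(run(false)) // map[k1:a k2:c]  (the write k1=b is lost)
	fmt.Println(run(true))  // map[k1:b k2:c]
}
```

With ownership transfer, the second copy still carries the stale value of `k1`, so committing it discards the first transaction's write; merging only the dirty keys keeps both.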
    
    ** fix Suicide
    Suicide includes an address state read operation.
    It also needs to do copy-on-write, to avoid damaging the main StateDB's state object.
    
    ** fix conflict detect
    If the state read is not zero, conflict detection against the addr state change should be done first.
    Conflict detection is done even within the current slot: with copy-on-write and slotDB reuse, the same
    slot can have a conflicting race condition.
    
    ** disable prefetch on slotDB
    trie prefetch should be started on main DB on Merge
    
    ** Add/Sub zero balance, Set State
    These are void operations, optimized to reduce the conflict rate.
    A simple test shows the conflict rate dropped from ~25% to 12%.
    
    ** fix a fork on block 15,338,563
    It is a nonce conflict caused by the opcodes opCreate & opCreate2.
    Generally, the nonce is advanced by 1 for the transaction sender,
    but opCreate & opCreate2 create a new contract, and the
    caller advances its nonce too.

    This makes nonce conflict detection more complicated. Since the nonce is a
    fundamental part of an account, once it has been changed we mark
    the address as StateChanged, and any concurrent access to it is considered
    a conflict.
    setunapo committed Oct 12, 2022
    fb25639
  7. Parallel: conflict optimize, remove SlotDB reuse, trie prefetch

    ** optimize conflict for AddBalance(0)
    Adding a balance of 0 changes nothing, but it does an empty() check and adds
    a touch event. On transaction finalize, the touch event checks whether
    the StateObject is empty and deletes it if so.

    This patch treats the empty check as a state check: if the addr state has
    not been changed (create, suicide, empty delete), the empty check is reliable.
    
    ** optimize conflict for system address
    
    ** some code improvement & lint fixup & refactor for params
    
    ** remove reuse SlotDB
    SlotDB reuse was added to reduce copying of StateObjects, in order to mitigate
    the Go GC problem.
    COW (Copy-On-Write) also addresses the GC problem. With COW enabled,
    reuse can be removed, as it now has limited benefit and adds complexity.
    
    ** fix trie prefetch on dispatcher
    
    Trie prefetch is scheduled on object finalize.
    In parallel mode, trie prefetch should be scheduled on the dispatcher, since
    the TriePrefetcher is not safe for concurrent access and it is created & stopped
    on the dispatcher routine.

    But object.finalize on the slot cleared its dirtyStorage, which broke the later trie
    prefetch on the dispatcher during MergeSlotDB.
    setunapo committed Oct 12, 2022
    a112279
  8. Parallel: handle fixup & code review & enhancement

    No fundamental changes; some improvements, including:
    ** Add a new type ParallelStateProcessor;
    ** move Parallel Config to BlockChain
    ** more precise ParallelNum setting
    ** Add EnableParallelProcessor()
    ** remove panic()
    ** remove useless: redo flag,
    ** change waitChan from `chan int` to `chan struct {}` and communicate by close()
    ** dispatch policy: queue `from` ahead of `to`
    ** pre-allocate allLogs
    ** disable parallel processor if snapshot is not enabled
    ** others: rename...
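
The switch of waitChan to `chan struct{}` with close() relies on a standard Go idiom: closing a channel wakes every receiver at once and carries no payload. A minimal sketch of the idiom (the function name `wakeAll` is hypothetical and unrelated to the patch's code):

```go
package main

import (
	"fmt"
	"sync"
)

// wakeAll starts n waiters blocked on one channel and releases all of them
// with a single close(). It returns how many waiters were woken.
func wakeAll(n int) int {
	done := make(chan struct{}) // struct{} carries no data and costs no memory
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		woken int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-done // a receive on a closed channel returns immediately
			mu.Lock()
			woken++
			mu.Unlock()
		}()
	}
	close(done) // one close notifies every waiter; a send would wake only one
	wg.Wait()
	return woken
}

func main() {
	fmt.Println(wakeAll(5)) // 5
}
```

With `chan int`, the sender would need to know how many receivers are waiting and send once per receiver; close() removes that bookkeeping.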
    setunapo committed Oct 12, 2022
    9eef3ec
  9. feature: the Implementation of Parallel EVM 2.0

    1.features of 2.0:
      ** Streaming Pipeline
      ** Implement universal unconfirmed state db reference, try best to get account object state.
      ** New conflict detect, check based on what it has read.
      ** Do parallel KV conflict check for large KV read
      ** new Interface StateDBer and ParallelStateDB
      ** shared memory pool for parallel objects
      ** use map in sequential mode and sync.map in parallel mode for concurrent StateObject access
      ** replace DeepCopy by LightCopy to avoid redundant memory copy of StateObject
      ** do trie prefetch in advance
      ** dispatcher 2.0
         Static Dispatch & Dynamic Dispatch
         Stolen mode for TxReq when a slot finished its static dispatched tasks
         RealTime result confirm in Stage2, when most of the txs have been executed at least once
         Make it configurable
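
The combination of static dispatch and stolen mode resembles classic work stealing. The sketch below assumes each slot first drains its own statically assigned queue and then takes work from other slots; the names `txQueue`, `nextTx` and `runWithStealing` are hypothetical and the patch's actual scheduling may differ.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// txQueue is a hypothetical per-slot queue of transaction indexes.
type txQueue struct {
	mu  sync.Mutex
	txs []int
}

// pop removes and returns the first queued transaction, if any.
func (q *txQueue) pop() (int, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.txs) == 0 {
		return 0, false
	}
	t := q.txs[0]
	q.txs = q.txs[1:]
	return t, true
}

// nextTx prefers the slot's own static queue and steals from others when empty.
func nextTx(queues []*txQueue, self int) (int, bool) {
	if t, ok := queues[self].pop(); ok {
		return t, true
	}
	for i, q := range queues {
		if i == self {
			continue
		}
		if t, ok := q.pop(); ok {
			return t, true // stolen from another slot
		}
	}
	return 0, false
}

// runWithStealing executes every statically dispatched transaction and
// returns how many times each transaction index was executed.
func runWithStealing(static [][]int, numTxs int) []int32 {
	queues := make([]*txQueue, len(static))
	for i, txs := range static {
		queues[i] = &txQueue{txs: append([]int(nil), txs...)}
	}
	executed := make([]int32, numTxs)
	var wg sync.WaitGroup
	for i := range queues {
		wg.Add(1)
		go func(self int) {
			defer wg.Done()
			for {
				t, ok := nextTx(queues, self)
				if !ok {
					return // nothing left anywhere
				}
				atomic.AddInt32(&executed[t], 1)
			}
		}(i)
	}
	wg.Wait()
	return executed
}

func main() {
	// Slot 0 is overloaded; slots 1 and 2 help once their own queues are empty.
	fmt.Println(runWithStealing([][]int{{0, 1, 2, 3, 4, 5}, {6}, {}}, 7)) // [1 1 1 1 1 1 1]
}
```

Because no new work is added after the static dispatch in this sketch, a slot can safely exit as soon as every queue is empty.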
    
    2.Handling of corner cases:
     ** don't panic if there is anything wrong reading state
     ** handle system address, skip its balance check
     ** handle WBNB contract to reduce conflict rate by balance make up
        WBNB balance makeup by GetBalanceOpCode & depth
        add a lock to fix WBNB make up concurrent crash
        add a new interface GetBalanceOpCode
    setunapo committed Oct 12, 2022
    ba4f4b9
  10. improve: some code prune

    setunapo committed Oct 12, 2022
    83a1c35
  11. c4b3b61
  12. 0aed943
  13. e7938e7

Commits on Oct 17, 2022

  1. fix: deadlock hang on block 20926591

    It is an obvious deadlock caused by a typo in a corner case: creating an object on a destructed address.
    setunapo committed Oct 17, 2022
    35270fc