storage/engine: use SyncWAL instead of sync on commit #16942

Merged

Commits on Jul 10, 2017

  1. storage/engine: use SyncWAL instead of sync on commit

    The previous implementation of batching for synced commits had the
    unfortunate property that commits that did not require syncing were
    required to wait for the WAL to be synced. When RocksDB writes a batch
    with sync==true it internally does:
    
      1. Add batch to WAL
      2. Sync WAL
      3. Add entries to mem table
    
    Switch to using SyncWAL to explicitly sync the WAL after writing a
    batch. This has slightly different semantics from the above:
    
      1. Add batch to WAL
      2. Add entries to mem table
      3. Sync WAL
    
    The advantage of this new approach is that non-synced batches do not
    have to wait for the WAL to sync. Prior to this change, it was observed
    that essentially every batch was waiting for a WAL sync. Approximately
    half of all batch commits are performed with sync==true (half of all
    batch commits are for Raft log entries or Raft state). Forcing the
    non-synced commits to wait for the WAL added significant time.
    
    Reworked the implementation of batch grouping. The sequence number
    mechanism was replaced by per-batch sync.WaitGroup embedded in the
    rocksDBBatch structure. Once a batch is committed (and synced if
    requested), the wait group is signalled, ensuring that only the desired
    goroutine is woken instead of waking all of the goroutines as in the
    previous implementation. Syncing of the WAL is performed by a dedicated
    thread, and only the batches which specify sync==true wait for the
    syncing to occur.
    
    Added "rocksdb.min_wal_sync_interval" which specifies a minimum
    delay to wait between calls to SyncWAL. The default of 1ms was
    experimentally determined to reduce disk IOs by ~50% (vs 0ms) while
    leaving write throughput unchanged.
    
    For a write-only workload (`ycsb -workload F`), write throughput
    improved by 15%. Raft command commit latency (unsynced commits) dropped
    from 30-40ms to 6-10ms.
    petermattis committed Jul 10, 2017