Redwood chunked file growth and low priority IO starvation prevention #5936

Merged: 18 commits, Nov 12, 2021


@sfc-gh-satherton sfc-gh-satherton commented Nov 9, 2021

Redwood now grows files in chunks of 20k pages by default, controlled by a knob.
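
As a rough illustration of chunked growth, the sketch below rounds a required page count up to a whole number of chunks, so that many small extensions collapse into one larger one. The helper name `pagesAfterGrowth` and its parameters are hypothetical, and rounding up to a chunk multiple is an assumption about how the knob is applied, not a description of the Redwood code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of Redwood: returns the page count the file
// should have after growing to hold at least requiredPages.
// chunkPages stands in for the knob described above (20,000 pages by default).
inline int64_t pagesAfterGrowth(int64_t currentPages, int64_t requiredPages, int64_t chunkPages = 20000) {
    if (requiredPages <= currentPages) {
        return currentPages; // already large enough, no growth needed
    }
    // Ceiling division: the number of whole chunks needed to cover requiredPages.
    int64_t chunks = (requiredPages + chunkPages - 1) / chunkPages;
    return chunks * chunkPages;
}
```

Under this assumption, a file that needs page 20,001 grows straight to 40,000 pages, and later writes within that range need no further extension.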

To prevent high read rates from starving lower priority write ops, PriorityMultiLock now:

  • accepts a limit on the number of waiters dispatched consecutively from the same level before moving on to lower levels
  • remembers which priority level it is dispatching from across dispatch loop wakeups
  • stores queued time with waiters (not yet used) and prints more detailed debug info
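
The starvation-prevention idea can be sketched in plain single-threaded C++. This is a simplified model, not the PriorityMultiLock implementation: the class name `LimitedPriorityDispatcher`, the integer waiter ids, and the choice to wrap back to the highest level after the lowest are all assumptions made for illustration.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Simplified model (hypothetical, not the real PriorityMultiLock):
// higher index means higher priority.
class LimitedPriorityDispatcher {
public:
    LimitedPriorityDispatcher(int levels, int maxConsecutive)
      : queues(levels), limit(maxConsecutive), level(levels - 1), streak(0) {}

    void enqueue(int priority, int id) { queues[priority].push_back(id); }

    // Returns the next waiter id to dispatch, or -1 if nothing is queued.
    // `level` and `streak` are members, so the position survives between calls,
    // mirroring "maintains priority level across dispatch loop wakeups".
    int next() {
        const int n = static_cast<int>(queues.size());
        for (int tried = 0; tried <= n; ++tried) {
            if (!queues[level].empty() && streak < limit) {
                int id = queues[level].front();
                queues[level].pop_front();
                ++streak;
                return id;
            }
            // Level is empty or hit its consecutive limit: move to the next
            // lower level (assumed here to wrap around to the highest).
            level = (level == 0) ? n - 1 : level - 1;
            streak = 0;
        }
        return -1;
    }

private:
    std::vector<std::deque<int>> queues;
    int limit;  // max consecutive dispatches from one level
    int level;  // level currently being dispatched from
    int streak; // consecutive dispatches from `level` so far
};
```

With two levels and a limit of 2, three queued high-priority waiters and one low-priority waiter are dispatched as high, high, low, high, so the low-priority waiter is not starved by a continuous stream of high-priority work.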

Simulation bug fix: ioTimeoutError() and ioDegradedOrTimeoutError() no longer trigger until after speedUpSimulation is set, because before that point IO can be extremely slow in simulated time.
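
One way to picture this gating is the sketch below. The function `ioTimeoutMayFire` and all of its parameters are hypothetical; it only models the rule that the timeout interval is measured from the later of the operation start and the moment simulation was sped up, and it is not the code in flow/genericactors.actor.h.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical model of the gating rule, not the actual Flow implementation.
// spedUp:      whether simulation has been sped up yet
// now:         current simulated time
// speedUpTime: simulated time at which speed-up happened
// opStart:     simulated time at which the IO operation started
// timeout:     target timeout interval
inline bool ioTimeoutMayFire(bool spedUp, double now, double speedUpTime, double opStart, double timeout) {
    if (!spedUp) {
        return false; // before speed-up, slow IO is expected and not an error
    }
    // Only time elapsed after the speed-up point counts toward the timeout.
    double effectiveStart = std::max(opStart, speedUpTime);
    return now - effectiveStart >= timeout;
}
```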

This PR also includes #5925.

The general guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • The PR has a description, explaining both the problem and the solution.
  • The description mentions which forms of testing were done and the testing seems reasonable.
  • Every function/class/actor that was touched is reasonably well documented.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

  • This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or master if this is the youngest branch)
  • There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)

… reduce truncate() calls for expansion. PriorityMultiLock has limit on consecutive same-priority lock release. Increased Redwood max priority level to 3 for more separation at higher BTree levels.
…ss the simulated process has been set to have an unreliable disk.
…esponse to writes, which wait on only the necessary truncate operations. Increased buggified chunk size because truncate can be very slow in simulation.
…o-priority

# Conflicts:
#	flow/genericactors.actor.h
…o-priority

# Conflicts:
#	fdbserver/VersionedBTree.actor.cpp
…ait until at least the target timeout interval past the point when simulation is sped up.
…o-priority

# Conflicts:
#	fdbclient/ServerKnobs.cpp
#	fdbclient/ServerKnobs.h
#	fdbserver/VersionedBTree.actor.cpp
… and do in place value updates if the new value is the same size as the old value.
@foundationdb-ci

AWS CodeBuild CI Report for macOS Catalina 10.15

  • CodeBuild project: foundationdb-pr-macos
  • Commit ID: d465560
  • Result: SUCCEEDED
  • Error: N/A
  • Build Logs (available for 30 days)

@foundationdb-ci

AWS CodeBuild CI Report for macOS Catalina 10.15

  • CodeBuild project: foundationdb-pr-macos
  • Commit ID: 9686ac7
  • Result: SUCCEEDED
  • Error: N/A
  • Build Logs (available for 30 days)

@foundationdb-ci

AWS CodeBuild CI Report for Linux CentOS 7

  • CodeBuild project: foundationdb-pr
  • Commit ID: d465560
  • Result: FAILED
  • Error: Build has timed out.
  • Build Logs (available for 30 days)

@foundationdb-ci

AWS CodeBuild CI Report for Linux CentOS 7

  • CodeBuild project: foundationdb-pr
  • Commit ID: 9686ac7
  • Result: FAILED
  • Error: Build has timed out.
  • Build Logs (available for 30 days)

…tchToLinearMerge() since it is only used in one place.
@sfc-gh-jslocum sfc-gh-jslocum self-assigned this Nov 9, 2021
@foundationdb-ci

AWS CodeBuild CI Report for Linux CentOS 7

  • CodeBuild project: foundationdb-pr
  • Commit ID: 5994501
  • Result: FAILED
  • Error: Build has timed out.
  • Build Logs (available for 30 days)

@foundationdb-ci

AWS CodeBuild CI Report for Linux CentOS 7

  • CodeBuild project: foundationdb-pr
  • Commit ID: 5994501
  • Result: SUCCEEDED
  • Error: N/A
  • Build Logs (available for 30 days)

@sfc-gh-satherton sfc-gh-satherton changed the base branch from master to release-7.0 November 10, 2021 16:24
@sfc-gh-satherton sfc-gh-satherton changed the base branch from release-7.0 to master November 10, 2021 16:24
@foundationdb-ci

AWS CodeBuild CI Report for Linux CentOS 7

  • CodeBuild project: foundationdb-pr
  • Commit ID: 80707a0
  • Result: FAILED
  • Error: Error while executing command: cmake -S . -B ${BUILD_DIR} -D USE_CCACHE=ON -D USE_WERROR=ON -D RocksDB_ROOT=/opt/rocksdb-6.22.1 -D RUN_JUNIT_TESTS=ON -D RUN_JAVA_INTEGRATION_TESTS=ON -G Ninja. Reason: exit status 1
  • Build Logs (available for 30 days)

@foundationdb-ci

AWS CodeBuild CI Report for macOS Catalina 10.15

  • CodeBuild project: foundationdb-pr-macos
  • Commit ID: 80707a0
  • Result: SUCCEEDED
  • Error: N/A
  • Build Logs (available for 30 days)

fdbserver/WorkerInterface.actor.h (resolved)
self->filePageCountPending = newPageCount;

// Wait for any previous extensions to complete
wait(self->fileExtension);

Just to confirm my understanding: this works together with line 2607 updating self->fileExtension to this future because extendToCover will run up to the wait, passing the current self->fileExtension to that wait, before self->fileExtension is updated on line 2607?
This seems like a weird interaction with futures that might at least warrant a comment or something?
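
The ordering in question can be modeled in plain C++ without Flow. In this sketch a `std::function` stands in for the future, and the names `PagerModel`, `extendTo`, and `completed` are hypothetical. The point it shows is that the new operation takes a copy of the previous one before the member is overwritten, so each extension waits on its predecessor rather than on itself.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for the pager: fileExtension models "the most recent
// extension operation"; invoking it models waiting for that operation.
struct PagerModel {
    std::function<void()> fileExtension = [] {}; // initially nothing pending
    std::vector<int> completed;                  // order in which extensions finished

    void extendTo(int pages) {
        // Copy the previous operation BEFORE the member is reassigned.
        // Capturing `this->fileExtension` lazily instead would make the new
        // operation wait on itself.
        std::function<void()> previous = fileExtension;
        fileExtension = [this, previous, pages] {
            previous();                  // wait for any previous extension
            completed.push_back(pages);  // then perform this one
        };
    }
};
```

Passing the previous operation in explicitly, as a parameter or a captured copy, makes the dependency visible at the call site instead of relying on evaluation order.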

…n future as a parameter. Fixed initialization warning.
@foundationdb-ci

AWS CodeBuild CI Report for macOS Catalina 10.15

  • CodeBuild project: foundationdb-pr-macos
  • Commit ID: 751e36f
  • Result: SUCCEEDED
  • Error: N/A
  • Build Logs (available for 30 days)

@foundationdb-ci

AWS CodeBuild CI Report for Linux CentOS 7

  • CodeBuild project: foundationdb-pr
  • Commit ID: 751e36f
  • Result: SUCCEEDED
  • Error: N/A
  • Build Logs (available for 30 days)

3 participants