-
Notifications
You must be signed in to change notification settings - Fork 36.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable lock contention logging in checkqueue_tests #23223
Disable lock contention logging in checkqueue_tests #23223
Conversation
This is the resulting log file with this pull. It is the same modulo no contention logging. log
|
You've always been able to run with logging? #22736 & #22904 just changed the logging to be unconditional, rather than needing to compile with
Concept NACK. Wouldn't it be better to either further debug/fix the actual issue, or revert the broken change(s), rather than just ignoring the problem by turning off the logging? There has already been one comment in regards to these changes (#22736 & #22904) being: |
the checkqueue tests are designed to be hevily contested, since we're hoping to trigger any race conditions or other issues, I don't think that if we're logging normal operation contentions we can really 'fix' the tests without making the tests less likely to uncover bugs. maybe we can move checkqueue tests to fuzzing since it's a bit more fuzz like? |
Right, in which case it didn't make sense to log contentions in the checkqueue tests before, either. Logging contentions by default in the tests only revealed the issue, which was the goal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the large log file can be reproduced on prior commits (e.g. v22.0) with DDEBUG_LOCKCONTENTION and DEBUG_LOG_OUT?
Just finished running these steps:
They show that (for me) the issue was pre-existing.
Edit: building v22.0 to test. |
as some of these tests are designed to be heavily contested to trigger race conditions or other issues. This created very large log files when run with DEBUG_LOCKCONTENTION defined (up to v22) or with lock logging enabled by default in current master. Examples running the following command: ./src/test/test_bitcoin -t checkqueue_tests/test_CheckQueue_Correct_Random -- DEBUG_LOG_OUT > testlog.txt -rw-r--r-- 87042178 Oct 8 12:41 testlog-with-DEBUG_LOCKCONTENTION-at-v22-run1.txt -rw-r--r-- 73879896 Oct 8 12:42 testlog-with-DEBUG_LOCKCONTENTION-at-v22-run2.txt -rw-r--r-- 65150518 Oct 8 12:51 testlog-with-DEBUG_LOCKCONTENTION-at-bb9f76a-run1.txt -rw-r--r-- 65774554 Oct 8 12:52 testlog-with-DEBUG_LOCKCONTENTION-at-bb9f76a-run2.txt -rw-r--r-- 73493309 Oct 8 13:00 testlog-current-master-at-991753e-run1.txt -rw-r--r-- 65616977 Oct 8 13:01 testlog-current-master-at-991753e-run2.txt -rw-r--r-- 5093 Oct 8 13:04 testlog-with-this-commit-run1.txt -rw-r--r-- 5093 Oct 8 13:05 testlog-with-this-commit-run2.txt
c48dc17
to
6ae9f1c
Compare
Updated per @MarcoFalke's suggestion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 6ae9f1c
Some testing results in #23167 (comment)
459e208 Exit early for an empty vChecks in CCheckQueue::Add (Hennadii Stepanov) c43aa62 Avoid excessive lock contention in CCheckQueue::Add (Hennadii Stepanov) Pull request description: This PR significantly reduces lock contention in the `CCheckQueue` class by releasing a mutex before calling `std::condition_variable::notify_one` and `std::condition_variable::notify_all`. From C++ [docs](https://en.cppreference.com/w/cpp/thread/condition_variable/notify_one): > The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock. Related to: - #23167 - #23223 ACKs for top commit: martinus: ACK 459e208, codereview and tested. I first thought this introduced a segfault in `psbt_wallet_tests/psbt_updater_test` because that test failed for me, but thats a different issue fixed in #23403. vasild: ACK 459e208 theStack: Code-review ACK 459e208 Tree-SHA512: c197858656392ba3ebcd638d713cf93c9fb48b7b3bad193209490d2828f9c7e3ae4dee6f84674f2f34dceed894139562e29579ee7299e06756c8c990caddc5ed
459e208 Exit early for an empty vChecks in CCheckQueue::Add (Hennadii Stepanov) c43aa62 Avoid excessive lock contention in CCheckQueue::Add (Hennadii Stepanov) Pull request description: This PR significantly reduces lock contention in the `CCheckQueue` class by releasing a mutex before calling `std::condition_variable::notify_one` and `std::condition_variable::notify_all`. From C++ [docs](https://en.cppreference.com/w/cpp/thread/condition_variable/notify_one): > The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock. Related to: - bitcoin#23167 - bitcoin#23223 ACKs for top commit: martinus: ACK 459e208, codereview and tested. I first thought this introduced a segfault in `psbt_wallet_tests/psbt_updater_test` because that test failed for me, but thats a different issue fixed in bitcoin#23403. vasild: ACK 459e208 theStack: Code-review ACK 459e208 Tree-SHA512: c197858656392ba3ebcd638d713cf93c9fb48b7b3bad193209490d2828f9c7e3ae4dee6f84674f2f34dceed894139562e29579ee7299e06756c8c990caddc5ed
459e208 Exit early for an empty vChecks in CCheckQueue::Add (Hennadii Stepanov) c43aa62 Avoid excessive lock contention in CCheckQueue::Add (Hennadii Stepanov) Pull request description: This PR significantly reduces lock contention in the `CCheckQueue` class by releasing a mutex before calling `std::condition_variable::notify_one` and `std::condition_variable::notify_all`. From C++ [docs](https://en.cppreference.com/w/cpp/thread/condition_variable/notify_one): > The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock. Related to: - bitcoin#23167 - bitcoin#23223 ACKs for top commit: martinus: ACK 459e208, codereview and tested. I first thought this introduced a segfault in `psbt_wallet_tests/psbt_updater_test` because that test failed for me, but thats a different issue fixed in bitcoin#23403. vasild: ACK 459e208 theStack: Code-review ACK 459e208 Tree-SHA512: c197858656392ba3ebcd638d713cf93c9fb48b7b3bad193209490d2828f9c7e3ae4dee6f84674f2f34dceed894139562e29579ee7299e06756c8c990caddc5ed
459e208 Exit early for an empty vChecks in CCheckQueue::Add (Hennadii Stepanov) c43aa62 Avoid excessive lock contention in CCheckQueue::Add (Hennadii Stepanov) Pull request description: This PR significantly reduces lock contention in the `CCheckQueue` class by releasing a mutex before calling `std::condition_variable::notify_one` and `std::condition_variable::notify_all`. From C++ [docs](https://en.cppreference.com/w/cpp/thread/condition_variable/notify_one): > The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock. Related to: - bitcoin#23167 - bitcoin#23223 ACKs for top commit: martinus: ACK 459e208, codereview and tested. I first thought this introduced a segfault in `psbt_wallet_tests/psbt_updater_test` because that test failed for me, but thats a different issue fixed in bitcoin#23403. vasild: ACK 459e208 theStack: Code-review ACK 459e208 Tree-SHA512: c197858656392ba3ebcd638d713cf93c9fb48b7b3bad193209490d2828f9c7e3ae4dee6f84674f2f34dceed894139562e29579ee7299e06756c8c990caddc5ed
459e208 Exit early for an empty vChecks in CCheckQueue::Add (Hennadii Stepanov) c43aa62 Avoid excessive lock contention in CCheckQueue::Add (Hennadii Stepanov) Pull request description: This PR significantly reduces lock contention in the `CCheckQueue` class by releasing a mutex before calling `std::condition_variable::notify_one` and `std::condition_variable::notify_all`. From C++ [docs](https://en.cppreference.com/w/cpp/thread/condition_variable/notify_one): > The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock. Related to: - bitcoin#23167 - bitcoin#23223 ACKs for top commit: martinus: ACK 459e208, codereview and tested. I first thought this introduced a segfault in `psbt_wallet_tests/psbt_updater_test` because that test failed for me, but thats a different issue fixed in bitcoin#23403. vasild: ACK 459e208 theStack: Code-review ACK 459e208 Tree-SHA512: c197858656392ba3ebcd638d713cf93c9fb48b7b3bad193209490d2828f9c7e3ae4dee6f84674f2f34dceed894139562e29579ee7299e06756c8c990caddc5ed
…or directive 4394733 Add DEBUG_LOCKCONTENTION documentation to the developer notes (Jon Atack) 39a34b6 Put lock logging behind DEBUG_LOCKCONTENTION preprocessor directive (Jon Atack) Pull request description: This is a more minimal, no-frills version of #24734 for backport. The other fixes and improvements in that pull can be done after. *Copy of the PR 24734 description:* PRs #22736, #22904 and #23223 changed lock contention logging from a `DEBUG_LOCKCONTENTION` compile-time preprocessor directive to a runtime `lock` log category and improved the logging output. This changed the locking from using `lock()` to `try_lock()`: - `void Mutex::UniqueLock::lock()` acquires the mutex and blocks until it gains access to it - `bool Mutex::UniqueLock::try_lock()` doesn't block but instead immediately returns whether it acquired the mutex; it may be used by `lock()` internally as part of the deadlock-avoidance algorithm In theory the cost of `try_lock` might be essentially the [same](https://www.erisian.com.au/bitcoin-core-dev/log-2022-03-31.html#l-697) relative to `lock`. The test-and-set logic of these calls is purported to be ~ constant time, optimised and light/quick if used carefully (i.e. no mutex convoying), compared to system calls, memory/cache coherency and fences, wait queues, and (particularly) lock contentions. See the discussion around #22736 (comment) and after with respect to performance/cost aspects. However, there are reasonable concerns (see [here](#22736 (comment)) and [here](https://www.erisian.com.au/bitcoin-core-dev/log-2022-03-31.html#l-620)) that `Base::try_lock()` may be potentially [costly](https://www.erisian.com.au/bitcoin-core-dev/log-2022-03-31.html#l-700) or [risky](#22904 (comment)) compared to `Base::lock()` in this very frequently called code. One alternative to keep the run-time lock logging would be to gate the `try_lock` call behind the logging conditional, for example as proposed in ccd73de and ACKed [here](#22736 (comment)). However, this would add the [cost](#22736 (comment)) of `if (LogAcceptCategory(BCLog::LOCK))` to the hotspot, instead of replacing `lock` with `try_lock`, for the most frequent happy path (non-contention). It turns out we can keep the advantages of the runtime lock contention logging (the ability to turn it on/off at runtime) while out of prudence putting the `try_lock()` call and `lock` logging category behind a `DEBUG_LOCKCONTENTION` compile-time preprocessor directive, and also still retain the lock logging enhancements of the mentioned PRs, as suggested in #24734 (comment) by W. J. van der Laan, in #22736 (comment), and in the linked IRC discussion. Proposed here and for backport to v23. ACKs for top commit: laanwj: Code review ACK 4394733 Tree-SHA512: 89b1271cae1dca0eb251914b1a60fc5b68320aab4a3939c57eec3a33a3c8f01688f05d95dfc31f91d71a6ed80cfe2d67b77ff14742611cc206175e47b2e5d3b1
459e208 Exit early for an empty vChecks in CCheckQueue::Add (Hennadii Stepanov) c43aa62 Avoid excessive lock contention in CCheckQueue::Add (Hennadii Stepanov) Pull request description: This PR significantly reduces lock contention in the `CCheckQueue` class by releasing a mutex before calling `std::condition_variable::notify_one` and `std::condition_variable::notify_all`. From C++ [docs](https://en.cppreference.com/w/cpp/thread/condition_variable/notify_one): > The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock. Related to: - bitcoin#23167 - bitcoin#23223 ACKs for top commit: martinus: ACK 459e208, codereview and tested. I first thought this introduced a segfault in `psbt_wallet_tests/psbt_updater_test` because that test failed for me, but thats a different issue fixed in bitcoin#23403. vasild: ACK 459e208 theStack: Code-review ACK 459e208 Tree-SHA512: c197858656392ba3ebcd638d713cf93c9fb48b7b3bad193209490d2828f9c7e3ae4dee6f84674f2f34dceed894139562e29579ee7299e06756c8c990caddc5ed
Summary: This diff: - adds a LOCK logging category if DEBUG_LOCKCONTENTION is defined at compilation. - adds a timing macro in microseconds, `LOG_TIME_MICROS_WITH_CATEGORY` - updates `BCLog::LogMsg()` to omit irrelevant decimals for microseconds and skip unneeded code and math. - improves the lock contention logging, drops the all-caps, and displays the duration in microseconds - makes unexpected time type in `BCLog::LogMsg()` a compile-time error - add a paragraph of documentation to developer-notes.md Lock contention logging in checkqueue_tests is disabled, as some of these tests are designed to be heavily contested to trigger race conditions or other issues. This created very large log files when run with DEBUG_LOCKCONTENTION defined. This is a backport of [[bitcoin/bitcoin#22736 | core#22736]], [[bitcoin/bitcoin#22904 | core#22904]], [[bitcoin/bitcoin#23223 | core#23223]] and [[bitcoin/bitcoin#24770 | core#24770]] The first pull request implements the lock category, the other pull requests are important follow-ups (mixed with minor refactorings) fixing various regressions (bugs and suspected performance issues). Co-authored-by: Hennadii Stepanov <32963518+hebasto@users.noreply.github.com> Co-authored-by: practicalswift <practicalswift@users.noreply.github.com> Co-authored-by: Martin Ankerl <martin.ankerl@gmail.com> Test Plan: `ninja all check-all` Follow the instructions in the new documentation and check that the logging works: ``` cmake .. -GNinja -DCMAKE_CXX_FLAGS="-DDEBUG_LOCKCONTENTION" ninja src/bitcoind -h | grep -A8 "debug=<category>" src/bitcoind -debug=lock ``` Reviewers: #bitcoin_abc, sdulfari Reviewed By: #bitcoin_abc, sdulfari Subscribers: sdulfari Differential Revision: https://reviews.bitcoinabc.org/D12049
This patch disables lock contention logging in the checkqueue_tests as some of these tests are designed to be heavily contested to trigger race conditions or other issues. This created very large log files when run with DEBUG_LOCKCONTENTION defined (up to v22) or with lock logging enabled by default in current master.
Examples running the following command:
Resolves #23167.