-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Correct lock ASSERTs in vdev_label_read/write #5983
Conversation
@ofaaland, thanks for your PR! By analyzing the history of the files in this pull request, we identified @behlendorf, @dpquigl and @grwilson to be potential reviewers. |
I don't see how this would work.
|
@tuxoko , I'm not sure what the concern is, so if my answers below address the wrong things, let me know.
|
I see, I didn't expect that. |
I double-checked spa_misc.c, and However, perhaps I'm being too clever and should make the ASSERT check for either
So it's clearer to someone reading it what is going on, without making them look at spa_config_* in detail. Thoughts? |
The usage you've proposed in the existing PR already appears all over the code base so I don't think it will be too confusing. That said, I personally find your updated version easier to understand so let's go with that. |
The existing assertions in vdev_label_read() and vdev_label_write(), testing which config locks are held, are incorrect. The assertions test for locks which exceed what is required for safety. Both vdev_label_{read,write}() are changed to assert SCL_STATE is held as RW_READER or RW_WRITER. This is safe because: Changes to the vdev tree occur under SCL_ALL as RW_WRITER, via spa_vdev_enter() and spa_vdev_exit(). Changes to vdev state occur under SCL_STATE_ALL as RW_WRITER, via spa_vdev_state_enter() and spa_vdev_state_exit(). Therefore, the new assertions guarantee that the vdev cannot change out from under a zio, and I/O to a specified leaf vdev's label is safe. Furthermore, this is consistent with the SPA locking discussion in spa_misc.c, "For any zio operation that takes an explicit vdev_t argument ... zio_read_phys(), or zio_write_phys() ... SCL_STATE as reader suffices." Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
cae791b
to
082f524
Compare
ASSERT(spa_config_held(zio->io_spa, SCL_ALL, RW_WRITER) == SCL_ALL || | ||
(spa_config_held(zio->io_spa, SCL_CONFIG | SCL_STATE, RW_READER) == | ||
(SCL_CONFIG | SCL_STATE) && | ||
dsl_pool_sync_context(spa_get_dsl(zio->io_spa)))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the dsl_pool_sync_context
intentionally removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
dsl_pool_sync_context() == B_TRUE in two cases; one is that the caller is the sync thread or a thread performing a sync task. Based on all vdev state and vdev tree changes occurring under RW_WRITER, doing I/O is safe under only RW_READER regardless of which thread is doing it.
The other case, is that spa.spa_is_initializing == B_TRUE. That field is only true during calls to dsl_pool_create() and dsl_pool_open(), within spa_create() and spa_load_impl() respectively, both of which are setting up facilities that aren't necessary for reading from or writing to the labels on leaf vdevs.
I believe this was simply copied from the code protecting the dirty state or config lists, where the list is altered by the sync thread with only the read lock held. In the context of protecting those lists, the restriction makes sense - you can guarantee there is only one writer, since there is only one sync thread at the point where those lists are modified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tuxoko ,
Do you approve of the patch now? Or do you have concerns I haven't addressed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I didn't disapprove just have some questions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thank you
Description
Both vdev_label_{read,write}() are changed to assert SCL_STATE is held
as RW_READER, instead of assertions which required more config locks be held.
Motivation and Context
The existing assertions in vdev_label_read() and vdev_label_write(),
testing which config locks are held, are incorrect. The assertions
test for locks which exceed what is required for safety.
This is safe because:
Changes to the vdev tree occur under SCL_ALL as RW_WRITER, via
spa_vdev_enter() and spa_vdev_exit().
Changes to vdev state occur under SCL_STATE_ALL as RW_WRITER, via
spa_vdev_state_enter() and spa_vdev_state_exit().
Therefore, the new assertions guarantee that the vdev cannot change
out from under a zio, and I/O to a specified leaf vdev's label is
safe.
Furthermore, this is consistent with the SPA locking discussion in
spa_misc.c, "For any zio operation that takes an explicit vdev_t
argument ... zio_read_phys(), or zio_write_phys() ... SCL_STATE as
reader suffices."
How Has This Been Tested?
Built ZFS with --enable-debug and ran zloop for about 2 hours.
Types of changes
Checklist:
Signed-off-by
.