zbd/012: Test requeuing of zoned writes and queue freezing #154

Merged 1 commit on Dec 16, 2024

bvanassche
Contributor

Test concurrent requeuing of zoned writes and request queue freezing. While
this test passes with kernel 6.9, it triggers a hang with kernels 6.10..6.12.
This shows that this hang is a regression introduced by the zone write
plugging code.

sysrq: Show Blocked State
task:(udev-worker)   state:D stack:0     pid:75392 tgid:75392 ppid:2178   flags:0x00000006
Call Trace:
 <TASK>
 __schedule+0x3e8/0x1410
 schedule+0x27/0xf0
 blk_mq_freeze_queue_wait+0x6f/0xa0
 queue_attr_store+0x60/0xc0
 kernfs_fop_write_iter+0x13e/0x1f0
 vfs_write+0x25b/0x420
 ksys_write+0x65/0xe0
 do_syscall_64+0x82/0x160
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
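
For anyone scanning kernel logs for this hang signature, here is a minimal Python sketch that lists D-state tasks whose call trace contains `blk_mq_freeze_queue_wait`. It assumes the "Show Blocked State" format quoted above; the function name `tasks_waiting_on_freeze` and the regular expressions are illustrative and not part of blktests.

```python
import re

# Header line printed for each task in a "Show Blocked State" dump,
# e.g. "task:(udev-worker) state:D stack:0 pid:75392 ...".
TASK_RE = re.compile(
    r"task:(?P<comm>\S+)\s+state:(?P<state>\S+)\s+stack:\d+\s+pid:(?P<pid>\d+)"
)
# Stack frame such as "blk_mq_freeze_queue_wait+0x6f/0xa0".
FRAME_RE = re.compile(r"(?P<func>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def tasks_waiting_on_freeze(log_text, symbol="blk_mq_freeze_queue_wait"):
    """Return (comm, pid) for each D-state task whose trace contains symbol."""
    hits = []
    current = None  # (comm, pid) of the task whose frames are being read
    for line in log_text.splitlines():
        header = TASK_RE.search(line)
        if header:
            # A new task header starts a new trace; only track D-state tasks.
            current = None
            if header["state"] == "D":
                current = (header["comm"], int(header["pid"]))
            continue
        frame = FRAME_RE.search(line)
        if frame and current and frame["func"] == symbol and current not in hits:
            hits.append(current)
    return hits
```

Feeding the trace from this description to `tasks_waiting_on_freeze()` should report the `(udev-worker)` task. A match only indicates that a task was waiting for a queue freeze when the dump was taken, not that the wait would never finish.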
@bvanassche
Contributor Author

@kawasaki @damien-lemoal Please take a look.

@damien-lemoal
Contributor

Bart,

The zone write plugging fixes that I posted and that Jens applied should fix this. Please try again with these patches to check.

@bvanassche
Contributor Author

Thanks for taking a look, Damien. I have already verified that your patches fix the reported issue. Since the kernel versions referred to above have already been released and can no longer be modified, I think the references to those kernel versions are correct.

@damien-lemoal
Contributor

Yes. The fixes are marked for stable so they will be backported to stable 6.12. 6.10 and 6.11 are already EOL so they will not get the backports.

@bvanassche
Contributor Author

With "6.10..6.12" I want to refer to kernel versions 6.10.0, 6.11.0 and 6.12.0. In other words, I want to exclude later versions that include backports. Please let me know if you want me to include this clarification in the patch description.
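
To make that reading of "6.10..6.12" concrete, here is a small Python sketch that checks whether a release string is one of the three initial releases named above. The names `REPORTED_RELEASES`, `parse_release` and `in_reported_range` are hypothetical, and the parser ignores any suffix after the numeric part.

```python
import re

# Assumption: only the initial releases named in the comment above count.
REPORTED_RELEASES = {(6, 10, 0), (6, 11, 0), (6, 12, 0)}


def parse_release(release):
    """Parse a 'uname -r' style string such as '6.12.0-foo' into (6, 12, 0)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = m.groups()
    # A missing patch level ("6.11") is treated as ".0".
    return int(major), int(minor), int(patch or 0)


def in_reported_range(release):
    """True only for 6.10.0, 6.11.0 and 6.12.0.

    False does not prove a kernel is fixed; it only means the release is not
    one of the versions named in the patch description.
    """
    return parse_release(release) in REPORTED_RELEASES
```

For example, `in_reported_range("6.12.0")` is true, while `in_reported_range("6.9.0")` and `in_reported_range("6.12.5")` are false.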

@kawasaki
Collaborator

@bvanassche Thanks for this PR. The code looks good to me. I will wait until the kernel-side fix is upstreamed (v6.13-rc3 or rc4, hopefully?) and then merge this PR.

@bvanassche
Contributor Author

I think that Linus just pulled Damien's kernel patches :-) See also https://lore.kernel.org/linux-block/d2acd3cb-188c-4ad5-91db-efbc6e50a1c1@kernel.dk/.

@kawasaki kawasaki merged commit 22f21d6 into osandov:master Dec 16, 2024
5 checks passed
@kawasaki
Collaborator

Yes, the fix is now upstreamed and tagged with v6.13-rc3. I have merged this PR. Thanks!

@yizhanglinux
Contributor

@kawasaki @bvanassche To run this test case, it seems we also need _io_uring_enable/_io_uring_restore for the fio tests.

@bvanassche
Contributor Author

I will switch to libaio since io_uring does not preserve the write submission order.
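
As a rough illustration of that switch, the Python sketch below assembles a fio command line for sequential direct writes to a zoned device using the `libaio` engine. It is not the job used by zbd/012: the helper `build_fio_args`, the job name, the block size and the queue depth are all assumptions chosen for the example.

```python
import shlex


def build_fio_args(device, ioengine="libaio", iodepth=4):
    """Build a fio command line for sequential direct writes to a zoned device."""
    return [
        "fio",
        "--name=zoned-seq-write",  # hypothetical job name
        f"--filename={device}",
        f"--ioengine={ioengine}",  # libaio instead of io_uring
        "--direct=1",              # direct I/O
        "--zonemode=zbd",          # fio's zoned block device mode
        "--rw=write",              # sequential writes
        "--bs=4k",                 # assumed block size
        f"--iodepth={iodepth}",    # assumed queue depth
    ]


# Print the command instead of running it; /dev/nullb0 is a placeholder device.
print(shlex.join(build_fio_args("/dev/nullb0")))
```

Passing `ioengine="io_uring"` to the same helper gives the variant that would need the `_io_uring_enable`/`_io_uring_restore` handling mentioned above.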
