
task txg_sync:19098 blocked for more than 120 seconds. #5692

Closed
krichter722 opened this issue Jan 29, 2017 · 13 comments

Labels
Type: Performance Performance improvement or performance problem

Comments


krichter722 commented Jan 29, 2017

System information

Type Version/Name
Distribution Name Ubuntu
Distribution Version 16.10
Linux Kernel 4.8.0-34-generic
Architecture amd64
ZFS Version 0.7.0-rc3_2_gbb7ffda
SPL Version 0.7.0-rc3

Describe the problem you're observing

After 10 to 30 minutes of heavy read and write operations, I/O for all processes alternates between a state where it proceeds slowly and a state where it stalls entirely (or progresses in barely noticeable increments) for seconds up to several minutes. This results in freezes of GUI and console programs.

In iotop I see that read and write rates rarely exceed a few 100 KB/s. If the system rests for several minutes with no heavy I/O requested, the read and write rates suddenly go back up to the usual ~100 MB/s, the delays disappear and the situation recovers. Generating another burst of heavy read and write requests puts the system back into the same state, and it recovers with the same symptoms.

While I/O alternates between slow and stalled, I see > 100 z_fr_iss_[n] threads in iotop taking 100 % of the I/O capacity. Since my HDD is quite loud during intensive head movement, I can hear when the system starts recovering. Is it really a good idea to have so many threads running, given that they all need to be managed and that a thread could serve more than one task by exchanging thread stacks?
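
A rough way to observe this state from the command line (a sketch; iostat comes from the sysstat package and the 2-second interval is arbitrary):

# count the ZFS free-issue taskq threads currently present
ps -eLo comm | grep -c '^z_fr_iss'

# watch per-device utilization while the stall is happening
iostat -x 2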

Describe how to reproduce the problem

Start 2 or more VirtualBox VMs for example.
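
If no VMs are at hand, a synthetic load generator such as fio should produce a comparable mix of heavy reads and writes (a sketch, not the reproducer I used; /tank/scratch is a placeholder for a directory on the pool):

# sustained mixed random read/write load on the pool for 10 minutes
fio --name=heavyload --directory=/tank/scratch --rw=randrw --bs=4k \
    --size=2G --numjobs=4 --time_based --runtime=600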

Include any warning/errors/backtraces from the system logs

dmesg contains

[Jan29 11:13] INFO: task txg_sync:19098 blocked for more than 120 seconds.
[  +0,000004]       Tainted: P           OE   4.8.0-34-generic #36-Ubuntu
[  +0,000001] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  +0,000002] txg_sync        D ffff9072ed32bb48     0 19098      2 0x00000000
[  +0,000003]  ffff9072ed32bb48 00ff9072ed32bb90 ffff9073ace71d00 ffff9073136d9d00
[  +0,000002]  ffff90732598c1c0 ffff9072ed32c000 ffff9073bf259300 7fffffffffffffff
[  +0,000002]  ffff9071dd8b1b28 0000000000000001 ffff9072ed32bb60 ffffffffb4697b15
[  +0,000002] Call Trace:
[  +0,000005]  [<ffffffffb4697b15>] schedule+0x35/0x80
[  +0,000002]  [<ffffffffb469b2ca>] schedule_timeout+0x22a/0x3f0
[  +0,000046]  [<ffffffffc0ae0d50>] ? zio_reexecute+0x380/0x380 [zfs]
[  +0,000002]  [<ffffffffb3efcb21>] ? ktime_get+0x41/0xb0
[  +0,000002]  [<ffffffffb4697304>] io_schedule_timeout+0xa4/0x110
[  +0,000006]  [<ffffffffc08c1b32>] cv_wait_common+0xb2/0x130 [spl]
[  +0,000002]  [<ffffffffb3ec7560>] ? wake_atomic_t_function+0x60/0x60
[  +0,000004]  [<ffffffffc08c1c08>] __cv_wait_io+0x18/0x20 [spl]
[  +0,000035]  [<ffffffffc0ae38ec>] zio_wait+0xec/0x190 [zfs]
[  +0,000030]  [<ffffffffc0a825f3>] spa_sync+0x453/0xd70 [zfs]
[  +0,000032]  [<ffffffffc0a93894>] txg_sync_thread+0x2c4/0x480 [zfs]
[  +0,000032]  [<ffffffffc0a935d0>] ? txg_delay+0x160/0x160 [zfs]
[  +0,000004]  [<ffffffffc08bc6d1>] thread_generic_wrapper+0x71/0x80 [spl]
[  +0,000003]  [<ffffffffc08bc660>] ? __thread_exit+0x20/0x20 [spl]
[  +0,000002]  [<ffffffffb3ea4128>] kthread+0xd8/0xf0
[  +0,000002]  [<ffffffffb469c4df>] ret_from_fork+0x1f/0x40
[  +0,000001]  [<ffffffffb3ea4050>] ? kthread_create_on_node+0x1e0/0x1e0

This is logged every 2 minutes (+/- a few ms).

I built spl and zfs from source.

The pool consists of a single partition, with one cache partition attached.
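
For reference, a pool with that layout would be created roughly like this (a sketch only; the pool name and partition paths are placeholders, not my actual devices):

# single-partition pool with an L2ARC cache partition
zpool create tank /dev/sda2 cache /dev/sdb1
zpool status tank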

@behlendorf behlendorf added the Type: Performance Performance improvement or performance problem label Jan 30, 2017
@behlendorf
Contributor

@krichter722 thank you for the detailed issue. In general, the large number of z_fr_iss threads which get created allows ZFS to handle the outstanding work more efficiently. However, there may be cases like the one you're seeing which can be improved.

From the description you've provided it sounds like you've run into the issue described in #5449. Internally ZFS is allowing too many free requests to be assigned to a single TXG. This effectively stalls the writes until all the frees are processed.

If you're comfortable with git and building ZFS from source I'd encourage you to try applying the patch in #5449 to verify that it resolves the issue.
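
The general procedure would look roughly like the following (a sketch; it assumes a matching spl build is already installed and the kernel headers are available):

# fetch the branch behind PR #5449 and build the patched module
git clone https://github.com/zfsonlinux/zfs.git
cd zfs
git fetch origin pull/5449/head:pr-5449
git checkout pr-5449
./autogen.sh && ./configure && make -j"$(nproc)"
sudo make install
# reload the zfs/spl modules (or simply reboot) so the patched code is used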

@avnik

avnik commented Jan 31, 2017

@behlendorf could the traces which I posted to zfs-discuss a couple of days ago be related to the same problem? http://list.zfsonlinux.org/pipermail/zfs-discuss/2017-January/027254.html

Replies on the list hint at hardware issues, but it may also be a different symptom of the same problem.

@behlendorf
Contributor

Yes, it's possible, but hard to say for certain.

@krichter722
Author

From the description you've provided it sounds like you've run into the issue described in #5449. Internally ZFS is allowing too many free requests to be assigned to a single TXG. This effectively stalls the writes until all the frees are processed.

If you're comfortable with git and building ZFS from source I'd encourage you to try applying the patch in #5449 to verify that it resolves the issue.

After a superficial investigation with the kind of workload that would have caused the issue in the previously used version, spl-0.7.0-rc3-2-g97048200 and zfs-0.7.0-rc3-57-g544b8053db don't exhibit this behaviour. Thank you.

@behlendorf
Contributor

@krichter722 thanks for verifying that the patch does address the issue. I'll close this issue then.

@avnik

avnik commented Feb 2, 2017

@behlendorf this possibly also fixes my case.

(Although it was not a very clean experiment; I upgraded the kernel to 4.9.6 at the same time.)

UPDATE: It behaves more smoothly, but the printk's still show up.

@krichter722
Author

krichter722 commented Feb 9, 2017

After a superficial investigation with the kind of workload that would have caused the issue in the previously used version, spl-0.7.0-rc3-2-g97048200 and zfs-0.7.0-rc3-57-g544b8053db don't exhibit this behaviour. Thank you.

So much for the superficial investigation. The mentioned versions only delay the described behaviour; under heavy load I still experience the same behaviour and see the same stack trace in dmesg. I assume that ZFS is supposed to perform adequately even under heavy load (I'm exaggerating now with 3 VMs, a browser, a mail client, a video player and the NetBeans IDE running in parallel) on a not overly sophisticated setup (one HDD partition with one SSD cache on one machine, which admittedly already contradicts the recommendation to use whole devices), but FYI the issue persists.

I guess that ext4 would handle this well - maybe not with the same I/O rates, and certainly without any of ZFS's features, but it wouldn't starve processes of I/O this much.

@avnik did you ever see the issue again in the meantime?

@avnik

avnik commented Feb 9, 2017

@krichter722 I've also seen them, but fewer, and they stopped soft-locking the system after I picked 0a252da. I also experience low linear read performance (not enough for 720p video playback).

@behlendorf
Contributor

It sounds as if under certain workloads we're allowing too much I/O to be added to a transaction group. @krichter722 just so I understand, in spite of the stacks the system does recover after it clears the I/O backlog, correct? @avnik are you saying that after 0a252da performance has degraded?
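
If it helps with diagnosing this, the per-TXG history can be watched through the SPL kstats while a stall happens (a sketch; 'tank' is a placeholder pool name):

# recent transaction groups: bytes read/written, queue/wait/sync times
cat /proc/spl/kstat/zfs/tank/txgs

# general ARC statistics for context
head -n 25 /proc/spl/kstat/zfs/arcstats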

@avnik

avnik commented Feb 9, 2017

@behlendorf in my case linear read performance (video playback) degraded after moving from xfs on the same hardware.

0a252da improves the situation we are discussing here, so a big I/O backlog no longer freezes the whole system. (Sorry for the unclear previous message.)

@behlendorf
Contributor

@avnik thanks for the clarification. You might try setting the recordsize property to 1M to improve your sequential read performance for large files. Keep in mind you'll need to copy the file for it to make use of the larger block size.
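
Concretely, something along these lines (a sketch; the dataset and file names are placeholders, and recordsize=1M needs the large_blocks pool feature):

# raise the record size for the dataset holding the large files
zfs set recordsize=1M tank/videos
zfs get recordsize tank/videos

# existing files keep their old block size; rewrite them so the new size takes effect
cp /tank/videos/movie.mkv /tank/videos/movie.mkv.new
mv /tank/videos/movie.mkv.new /tank/videos/movie.mkv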

@krichter722
Author

@krichter722 just so I understand, in spite of the stacks the system does recover after it clears the I/O backlog, correct?

@behlendorf The system always recovers and I didn't experience any hard lockups. Some applications crash because they hit a timeout error or experience other trouble (e.g. VirtualBox sends an ATA bus error and the guest OS sets its filesystem read-only), but that's not ZFS's business.

I'm now using zfs-0.7.0-rc3-81-g5f0517cadd, where the issue occurs less frequently, as described above.

@krichter722
Author

You might try setting the recordsize property to 1M to improve your sequential read performance for large files. Keep in mind you'll need to copy the file for it to make use of the larger block size.

It doesn't help in my case (in case that was directed at me as well) when using VirtualBox vdi images, even after changing the property and then copying the files.
