Stack Trace When Attempting Send | Receive Task Between Two ZoL Hosts #15669

Closed
minorsatellite opened this issue Dec 14, 2023 · 5 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@minorsatellite

minorsatellite commented Dec 14, 2023

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Ubuntu |
| Distribution Version | 22.04.2 LTS |
| Kernel Version | 5.15.0-91-generic |
| Architecture | x64 |
| OpenZFS Version | zfs-2.1.5 |

Describe the problem you're observing

I am attempting to perform a simple send | receive job between two ZoL systems. The sending side is using:

zfs-2.1.6-1
zfs-kmod-2.1.6-1

The replication job starts out fine but eventually ends in a stack trace. How long that takes depends on the transport used (SSH or netcat); with netcat the stack trace appears much sooner, within seconds.
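The exact commands were not posted, so the following is only a minimal sketch of what a send | receive job over SSH typically looks like, written in Python. The snapshot name, host name, and target dataset are hypothetical placeholders, and the sketch assumes plain `zfs send` and `zfs receive` with no extra flags.

```python
import shlex
import subprocess


def build_replication_pipeline(snapshot, remote_host, target_dataset):
    """Return the argv lists for the two halves of a send | ssh receive job."""
    send_cmd = ["zfs", "send", snapshot]
    # ssh joins its trailing arguments into one remote command string,
    # so the dataset name is quoted for the remote shell.
    recv_cmd = ["ssh", remote_host, "zfs", "receive", shlex.quote(target_dataset)]
    return send_cmd, recv_cmd


def run_replication(snapshot, remote_host, target_dataset):
    """Pipe the send stream into the remote receive and return both exit codes."""
    send_cmd, recv_cmd = build_replication_pipeline(
        snapshot, remote_host, target_dataset
    )
    sender = subprocess.Popen(send_cmd, stdout=subprocess.PIPE)
    receiver = subprocess.Popen(recv_cmd, stdin=sender.stdout)
    # Close our copy of the pipe so the sender sees SIGPIPE if the receiver dies.
    sender.stdout.close()
    recv_rc = receiver.wait()
    send_rc = sender.wait()
    return send_rc, recv_rc


if __name__ == "__main__":
    # Hypothetical names, for illustration only; nothing is executed here.
    send_cmd, recv_cmd = build_replication_pipeline(
        "tank/data@snap1", "backup-host", "backup/data"
    )
    print(" ".join(send_cmd), "|", " ".join(recv_cmd))
```

Calling `run_replication` would actually start the transfer, so the example above only prints the pipeline it would run.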

Describe how to reproduce the problem

Repeat the remote replication attempt.

Include any warning/errors/backtraces from the system logs

[ 1844.021258] VERIFY3(0 == zap_add(mos, dsl_dir_phys(pds)->dd_child_dir_zapobj, name, sizeof (uint64_t), 1, &ddobj, tx)) failed (0 == 17)
[ 1844.021678] PANIC at dsl_dir.c:951:dsl_dir_create_sync()
[ 1844.021859] Showing stack for process 7413
[ 1844.021862] CPU: 40 PID: 7413 Comm: txg_sync Tainted: P O 5.15.0-91-generic #101-Ubuntu
[ 1844.021865] Hardware name: Dell Inc. PowerEdge R7425/08V001, BIOS 1.20.0 05/03/2023
[ 1844.021867] Call Trace:
[ 1844.021869]
[ 1844.021874] show_stack+0x52/0x5c
[ 1844.021880] dump_stack_lvl+0x4a/0x63
[ 1844.021888] dump_stack+0x10/0x16
[ 1844.021892] spl_dumpstack+0x29/0x2f [spl]
[ 1844.021905] spl_panic+0xd1/0xe9 [spl]
[ 1844.021916] ? dmu_buf_rele+0xe/0x20 [zfs]
[ 1844.022059] ? zap_unlockdir+0x46/0x60 [zfs]
[ 1844.022183] ? zap_add_impl+0x96/0x160 [zfs]
[ 1844.022306] ? zap_add+0x7b/0xb0 [zfs]
[ 1844.022428] dsl_dir_create_sync+0x1ff/0x280 [zfs]
[ 1844.022531] ? spl_kmem_free_impl+0x29/0x40 [spl]
[ 1844.022541] dsl_dataset_create_sync+0x52/0x380 [zfs]
[ 1844.022641] dmu_recv_begin_sync+0x374/0xa00 [zfs]
[ 1844.022735] ? spa_get_slop_space+0x6e/0xc0 [zfs]
[ 1844.022852] ? __cond_resched+0x1a/0x50
[ 1844.022857] dsl_sync_task_sync+0xb9/0x110 [zfs]
[ 1844.022963] dsl_pool_sync+0x369/0x400 [zfs]
[ 1844.023068] spa_sync_iterate_to_convergence+0xe0/0x1f0 [zfs]
[ 1844.023182] spa_sync+0x2dc/0x5b0 [zfs]
[ 1844.023300] txg_sync_thread+0x266/0x2f0 [zfs]
[ 1844.023420] ? txg_dispatch_callbacks+0x100/0x100 [zfs]
[ 1844.023539] thread_generic_wrapper+0x64/0x80 [spl]
[ 1844.023549] ? __thread_exit+0x20/0x20 [spl]
[ 1844.023558] kthread+0x12a/0x150
[ 1844.023564] ? set_kthread_struct+0x50/0x50
[ 1844.023567] ret_from_fork+0x22/0x30
[ 1844.023572]
[ 2056.123013] INFO: task txg_sync:7413 blocked for more than 120 seconds.
[ 2056.123542] Tainted: P O 5.15.0-91-generic #101-Ubuntu
[ 2056.123970] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2056.124390] task:txg_sync state:D stack: 0 pid: 7413 ppid: 2 flags:0x00004000
[ 2056.124401] Call Trace:
[ 2056.124406]
[ 2056.124414] __schedule+0x24e/0x590
[ 2056.124434] schedule+0x69/0x110
[ 2056.124445] spl_panic+0xe7/0xe9 [spl]
[ 2056.124473] ? dmu_buf_rele+0xe/0x20 [zfs]
[ 2056.124691] ? zap_unlockdir+0x46/0x60 [zfs]
[ 2056.124973] ? zap_add_impl+0x96/0x160 [zfs]
[ 2056.125249] ? zap_add+0x7b/0xb0 [zfs]
[ 2056.125525] dsl_dir_create_sync+0x1ff/0x280 [zfs]
[ 2056.125759] ? spl_kmem_free_impl+0x29/0x40 [spl]
[ 2056.125780] dsl_dataset_create_sync+0x52/0x380 [zfs]
[ 2056.126008] dmu_recv_begin_sync+0x374/0xa00 [zfs]
[ 2056.126218] ? spa_get_slop_space+0x6e/0xc0 [zfs]
[ 2056.126483] ? __cond_resched+0x1a/0x50
[ 2056.126492] dsl_sync_task_sync+0xb9/0x110 [zfs]
[ 2056.126731] dsl_pool_sync+0x369/0x400 [zfs]
[ 2056.126991] spa_sync_iterate_to_convergence+0xe0/0x1f0 [zfs]
[ 2056.127252] spa_sync+0x2dc/0x5b0 [zfs]
[ 2056.127510] txg_sync_thread+0x266/0x2f0 [zfs]
[ 2056.127781] ? txg_dispatch_callbacks+0x100/0x100 [zfs]
[ 2056.128050] thread_generic_wrapper+0x64/0x80 [spl]
[ 2056.128074] ? __thread_exit+0x20/0x20 [spl]
[ 2056.128097] kthread+0x12a/0x150
[ 2056.128108] ? set_kthread_struct+0x50/0x50
[ 2056.128116] ret_from_fork+0x22/0x30
[ 2056.128129]
[ 2056.128174] INFO: task zfs:3276829 blocked for more than 120 seconds.
[ 2056.128614] Tainted: P O 5.15.0-91-generic #101-Ubuntu
[ 2056.129060] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2056.129521] task:zfs state:D stack: 0 pid:3276829 ppid: 1 flags:0x00004002
[ 2056.129529] Call Trace:
[ 2056.129532]
[ 2056.129535] __schedule+0x24e/0x590
[ 2056.129542] ? __wake_up_common_lock+0x8a/0xc0
[ 2056.129552] schedule+0x69/0x110
[ 2056.129558] io_schedule+0x46/0x80
[ 2056.129564] cv_wait_common+0xab/0x130 [spl]
[ 2056.129582] ? wait_woken+0x70/0x70
[ 2056.129589] __cv_wait_io+0x18/0x20 [spl]
[ 2056.129607] txg_wait_synced_impl+0x9b/0x120 [zfs]
[ 2056.129875] txg_wait_synced+0x10/0x50 [zfs]
[ 2056.130142] dsl_sync_task_common+0x1c6/0x2a0 [zfs]
[ 2056.130380] ? recv_begin_check_existing_impl+0x590/0x590 [zfs]
[ 2056.130589] ? recv_check_large_blocks+0x60/0x60 [zfs]
[ 2056.130813] ? recv_begin_check_existing_impl+0x590/0x590 [zfs]
[ 2056.131025] ? recv_check_large_blocks+0x60/0x60 [zfs]
[ 2056.131235] dsl_sync_task+0x1a/0x20 [zfs]
[ 2056.131474] dmu_recv_begin+0x1e2/0x390 [zfs]
[ 2056.131685] zfs_ioc_recv_impl.constprop.0+0x106/0xb20 [zfs]
[ 2056.131964] ? arc_space_return+0x97/0x150 [zfs]
[ 2056.132159] zfs_ioc_recv+0x1b6/0x360 [zfs]
[ 2056.132444] ? spl_kmem_free_impl+0x29/0x40 [spl]
[ 2056.132465] ? spl_kmem_free+0xe/0x20 [spl]
[ 2056.132486] ? __kmalloc_node+0x166/0x3a0
[ 2056.132496] ? spa_close+0x15/0x20 [zfs]
[ 2056.132760] ? spl_kmem_alloc_impl+0x80/0xd0 [spl]
[ 2056.132782] zfsdev_ioctl_common+0x686/0x740 [zfs]
[ 2056.133057] ? __check_object_size.part.0+0x4a/0x150
[ 2056.133069] ? _copy_from_user+0x31/0x70
[ 2056.133080] zfsdev_ioctl+0x57/0xf0 [zfs]
[ 2056.133345] __x64_sys_ioctl+0x95/0xd0
[ 2056.133355] do_syscall_64+0x5c/0xc0
[ 2056.133363] ? exit_to_user_mode_prepare+0x37/0xb0
[ 2056.133372] ? irqentry_exit_to_user_mode+0x17/0x20
[ 2056.133378] ? irqentry_exit+0x1d/0x30
[ 2056.133384] ? exc_page_fault+0x89/0x170
[ 2056.133390] entry_SYSCALL_64_after_hwframe+0x62/0xcc
[ 2056.133399] RIP: 0033:0x7f35105a775f
[ 2056.133405] RSP: 002b:00007ffdf35b57e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 2056.133411] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f35105a775f
[ 2056.133416] RDX: 00007ffdf35b69e0 RSI: 0000000000005a1b RDI: 0000000000000005
[ 2056.133419] RBP: 00007ffdf35b9fd0 R08: 0000000000000000 R09: 000055d7ec5350a0
[ 2056.133422] R10: 00007f35106a6d30 R11: 0000000000000246 R12: 00007ffdf35b69e0
[ 2056.133425] R13: 0000000000000000 R14: 00007ffdf35b59e0 R15: 00007ffdf35bb880
[ 2056.133431]
[ 2056.133446] INFO: task zpool:3277150 blocked for more than 120 seconds.
[ 2056.133919] Tainted: P O 5.15.0-91-generic #101-Ubuntu
[ 2056.134404] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2056.134931] task:zpool state:D stack: 0 pid:3277150 ppid:309265 flags:0x00000002
[ 2056.134938] Call Trace:
[ 2056.134940]
[ 2056.134944] __schedule+0x24e/0x590
[ 2056.134951] schedule+0x69/0x110
[ 2056.134958] cv_wait_common+0xf8/0x130 [spl]
[ 2056.134978] ? wait_woken+0x70/0x70
[ 2056.134987] __cv_wait+0x15/0x20 [spl]
[ 2056.135008] rrw_enter_read_impl+0x61/0x120 [zfs]
[ 2056.135266] rrw_enter_read+0x13/0x20 [zfs]
[ 2056.135519] rrw_enter+0x21/0x30 [zfs]
[ 2056.135770] dsl_pool_config_enter+0x1d/0x30 [zfs]
[ 2056.136004] spa_prop_get+0x9c/0x3c0 [zfs]
[ 2056.136260] ? kernel_init_free_pages.part.0+0x4a/0x70
[ 2056.136270] ? release_pages+0x170/0x510
[ 2056.136280] ? spa_name_compare+0xe/0x30 [zfs]
[ 2056.136547] ? avl_find+0x5b/0x90 [zavl]
[ 2056.136558] ? do_raw_spin_unlock+0x9/0x10 [zfs]
[ 2056.136812] ? __raw_spin_unlock+0x9/0x10 [zfs]
[ 2056.137066] ? spa_open_common+0x1c6/0x490 [zfs]
[ 2056.137324] ? __kmalloc_node+0x166/0x3a0
[ 2056.137332] zfs_ioc_pool_get_props+0x123/0x140 [zfs]
[ 2056.137610] zfsdev_ioctl_common+0x686/0x740 [zfs]
[ 2056.137884] ? __check_object_size.part.0+0x4a/0x150
[ 2056.137893] ? _copy_from_user+0x31/0x70
[ 2056.137901] zfsdev_ioctl+0x57/0xf0 [zfs]
[ 2056.138165] __x64_sys_ioctl+0x95/0xd0
[ 2056.138173] do_syscall_64+0x5c/0xc0
[ 2056.138179] ? do_user_addr_fault+0x1e7/0x670
[ 2056.138189] ? do_syscall_64+0x69/0xc0
[ 2056.138193] ? exit_to_user_mode_prepare+0x37/0xb0
[ 2056.138200] ? irqentry_exit_to_user_mode+0x17/0x20
[ 2056.138207] ? irqentry_exit+0x1d/0x30
[ 2056.138213] ? exc_page_fault+0x89/0x170
[ 2056.138219] entry_SYSCALL_64_after_hwframe+0x62/0xcc
[ 2056.138226] RIP: 0033:0x7f3b87dd375f
[ 2056.138230] RSP: 002b:00007ffeae884fc0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 2056.138236] RAX: ffffffffffffffda RBX: 000055a452e82430 RCX: 00007f3b87dd375f
[ 2056.138239] RDX: 00007ffeae885020 RSI: 0000000000005a27 RDI: 0000000000000003
[ 2056.138242] RBP: 00007ffeae888600 R08: 0000000000000000 R09: 000055a452f02d00
[ 2056.138245] R10: 000055a452f1f000 R11: 0000000000000246 R12: 00007ffeae885020
[ 2056.138248] R13: 000055a452e72320 R14: 0000000000001000 R15: 000055a452e82430
[ 2056.138253]
[ 2176.957044] INFO: task txg_sync:7413 blocked for more than 241 seconds.
[ 2176.957677] Tainted: P O 5.15.0-91-generic #101-Ubuntu
[ 2176.958240] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2176.958810] task:txg_sync state:D stack: 0 pid: 7413 ppid: 2 flags:0x00004000
[ 2176.958821] Call Trace:
[ 2176.958826]
[ 2176.958833] __schedule+0x24e/0x590
[ 2176.958849] schedule+0x69/0x110
[ 2176.958858] spl_panic+0xe7/0xe9 [spl]
[ 2176.958886] ? dmu_buf_rele+0xe/0x20 [zfs]
[ 2176.959089] ? zap_unlockdir+0x46/0x60 [zfs]
[ 2176.959370] ? zap_add_impl+0x96/0x160 [zfs]
[ 2176.959648] ? zap_add+0x7b/0xb0 [zfs]
[ 2176.959923] dsl_dir_create_sync+0x1ff/0x280 [zfs]
[ 2176.960156] ? spl_kmem_free_impl+0x29/0x40 [spl]
[ 2176.960178] dsl_dataset_create_sync+0x52/0x380 [zfs]
[ 2176.960407] dmu_recv_begin_sync+0x374/0xa00 [zfs]
[ 2176.960617] ? spa_get_slop_space+0x6e/0xc0 [zfs]
[ 2176.960902] ? __cond_resched+0x1a/0x50
[ 2176.960911] dsl_sync_task_sync+0xb9/0x110 [zfs]
[ 2176.961152] dsl_pool_sync+0x369/0x400 [zfs]
[ 2176.961389] spa_sync_iterate_to_convergence+0xe0/0x1f0 [zfs]
[ 2176.961648] spa_sync+0x2dc/0x5b0 [zfs]
[ 2176.961906] txg_sync_thread+0x266/0x2f0 [zfs]
[ 2176.962177] ? txg_dispatch_callbacks+0x100/0x100 [zfs]
[ 2176.962446] thread_generic_wrapper+0x64/0x80 [spl]
[ 2176.962470] ? __thread_exit+0x20/0x20 [spl]
[ 2176.962493] kthread+0x12a/0x150
[ 2176.962503] ? set_kthread_struct+0x50/0x50
[ 2176.962511] ret_from_fork+0x22/0x30
[ 2176.962523]
[ 2176.962570] INFO: task zfs:3276829 blocked for more than 241 seconds.
[ 2176.963159] Tainted: P O 5.15.0-91-generic #101-Ubuntu
[ 2176.963764] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2176.964391] task:zfs state:D stack: 0 pid:3276829 ppid: 1 flags:0x00004002
[ 2176.964399] Call Trace:
[ 2176.964402]
[ 2176.964406] __schedule+0x24e/0x590
[ 2176.964412] ? __wake_up_common_lock+0x8a/0xc0
[ 2176.964420] schedule+0x69/0x110
[ 2176.964426] io_schedule+0x46/0x80
[ 2176.964432] cv_wait_common+0xab/0x130 [spl]
[ 2176.964451] ? wait_woken+0x70/0x70
[ 2176.964457] __cv_wait_io+0x18/0x20 [spl]
[ 2176.964475] txg_wait_synced_impl+0x9b/0x120 [zfs]
[ 2176.964760] txg_wait_synced+0x10/0x50 [zfs]
[ 2176.965030] dsl_sync_task_common+0x1c6/0x2a0 [zfs]
[ 2176.965269] ? recv_begin_check_existing_impl+0x590/0x590 [zfs]
[ 2176.965478] ? recv_check_large_blocks+0x60/0x60 [zfs]
[ 2176.965688] ? recv_begin_check_existing_impl+0x590/0x590 [zfs]
[ 2176.965897] ? recv_check_large_blocks+0x60/0x60 [zfs]
[ 2176.966108] dsl_sync_task+0x1a/0x20 [zfs]
[ 2176.966347] dmu_recv_begin+0x1e2/0x390 [zfs]
[ 2176.966559] zfs_ioc_recv_impl.constprop.0+0x106/0xb20 [zfs]
[ 2176.966837] ? arc_space_return+0x97/0x150 [zfs]
[ 2176.967032] zfs_ioc_recv+0x1b6/0x360 [zfs]
[ 2176.967316] ? spl_kmem_free_impl+0x29/0x40 [spl]
[ 2176.967337] ? spl_kmem_free+0xe/0x20 [spl]
[ 2176.967358] ? __kmalloc_node+0x166/0x3a0
[ 2176.967367] ? spa_close+0x15/0x20 [zfs]
[ 2176.967632] ? spl_kmem_alloc_impl+0x80/0xd0 [spl]
[ 2176.967654] zfsdev_ioctl_common+0x686/0x740 [zfs]
[ 2176.967930] ? __check_object_size.part.0+0x4a/0x150
[ 2176.967940] ? _copy_from_user+0x31/0x70
[ 2176.967950] zfsdev_ioctl+0x57/0xf0 [zfs]
[ 2176.968215] __x64_sys_ioctl+0x95/0xd0
[ 2176.968226] do_syscall_64+0x5c/0xc0
[ 2176.968233] ? exit_to_user_mode_prepare+0x37/0xb0
[ 2176.968242] ? irqentry_exit_to_user_mode+0x17/0x20
[ 2176.968249] ? irqentry_exit+0x1d/0x30
[ 2176.968254] ? exc_page_fault+0x89/0x170
[ 2176.968260] entry_SYSCALL_64_after_hwframe+0x62/0xcc
[ 2176.968269] RIP: 0033:0x7f35105a775f
[ 2176.968274] RSP: 002b:00007ffdf35b57e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 2176.968281] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f35105a775f
[ 2176.968284] RDX: 00007ffdf35b69e0 RSI: 0000000000005a1b RDI: 0000000000000005
[ 2176.968288] RBP: 00007ffdf35b9fd0 R08: 0000000000000000 R09: 000055d7ec5350a0
[ 2176.968291] R10: 00007f35106a6d30 R11: 0000000000000246 R12: 00007ffdf35b69e0
[ 2176.968294] R13: 0000000000000000 R14: 00007ffdf35b59e0 R15: 00007ffdf35bb880
[ 2176.968300]
[ 2176.968315] INFO: task zpool:3277150 blocked for more than 241 seconds.
[ 2176.968981] Tainted: P O 5.15.0-91-generic #101-Ubuntu
[ 2176.969665] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2176.970353] task:zpool state:D stack: 0 pid:3277150 ppid:309265 flags:0x00000002
[ 2176.970363] Call Trace:
[ 2176.970366]
[ 2176.970370] __schedule+0x24e/0x590
[ 2176.970377] schedule+0x69/0x110
[ 2176.970384] cv_wait_common+0xf8/0x130 [spl]
[ 2176.970403] ? wait_woken+0x70/0x70
[ 2176.970412] __cv_wait+0x15/0x20 [spl]
[ 2176.970431] rrw_enter_read_impl+0x61/0x120 [zfs]
[ 2176.970685] rrw_enter_read+0x13/0x20 [zfs]
[ 2176.970939] rrw_enter+0x21/0x30 [zfs]
[ 2176.971191] dsl_pool_config_enter+0x1d/0x30 [zfs]
[ 2176.971423] spa_prop_get+0x9c/0x3c0 [zfs]
[ 2176.971678] ? kernel_init_free_pages.part.0+0x4a/0x70
[ 2176.971689] ? release_pages+0x170/0x510
[ 2176.971699] ? spa_name_compare+0xe/0x30 [zfs]
[ 2176.971966] ? avl_find+0x5b/0x90 [zavl]
[ 2176.971976] ? do_raw_spin_unlock+0x9/0x10 [zfs]
[ 2176.972230] ? __raw_spin_unlock+0x9/0x10 [zfs]
[ 2176.972483] ? spa_open_common+0x1c6/0x490 [zfs]
[ 2176.972760] ? __kmalloc_node+0x166/0x3a0
[ 2176.972770] zfs_ioc_pool_get_props+0x123/0x140 [zfs]
[ 2176.973068] zfsdev_ioctl_common+0x686/0x740 [zfs]
[ 2176.973344] ? __check_object_size.part.0+0x4a/0x150
[ 2176.973353] ? _copy_from_user+0x31/0x70
[ 2176.973362] zfsdev_ioctl+0x57/0xf0 [zfs]
[ 2176.973630] __x64_sys_ioctl+0x95/0xd0
[ 2176.973638] do_syscall_64+0x5c/0xc0
[ 2176.973644] ? do_user_addr_fault+0x1e7/0x670
[ 2176.973652] ? do_syscall_64+0x69/0xc0
[ 2176.973657] ? exit_to_user_mode_prepare+0x37/0xb0
[ 2176.973665] ? irqentry_exit_to_user_mode+0x17/0x20
[ 2176.973672] ? irqentry_exit+0x1d/0x30
[ 2176.973678] ? exc_page_fault+0x89/0x170
[ 2176.973686] entry_SYSCALL_64_after_hwframe+0x62/0xcc
[ 2176.973694] RIP: 0033:0x7f3b87dd375f
[ 2176.973698] RSP: 002b:00007ffeae884fc0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 2176.973705] RAX: ffffffffffffffda RBX: 000055a452e82430 RCX: 00007f3b87dd375f
[ 2176.973709] RDX: 00007ffeae885020 RSI: 0000000000005a27 RDI: 0000000000000003
[ 2176.973713] RBP: 00007ffeae888600 R08: 0000000000000000 R09: 000055a452f02d00
[ 2176.973716] R10: 000055a452f1f000 R11: 0000000000000246 R12: 00007ffeae885020
[ 2176.973719] R13: 000055a452e72320 R14: 0000000000001000 R15: 000055a452e82430
[ 2176.973726]
[ 2297.791042] INFO: task txg_sync:7413 blocked for more than 362 seconds.
[ 2297.791876] Tainted: P O 5.15.0-91-generic #101-Ubuntu
[ 2297.792623] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2297.793372] task:txg_sync state:D stack: 0 pid: 7413 ppid: 2 flags:0x00004000
[ 2297.793383] Call Trace:
[ 2297.793388]
[ 2297.793394] __schedule+0x24e/0x590
[ 2297.793410] schedule+0x69/0x110
[ 2297.793418] spl_panic+0xe7/0xe9 [spl]
[ 2297.793445] ? dmu_buf_rele+0xe/0x20 [zfs]
[ 2297.793648] ? zap_unlockdir+0x46/0x60 [zfs]
[ 2297.793930] ? zap_add_impl+0x96/0x160 [zfs]
[ 2297.794206] ? zap_add+0x7b/0xb0 [zfs]
[ 2297.794481] dsl_dir_create_sync+0x1ff/0x280 [zfs]
[ 2297.794713] ? spl_kmem_free_impl+0x29/0x40 [spl]
[ 2297.794757] dsl_dataset_create_sync+0x52/0x380 [zfs]
[ 2297.794987] dmu_recv_begin_sync+0x374/0xa00 [zfs]
[ 2297.795198] ? spa_get_slop_space+0x6e/0xc0 [zfs]
[ 2297.795464] ? __cond_resched+0x1a/0x50
[ 2297.795472] dsl_sync_task_sync+0xb9/0x110 [zfs]
[ 2297.795714] dsl_pool_sync+0x369/0x400 [zfs]
[ 2297.795952] spa_sync_iterate_to_convergence+0xe0/0x1f0 [zfs]
[ 2297.796212] spa_sync+0x2dc/0x5b0 [zfs]
[ 2297.796469] txg_sync_thread+0x266/0x2f0 [zfs]
[ 2297.796739] ? txg_dispatch_callbacks+0x100/0x100 [zfs]
[ 2297.797007] thread_generic_wrapper+0x64/0x80 [spl]
[ 2297.797031] ? __thread_exit+0x20/0x20 [spl]
[ 2297.797054] kthread+0x12a/0x150
[ 2297.797063] ? set_kthread_struct+0x50/0x50
[ 2297.797071] ret_from_fork+0x22/0x30
[ 2297.797082]
[ 2297.797126] INFO: task zfs:3276829 blocked for more than 362 seconds.
[ 2297.797895] Tainted: P O 5.15.0-91-generic #101-Ubuntu
[ 2297.798677] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2297.799499] task:zfs state:D stack: 0 pid:3276829 ppid: 1 flags:0x00004002
[ 2297.799506] Call Trace:
[ 2297.799509]
[ 2297.799512] __schedule+0x24e/0x590
[ 2297.799518] ? __wake_up_common_lock+0x8a/0xc0
[ 2297.799528] schedule+0x69/0x110
[ 2297.799535] io_schedule+0x46/0x80
[ 2297.799542] cv_wait_common+0xab/0x130 [spl]
[ 2297.799563] ? wait_woken+0x70/0x70
[ 2297.799571] __cv_wait_io+0x18/0x20 [spl]
[ 2297.799591] txg_wait_synced_impl+0x9b/0x120 [zfs]
[ 2297.799863] txg_wait_synced+0x10/0x50 [zfs]
[ 2297.800131] dsl_sync_task_common+0x1c6/0x2a0 [zfs]
[ 2297.800371] ? recv_begin_check_existing_impl+0x590/0x590 [zfs]
[ 2297.800581] ? recv_check_large_blocks+0x60/0x60 [zfs]
[ 2297.800791] ? recv_begin_check_existing_impl+0x590/0x590 [zfs]
[ 2297.801001] ? recv_check_large_blocks+0x60/0x60 [zfs]
[ 2297.801212] dsl_sync_task+0x1a/0x20 [zfs]
[ 2297.801451] dmu_recv_begin+0x1e2/0x390 [zfs]
[ 2297.801661] zfs_ioc_recv_impl.constprop.0+0x106/0xb20 [zfs]
[ 2297.801939] ? arc_space_return+0x97/0x150 [zfs]
[ 2297.802134] zfs_ioc_recv+0x1b6/0x360 [zfs]
[ 2297.802416] ? spl_kmem_free_impl+0x29/0x40 [spl]
[ 2297.802437] ? spl_kmem_free+0xe/0x20 [spl]
[ 2297.802458] ? __kmalloc_node+0x166/0x3a0
[ 2297.802467] ? spa_close+0x15/0x20 [zfs]
[ 2297.802745] ? spl_kmem_alloc_impl+0x80/0xd0 [spl]
[ 2297.802767] zfsdev_ioctl_common+0x686/0x740 [zfs]
[ 2297.803048] ? __check_object_size.part.0+0x4a/0x150
[ 2297.803059] ? _copy_from_user+0x31/0x70
[ 2297.803068] zfsdev_ioctl+0x57/0xf0 [zfs]
[ 2297.803335] __x64_sys_ioctl+0x95/0xd0
[ 2297.803344] do_syscall_64+0x5c/0xc0
[ 2297.803351] ? exit_to_user_mode_prepare+0x37/0xb0
[ 2297.803360] ? irqentry_exit_to_user_mode+0x17/0x20
[ 2297.803367] ? irqentry_exit+0x1d/0x30
[ 2297.803373] ? exc_page_fault+0x89/0x170
[ 2297.803380] entry_SYSCALL_64_after_hwframe+0x62/0xcc
[ 2297.803389] RIP: 0033:0x7f35105a775f
[ 2297.803395] RSP: 002b:00007ffdf35b57e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 2297.803402] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f35105a775f
[ 2297.803407] RDX: 00007ffdf35b69e0 RSI: 0000000000005a1b RDI: 0000000000000005
[ 2297.803411] RBP: 00007ffdf35b9fd0 R08: 0000000000000000 R09: 000055d7ec5350a0
[ 2297.803415] R10: 00007f35106a6d30 R11: 0000000000000246 R12: 00007ffdf35b69e0
[ 2297.803418] R13: 0000000000000000 R14: 00007ffdf35b59e0 R15: 00007ffdf35bb880
[ 2297.803425]
[ 2297.803439] INFO: task zpool:3277150 blocked for more than 362 seconds.
[ 2297.804264] Tainted: P O 5.15.0-91-generic #101-Ubuntu
[ 2297.805108] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2297.805971] task:zpool state:D stack: 0 pid:3277150 ppid:309265 flags:0x00000002
[ 2297.805979] Call Trace:
[ 2297.805981]
[ 2297.805984] __schedule+0x24e/0x590
[ 2297.805992] schedule+0x69/0x110
[ 2297.805999] cv_wait_common+0xf8/0x130 [spl]
[ 2297.806017] ? wait_woken+0x70/0x70
[ 2297.806025] __cv_wait+0x15/0x20 [spl]
[ 2297.806042] rrw_enter_read_impl+0x61/0x120 [zfs]
[ 2297.806295] rrw_enter_read+0x13/0x20 [zfs]
[ 2297.806545] rrw_enter+0x21/0x30 [zfs]
[ 2297.806810] dsl_pool_config_enter+0x1d/0x30 [zfs]
[ 2297.807044] spa_prop_get+0x9c/0x3c0 [zfs]
[ 2297.807301] ? kernel_init_free_pages.part.0+0x4a/0x70
[ 2297.807311] ? release_pages+0x170/0x510
[ 2297.807321] ? spa_name_compare+0xe/0x30 [zfs]
[ 2297.807588] ? avl_find+0x5b/0x90 [zavl]
[ 2297.807597] ? do_raw_spin_unlock+0x9/0x10 [zfs]
[ 2297.807853] ? __raw_spin_unlock+0x9/0x10 [zfs]
[ 2297.808107] ? spa_open_common+0x1c6/0x490 [zfs]
[ 2297.808365] ? __kmalloc_node+0x166/0x3a0
[ 2297.808372] zfs_ioc_pool_get_props+0x123/0x140 [zfs]
[ 2297.808646] zfsdev_ioctl_common+0x686/0x740 [zfs]
[ 2297.808921] ? __check_object_size.part.0+0x4a/0x150
[ 2297.808930] ? _copy_from_user+0x31/0x70
[ 2297.808939] zfsdev_ioctl+0x57/0xf0 [zfs]
[ 2297.809203] __x64_sys_ioctl+0x95/0xd0
[ 2297.809211] do_syscall_64+0x5c/0xc0
[ 2297.809217] ? do_user_addr_fault+0x1e7/0x670
[ 2297.809226] ? do_syscall_64+0x69/0xc0
[ 2297.809231] ? exit_to_user_mode_prepare+0x37/0xb0
[ 2297.809239] ? irqentry_exit_to_user_mode+0x17/0x20
[ 2297.809245] ? irqentry_exit+0x1d/0x30
[ 2297.809252] ? exc_page_fault+0x89/0x170
[ 2297.809258] entry_SYSCALL_64_after_hwframe+0x62/0xcc
[ 2297.809266] RIP: 0033:0x7f3b87dd375f
[ 2297.809271] RSP: 002b:00007ffeae884fc0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 2297.809276] RAX: ffffffffffffffda RBX: 000055a452e82430 RCX: 00007f3b87dd375f
[ 2297.809280] RDX: 00007ffeae885020 RSI: 0000000000005a27 RDI: 0000000000000003
[ 2297.809284] RBP: 00007ffeae888600 R08: 0000000000000000 R09: 000055a452f02d00
[ 2297.809286] R10: 000055a452f1f000 R11: 0000000000000246 R12: 00007ffeae885020
[ 2297.809289] R13: 000055a452e72320 R14: 0000000000001000 R15: 000055a452e82430
[ 2297.809295]
[ 2418.625055] INFO: task txg_sync:7413 blocked for more than 483 seconds.
[ 2418.626093] Tainted: P O 5.15.0-91-generic #101-Ubuntu
[ 2418.627026] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2418.627954] task:txg_sync state:D stack: 0 pid: 7413 ppid: 2 flags:0x00004000
[ 2418.627966] Call Trace:
[ 2418.627970]
[ 2418.627977] __schedule+0x24e/0x590
[ 2418.627992] schedule+0x69/0x110
[ 2418.628000] spl_panic+0xe7/0xe9 [spl]
[ 2418.628029] ? dmu_buf_rele+0xe/0x20 [zfs]
[ 2418.628230] ? zap_unlockdir+0x46/0x60 [zfs]
[ 2418.628511] ? zap_add_impl+0x96/0x160 [zfs]
[ 2418.628811] ? zap_add+0x7b/0xb0 [zfs]
[ 2418.629089] dsl_dir_create_sync+0x1ff/0x280 [zfs]
[ 2418.629323] ? spl_kmem_free_impl+0x29/0x40 [spl]
[ 2418.629345] dsl_dataset_create_sync+0x52/0x380 [zfs]
[ 2418.629575] dmu_recv_begin_sync+0x374/0xa00 [zfs]
[ 2418.629787] ? spa_get_slop_space+0x6e/0xc0 [zfs]
[ 2418.630052] ? __cond_resched+0x1a/0x50
[ 2418.630061] dsl_sync_task_sync+0xb9/0x110 [zfs]
[ 2418.630301] dsl_pool_sync+0x369/0x400 [zfs]
[ 2418.630535] spa_sync_iterate_to_convergence+0xe0/0x1f0 [zfs]
[ 2418.630794] spa_sync+0x2dc/0x5b0 [zfs]
[ 2418.631050] txg_sync_thread+0x266/0x2f0 [zfs]
[ 2418.631319] ? txg_dispatch_callbacks+0x100/0x100 [zfs]
[ 2418.631588] thread_generic_wrapper+0x64/0x80 [spl]
[ 2418.631611] ? __thread_exit+0x20/0x20 [spl]
[ 2418.631634] kthread+0x12a/0x150
[ 2418.631644] ? set_kthread_struct+0x50/0x50
[ 2418.631651] ret_from_fork+0x22/0x30
[ 2418.631663]

minorsatellite added the Type: Defect Incorrect behavior (e.g. crash, hang) label on Dec 14, 2023
@rincebrain
Contributor

Since it seems to have gone missing from the first line in this mailing-list copy-paste, the quote should open with
PANIC at dsl_dir.c:951:dsl_dir_create_sync()

That line is itself likely missing the message above it, VERIFY3(0 == zap_add(mos, dsl_dir_phys(pds)->dd_child_dir_zapobj, name, sizeof (uint64_t), 1, &ddobj, tx)) failed (0 == 17) or the like, which means this is likely resolved by #14119 in 2.1.7+.
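For readers unfamiliar with the VERIFY3 format: the trailing (0 == 17) shows the expected value and the value zap_add actually returned. Assuming that value follows Linux errno numbering, 17 is EEXIST, i.e. the entry being added already existed. A small Python sketch for decoding such a line (the helper name and the regular expression are illustrative, not part of OpenZFS):

```python
import errno
import os
import re

# Matches e.g. "VERIFY3(<expr>) failed (0 == 17)"; this pattern is an assumption
# based on the single log line quoted in this issue.
VERIFY_RE = re.compile(
    r"VERIFY3?\((?P<expr>.*)\) failed \((?P<lhs>-?\d+) == (?P<rhs>-?\d+)\)"
)


def decode_verify_failure(line):
    """Return (code, errno_name, description) for a VERIFY failure line, or None."""
    match = VERIFY_RE.search(line)
    if match is None:
        return None
    code = int(match.group("rhs"))
    # errno.errorcode maps numeric codes to names on the host platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return code, name, os.strerror(code)


if __name__ == "__main__":
    sample = (
        "[ 1844.021258] VERIFY3(0 == zap_add(mos, "
        "dsl_dir_phys(pds)->dd_child_dir_zapobj, name, sizeof (uint64_t), "
        "1, &ddobj, tx)) failed (0 == 17)"
    )
    print(decode_verify_failure(sample))
```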

@minorsatellite
Author

Yes sorry, the output got truncated. I added it back.

@minorsatellite
Author

I should add that keys are loaded on the receive side. The data is unencrypted on the send side, although that system uses LUKS for full-disk encryption.

@rincebrain
Contributor

That's okay; my advice is still to try upgrading to 2.1.14 or 2.2.2 and see if it keeps happening.

@minorsatellite
Author

I can confirm that upgrading the receive side to 2.2.2 resolved the issue; the job then sent 4 TB of data in less than 24 hours.
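For anyone checking whether a host is at or past the release suggested in this thread, here is a rough Python sketch. The 2.1.7 threshold is taken from the "likely resolved by #14119 in 2.1.7+" comment above and is not independently verified; the function names are hypothetical, and the version strings are the ones quoted in this issue.

```python
import re


def parse_zfs_version(text):
    """Extract (major, minor, patch) from a string such as 'zfs-2.1.6-1'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def has_suggested_fix(version_text, minimum=(2, 1, 7)):
    """True if the version is at least the release the thread points to."""
    # Tuples compare element by element, so (2, 1, 14) >= (2, 1, 7) holds.
    return parse_zfs_version(version_text) >= minimum


if __name__ == "__main__":
    for version in ("zfs-2.1.5", "zfs-2.1.6-1", "zfs-2.1.14", "zfs-2.2.2"):
        print(version, has_suggested_fix(version))
```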
