-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Native ZPL Implementation #7
Labels
Type: Feature
Feature request or new feature
Comments
I found a recent project by Daniel Lorch to port HammerFS from Dragonfly BSD to Linux (http://github.com/dlorch/hammer-linux). Maybe we can use part of his code to implement ZPL due to the similarity between Solaris VFS and Dragonfly BSD VFS? |
The ZPL posix layer is now available in 0.6.0-rc1. It will be stabilized for the full 0.6.0 release. |
Closed
Closed
FransUrbo
pushed a commit
to FransUrbo/zfs
that referenced
this issue
Mar 30, 2013
Dealing with ZVOL, part 1
tuxoko
added a commit
to tuxoko/zfs
that referenced
this issue
Apr 1, 2015
The following panic would occur under certain heavy load: [ 4692.202686] Kernel panic - not syncing: thread ffff8800c4f5dd60 terminating with rrw lock ffff8800da1b9c40 held [ 4692.228053] CPU: 1 PID: 6250 Comm: mmap_deadlock Tainted: P OE 3.18.10 openzfs#7 The culprit is that ZFS_EXIT(zsb) would call tsd_exit() every time, which would purge all tsd data for the thread. However, ZFS_ENTER is designed to be reentrant, so we cannot allow ZFS_EXIT to blindly purge tsd data. Instead, when we are doing rrw_exit, if we are removing the last rrn entry, then we calls tsd_exit_key(rrw_tsd_key), which would only remove the rrw_tsd_key tsd entry and also the PID_KEY tsd entry if it is the only entry left for this thread. Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
tuxoko
added a commit
to tuxoko/zfs
that referenced
this issue
Apr 2, 2015
The following panic would occur under certain heavy load: [ 4692.202686] Kernel panic - not syncing: thread ffff8800c4f5dd60 terminating with rrw lock ffff8800da1b9c40 held [ 4692.228053] CPU: 1 PID: 6250 Comm: mmap_deadlock Tainted: P OE 3.18.10 openzfs#7 The culprit is that ZFS_EXIT(zsb) would call tsd_exit() every time, which would purge all tsd data for the thread. However, ZFS_ENTER is designed to be reentrant, so we cannot allow ZFS_EXIT to blindly purge tsd data. Instead, when we are doing rrw_exit, if we are removing the last rrn entry, then we calls tsd_exit_key(rrw_tsd_key), which would only remove the rrw_tsd_key tsd entry and also the PID_KEY tsd entry if it is the only entry left for this thread. Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
tuxoko
added a commit
to tuxoko/zfs
that referenced
this issue
Apr 3, 2015
The following panic would occur under certain heavy load: [ 4692.202686] Kernel panic - not syncing: thread ffff8800c4f5dd60 terminating with rrw lock ffff8800da1b9c40 held [ 4692.228053] CPU: 1 PID: 6250 Comm: mmap_deadlock Tainted: P OE 3.18.10 openzfs#7 The culprit is that ZFS_EXIT(zsb) would call tsd_exit() every time, which would purge all tsd data for the thread. However, ZFS_ENTER is designed to be reentrant, so we cannot allow ZFS_EXIT to blindly purge tsd data. Instead, when we are doing rrw_exit, if we are removing the last rrn entry, then we calls tsd_exit_key(rrw_tsd_key), which would only remove the rrw_tsd_key tsd entry and also the PID_KEY tsd entry if it is the only entry left for this thread. The zfs_fsyncer_key tsds rely on ZFS_EXIT(zsb) to call tsd_exit() to do clean up. Now we need to explicit call tsd_exit_key() on them. We also clean up the zfs_allow_log_key when it's not needed. Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
tuxoko
added a commit
to tuxoko/zfs
that referenced
this issue
Apr 15, 2015
The following panic would occur under certain heavy load: [ 4692.202686] Kernel panic - not syncing: thread ffff8800c4f5dd60 terminating with rrw lock ffff8800da1b9c40 held [ 4692.228053] CPU: 1 PID: 6250 Comm: mmap_deadlock Tainted: P OE 3.18.10 openzfs#7 The culprit is that ZFS_EXIT(zsb) would call tsd_exit() every time, which would purge all tsd data for the thread. However, ZFS_ENTER is designed to be reentrant, so we cannot allow ZFS_EXIT to blindly purge tsd data. Instead, we rely on the new behavior of tsd_set. When NULL is passed as the new value to tsd_set, it will automatically remove the tsd entry specified the the key for the current thread. rrw_tsd_key and zfs_allow_log_key already calls tsd_set(key, NULL) when they're done. The zfs_fsyncer_key relied on ZFS_EXIT(zsb) to call tsd_exit() to do clean up. Now we explicitly call tsd_set(key, NULL) on them. Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
behlendorf
pushed a commit
to behlendorf/zfs
that referenced
this issue
Apr 24, 2015
The following panic would occur under certain heavy load: [ 4692.202686] Kernel panic - not syncing: thread ffff8800c4f5dd60 terminating with rrw lock ffff8800da1b9c40 held [ 4692.228053] CPU: 1 PID: 6250 Comm: mmap_deadlock Tainted: P OE 3.18.10 openzfs#7 The culprit is that ZFS_EXIT(zsb) would call tsd_exit() every time, which would purge all tsd data for the thread. However, ZFS_ENTER is designed to be reentrant, so we cannot allow ZFS_EXIT to blindly purge tsd data. Instead, we rely on the new behavior of tsd_set. When NULL is passed as the new value to tsd_set, it will automatically remove the tsd entry specified the the key for the current thread. rrw_tsd_key and zfs_allow_log_key already calls tsd_set(key, NULL) when they're done. The zfs_fsyncer_key relied on ZFS_EXIT(zsb) to call tsd_exit() to do clean up. Now we explicitly call tsd_set(key, NULL) on them. Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#3247
ryao
added a commit
to ryao/zfs
that referenced
this issue
Dec 3, 2015
(gdb) bt \#0 0x00007f31952a35d7 in raise () from /lib64/libc.so.6 \#1 0x00007f31952a4cc8 in abort () from /lib64/libc.so.6 \#2 0x00007f319529c546 in __assert_fail_base () from /lib64/libc.so.6 \#3 0x00007f319529c5f2 in __assert_fail () from /lib64/libc.so.6 \#4 0x00007f319529c66b in __assert () from /lib64/libc.so.6 \#5 0x00007f319659e024 in send_iterate_prop (zhp=zhp@entry=0x19cf500, nv=0x19da110) at libzfs_sendrecv.c:724 \openzfs#6 0x00007f319659e35d in send_iterate_fs (zhp=zhp@entry=0x19cf500, arg=arg@entry=0x7ffc515d6380) at libzfs_sendrecv.c:762 \openzfs#7 0x00007f319659f3ca in gather_nvlist (hdl=<optimized out>, fsname=fsname@entry=0x19cf250 "tpool/test", fromsnap=fromsnap@entry=0x7ffc515d7531 "snap1", tosnap=tosnap@entry=0x7ffc515d7542 "snap2", recursive=B_FALSE, nvlp=nvlp@entry=0x7ffc515d6470, avlp=avlp@entry=0x7ffc515d6478) at libzfs_sendrecv.c:809 \openzfs#8 0x00007f31965a408f in zfs_send (zhp=zhp@entry=0x19cf240, fromsnap=fromsnap@entry=0x7ffc515d7531 "snap1", tosnap=tosnap@entry=0x7ffc515d7542 "snap2", flags=flags@entry=0x7ffc515d6d30, outfd=outfd@entry=1, filter_func=filter_func@entry=0x0, cb_arg=cb_arg@entry=0x0, debugnvp=debugnvp@entry=0x0) at libzfs_sendrecv.c:1461 \openzfs#9 0x000000000040a981 in zfs_do_send (argc=<optimized out>, argv=0x7ffc515d6ff0) at zfs_main.c:3841 \openzfs#10 0x0000000000404d10 in main (argc=6, argv=0x7ffc515d6fc8) at zfs_main.c:6724 (gdb) fr 5 \#5 0x00007f319659e024 in send_iterate_prop (zhp=zhp@entry=0x19cf500, nv=0x19da110) at libzfs_sendrecv.c:724 724 verify(nvlist_lookup_uint64(propnv, (gdb) list 719 verify(nvlist_lookup_string(propnv, 720 ZPROP_VALUE, &value) == 0); 721 VERIFY(0 == nvlist_add_string(nv, propname, value)); 722 } else { 723 uint64_t value; 724 verify(nvlist_lookup_uint64(propnv, 725 ZPROP_VALUE, &value) == 0); 726 VERIFY(0 == nvlist_add_uint64(nv, propname, value)); 727 } 728 } (gdb) p prop $1 = ZFS_PROP_RELATIME
Closed
Closed
richardelling
pushed a commit
to richardelling/zfs
that referenced
this issue
Oct 15, 2018
[CCS-69] codecov integration with ZoL
richardelling
pushed a commit
to richardelling/zfs
that referenced
this issue
Oct 15, 2018
* [cstor#38] honoring sync option in uzfs Signed-off-by: Vishnu Itta <vitta@mayadata.io>
12 tasks
ahrens
referenced
this issue
in ahrens/zfs
Dec 9, 2019
After spa_vdev_remove_aux() is called, the config nvlist is no longer valid, as it's been replaced by the new one (with the specified device removed). Therefore any pointers into the nvlist are no longer valid. So we can't save the result of `fnvlist_lookup_string(nv, ZPOOL_CONFIG_PATH)` (in vd_path) across the call to spa_vdev_remove_aux(). Instead, use spa_strdup() to save a copy of the string before calling spa_vdev_remove_aux. Found by AddressSanitizer: ERROR: AddressSanitizer: heap-use-after-free on address 0x608000a1fcd0 at pc 0x7fe88b0c166e bp 0x7fe878414ad0 sp 0x7fe878414278 READ of size 34 at 0x608000a1fcd0 thread T686 #0 0x7fe88b0c166d (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5166d) #1 0x7fe88a5acd6e in spa_strdup ../../module/zfs/spa_misc.c:1447 #2 0x7fe88a688034 in spa_vdev_remove ../../module/zfs/vdev_removal.c:2259 #3 0x55ffbc7748f8 in ztest_vdev_aux_add_remove /export/home/delphix/zfs/cmd/ztest/ztest.c:3229 #4 0x55ffbc769fba in ztest_execute /export/home/delphix/zfs/cmd/ztest/ztest.c:6714 #5 0x55ffbc779a90 in ztest_thread /export/home/delphix/zfs/cmd/ztest/ztest.c:6761 #6 0x7fe889cbc6da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da) #7 0x7fe8899e588e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x12188e) 0x608000a1fcd0 is located 48 bytes inside of 88-byte region [0x608000a1fca0,0x608000a1fcf8) freed by thread T686 here: #0 0x7fe88b14e7b8 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xde7b8) #1 0x7fe88ae541c5 in nvlist_free ../../module/nvpair/nvpair.c:874 #2 0x7fe88ae543ba in nvpair_free ../../module/nvpair/nvpair.c:844 #3 0x7fe88ae57400 in nvlist_remove_nvpair ../../module/nvpair/nvpair.c:978 #4 0x7fe88a683c81 in spa_vdev_remove_aux ../../module/zfs/vdev_removal.c:185 #5 0x7fe88a68857c in spa_vdev_remove ../../module/zfs/vdev_removal.c:2221 #6 0x55ffbc7748f8 in ztest_vdev_aux_add_remove /export/home/delphix/zfs/cmd/ztest/ztest.c:3229 #7 0x55ffbc769fba in ztest_execute /export/home/delphix/zfs/cmd/ztest/ztest.c:6714 #8 0x55ffbc779a90 in ztest_thread /export/home/delphix/zfs/cmd/ztest/ztest.c:6761 #9 0x7fe889cbc6da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da)
ahrens
referenced
this issue
in ahrens/zfs
Dec 9, 2019
After spa_vdev_remove_aux() is called, the config nvlist is no longer valid, as it's been replaced by the new one (with the specified device removed). Therefore any pointers into the nvlist are no longer valid. So we can't save the result of `fnvlist_lookup_string(nv, ZPOOL_CONFIG_PATH)` (in vd_path) across the call to spa_vdev_remove_aux(). Instead, use spa_strdup() to save a copy of the string before calling spa_vdev_remove_aux. Found by AddressSanitizer: ERROR: AddressSanitizer: heap-use-after-free on address 0x608000a1fcd0 at pc 0x7fe88b0c166e bp 0x7fe878414ad0 sp 0x7fe878414278 READ of size 34 at 0x608000a1fcd0 thread T686 #0 0x7fe88b0c166d (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5166d) #1 0x7fe88a5acd6e in spa_strdup ../../module/zfs/spa_misc.c:1447 #2 0x7fe88a688034 in spa_vdev_remove ../../module/zfs/vdev_removal.c:2259 #3 0x55ffbc7748f8 in ztest_vdev_aux_add_remove /export/home/delphix/zfs/cmd/ztest/ztest.c:3229 #4 0x55ffbc769fba in ztest_execute /export/home/delphix/zfs/cmd/ztest/ztest.c:6714 #5 0x55ffbc779a90 in ztest_thread /export/home/delphix/zfs/cmd/ztest/ztest.c:6761 #6 0x7fe889cbc6da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da) #7 0x7fe8899e588e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x12188e) 0x608000a1fcd0 is located 48 bytes inside of 88-byte region [0x608000a1fca0,0x608000a1fcf8) freed by thread T686 here: #0 0x7fe88b14e7b8 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xde7b8) #1 0x7fe88ae541c5 in nvlist_free ../../module/nvpair/nvpair.c:874 #2 0x7fe88ae543ba in nvpair_free ../../module/nvpair/nvpair.c:844 #3 0x7fe88ae57400 in nvlist_remove_nvpair ../../module/nvpair/nvpair.c:978 #4 0x7fe88a683c81 in spa_vdev_remove_aux ../../module/zfs/vdev_removal.c:185 #5 0x7fe88a68857c in spa_vdev_remove ../../module/zfs/vdev_removal.c:2221 #6 0x55ffbc7748f8 in ztest_vdev_aux_add_remove /export/home/delphix/zfs/cmd/ztest/ztest.c:3229 #7 0x55ffbc769fba in ztest_execute /export/home/delphix/zfs/cmd/ztest/ztest.c:6714 #8 0x55ffbc779a90 in ztest_thread /export/home/delphix/zfs/cmd/ztest/ztest.c:6761 #9 0x7fe889cbc6da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da) Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
ahrens
referenced
this issue
in ahrens/zfs
Dec 10, 2019
After spa_vdev_remove_aux() is called, the config nvlist is no longer valid, as it's been replaced by the new one (with the specified device removed). Therefore any pointers into the nvlist are no longer valid. So we can't save the result of `fnvlist_lookup_string(nv, ZPOOL_CONFIG_PATH)` (in vd_path) across the call to spa_vdev_remove_aux(). Instead, use spa_strdup() to save a copy of the string before calling spa_vdev_remove_aux. Found by AddressSanitizer: ERROR: AddressSanitizer: heap-use-after-free on address ... READ of size 34 at 0x608000a1fcd0 thread T686 #0 0x7fe88b0c166d (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5166d) #1 0x7fe88a5acd6e in spa_strdup spa_misc.c:1447 #2 0x7fe88a688034 in spa_vdev_remove vdev_removal.c:2259 #3 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229 #4 0x55ffbc769fba in ztest_execute ztest.c:6714 #5 0x55ffbc779a90 in ztest_thread ztest.c:6761 #6 0x7fe889cbc6da in start_thread #7 0x7fe8899e588e in __clone 0x608000a1fcd0 is located 48 bytes inside of 88-byte region freed by thread T686 here: #0 0x7fe88b14e7b8 in __interceptor_free #1 0x7fe88ae541c5 in nvlist_free nvpair.c:874 #2 0x7fe88ae543ba in nvpair_free nvpair.c:844 #3 0x7fe88ae57400 in nvlist_remove_nvpair nvpair.c:978 #4 0x7fe88a683c81 in spa_vdev_remove_aux vdev_removal.c:185 #5 0x7fe88a68857c in spa_vdev_remove vdev_removal.c:2221 #6 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229 #7 0x55ffbc769fba in ztest_execute ztest.c:6714 #8 0x55ffbc779a90 in ztest_thread ztest.c:6761 #9 0x7fe889cbc6da in start_thread Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
behlendorf
pushed a commit
that referenced
this issue
Dec 11, 2019
After spa_vdev_remove_aux() is called, the config nvlist is no longer valid, as it's been replaced by the new one (with the specified device removed). Therefore any pointers into the nvlist are no longer valid. So we can't save the result of `fnvlist_lookup_string(nv, ZPOOL_CONFIG_PATH)` (in vd_path) across the call to spa_vdev_remove_aux(). Instead, use spa_strdup() to save a copy of the string before calling spa_vdev_remove_aux. Found by AddressSanitizer: ERROR: AddressSanitizer: heap-use-after-free on address ... READ of size 34 at 0x608000a1fcd0 thread T686 #0 0x7fe88b0c166d (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5166d) #1 0x7fe88a5acd6e in spa_strdup spa_misc.c:1447 #2 0x7fe88a688034 in spa_vdev_remove vdev_removal.c:2259 #3 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229 #4 0x55ffbc769fba in ztest_execute ztest.c:6714 #5 0x55ffbc779a90 in ztest_thread ztest.c:6761 #6 0x7fe889cbc6da in start_thread #7 0x7fe8899e588e in __clone 0x608000a1fcd0 is located 48 bytes inside of 88-byte region freed by thread T686 here: #0 0x7fe88b14e7b8 in __interceptor_free #1 0x7fe88ae541c5 in nvlist_free nvpair.c:874 #2 0x7fe88ae543ba in nvpair_free nvpair.c:844 #3 0x7fe88ae57400 in nvlist_remove_nvpair nvpair.c:978 #4 0x7fe88a683c81 in spa_vdev_remove_aux vdev_removal.c:185 #5 0x7fe88a68857c in spa_vdev_remove vdev_removal.c:2221 #6 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229 #7 0x55ffbc769fba in ztest_execute ztest.c:6714 #8 0x55ffbc779a90 in ztest_thread ztest.c:6761 #9 0x7fe889cbc6da in start_thread Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #9706
tonyhutter
pushed a commit
to tonyhutter/zfs
that referenced
this issue
Dec 26, 2019
After spa_vdev_remove_aux() is called, the config nvlist is no longer valid, as it's been replaced by the new one (with the specified device removed). Therefore any pointers into the nvlist are no longer valid. So we can't save the result of `fnvlist_lookup_string(nv, ZPOOL_CONFIG_PATH)` (in vd_path) across the call to spa_vdev_remove_aux(). Instead, use spa_strdup() to save a copy of the string before calling spa_vdev_remove_aux. Found by AddressSanitizer: ERROR: AddressSanitizer: heap-use-after-free on address ... READ of size 34 at 0x608000a1fcd0 thread T686 #0 0x7fe88b0c166d (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5166d) #1 0x7fe88a5acd6e in spa_strdup spa_misc.c:1447 #2 0x7fe88a688034 in spa_vdev_remove vdev_removal.c:2259 #3 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229 openzfs#4 0x55ffbc769fba in ztest_execute ztest.c:6714 openzfs#5 0x55ffbc779a90 in ztest_thread ztest.c:6761 openzfs#6 0x7fe889cbc6da in start_thread openzfs#7 0x7fe8899e588e in __clone 0x608000a1fcd0 is located 48 bytes inside of 88-byte region freed by thread T686 here: #0 0x7fe88b14e7b8 in __interceptor_free #1 0x7fe88ae541c5 in nvlist_free nvpair.c:874 #2 0x7fe88ae543ba in nvpair_free nvpair.c:844 #3 0x7fe88ae57400 in nvlist_remove_nvpair nvpair.c:978 openzfs#4 0x7fe88a683c81 in spa_vdev_remove_aux vdev_removal.c:185 openzfs#5 0x7fe88a68857c in spa_vdev_remove vdev_removal.c:2221 openzfs#6 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229 openzfs#7 0x55ffbc769fba in ztest_execute ztest.c:6714 openzfs#8 0x55ffbc779a90 in ztest_thread ztest.c:6761 openzfs#9 0x7fe889cbc6da in start_thread Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes openzfs#9706
tonyhutter
pushed a commit
to tonyhutter/zfs
that referenced
this issue
Dec 27, 2019
After spa_vdev_remove_aux() is called, the config nvlist is no longer valid, as it's been replaced by the new one (with the specified device removed). Therefore any pointers into the nvlist are no longer valid. So we can't save the result of `fnvlist_lookup_string(nv, ZPOOL_CONFIG_PATH)` (in vd_path) across the call to spa_vdev_remove_aux(). Instead, use spa_strdup() to save a copy of the string before calling spa_vdev_remove_aux. Found by AddressSanitizer: ERROR: AddressSanitizer: heap-use-after-free on address ... READ of size 34 at 0x608000a1fcd0 thread T686 #0 0x7fe88b0c166d (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5166d) #1 0x7fe88a5acd6e in spa_strdup spa_misc.c:1447 #2 0x7fe88a688034 in spa_vdev_remove vdev_removal.c:2259 #3 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229 openzfs#4 0x55ffbc769fba in ztest_execute ztest.c:6714 openzfs#5 0x55ffbc779a90 in ztest_thread ztest.c:6761 openzfs#6 0x7fe889cbc6da in start_thread openzfs#7 0x7fe8899e588e in __clone 0x608000a1fcd0 is located 48 bytes inside of 88-byte region freed by thread T686 here: #0 0x7fe88b14e7b8 in __interceptor_free #1 0x7fe88ae541c5 in nvlist_free nvpair.c:874 #2 0x7fe88ae543ba in nvpair_free nvpair.c:844 #3 0x7fe88ae57400 in nvlist_remove_nvpair nvpair.c:978 openzfs#4 0x7fe88a683c81 in spa_vdev_remove_aux vdev_removal.c:185 openzfs#5 0x7fe88a68857c in spa_vdev_remove vdev_removal.c:2221 openzfs#6 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229 openzfs#7 0x55ffbc769fba in ztest_execute ztest.c:6714 openzfs#8 0x55ffbc779a90 in ztest_thread ztest.c:6761 openzfs#9 0x7fe889cbc6da in start_thread Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes openzfs#9706
tonyhutter
pushed a commit
that referenced
this issue
Jan 23, 2020
After spa_vdev_remove_aux() is called, the config nvlist is no longer valid, as it's been replaced by the new one (with the specified device removed). Therefore any pointers into the nvlist are no longer valid. So we can't save the result of `fnvlist_lookup_string(nv, ZPOOL_CONFIG_PATH)` (in vd_path) across the call to spa_vdev_remove_aux(). Instead, use spa_strdup() to save a copy of the string before calling spa_vdev_remove_aux. Found by AddressSanitizer: ERROR: AddressSanitizer: heap-use-after-free on address ... READ of size 34 at 0x608000a1fcd0 thread T686 #0 0x7fe88b0c166d (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5166d) #1 0x7fe88a5acd6e in spa_strdup spa_misc.c:1447 #2 0x7fe88a688034 in spa_vdev_remove vdev_removal.c:2259 #3 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229 #4 0x55ffbc769fba in ztest_execute ztest.c:6714 #5 0x55ffbc779a90 in ztest_thread ztest.c:6761 #6 0x7fe889cbc6da in start_thread #7 0x7fe8899e588e in __clone 0x608000a1fcd0 is located 48 bytes inside of 88-byte region freed by thread T686 here: #0 0x7fe88b14e7b8 in __interceptor_free #1 0x7fe88ae541c5 in nvlist_free nvpair.c:874 #2 0x7fe88ae543ba in nvpair_free nvpair.c:844 #3 0x7fe88ae57400 in nvlist_remove_nvpair nvpair.c:978 #4 0x7fe88a683c81 in spa_vdev_remove_aux vdev_removal.c:185 #5 0x7fe88a68857c in spa_vdev_remove vdev_removal.c:2221 #6 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229 #7 0x55ffbc769fba in ztest_execute ztest.c:6714 #8 0x55ffbc779a90 in ztest_thread ztest.c:6761 #9 0x7fe889cbc6da in start_thread Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #9706
markroper
added a commit
to markroper/zfs
that referenced
this issue
Feb 12, 2020
Using zfs with Lustre, an arc_read can trigger kernel memory allocation that in turn leads to a memory reclaim callback and a deadlock within a single zfs process. This change uses spl_fstrans_mark and spl_trans_unmark to prevent the reclaim attempt and the deadlock (https://zfsonlinux.topicbox.com/groups/zfs-devel/T4db2c705ec1804ba). The stack trace observed is: #0 [ffffc9002b98adc8] __schedule at ffffffff81610f2e openzfs#1 [ffffc9002b98ae68] schedule at ffffffff81611558 openzfs#2 [ffffc9002b98ae70] schedule_preempt_disabled at ffffffff8161184a openzfs#3 [ffffc9002b98ae78] __mutex_lock at ffffffff816131e8 openzfs#4 [ffffc9002b98af18] arc_buf_destroy at ffffffffa0bf37d7 [zfs] openzfs#5 [ffffc9002b98af48] dbuf_destroy at ffffffffa0bfa6fe [zfs] openzfs#6 [ffffc9002b98af88] dbuf_evict_one at ffffffffa0bfaa96 [zfs] openzfs#7 [ffffc9002b98afa0] dbuf_rele_and_unlock at ffffffffa0bfa561 [zfs] openzfs#8 [ffffc9002b98b050] dbuf_rele_and_unlock at ffffffffa0bfa32b [zfs] openzfs#9 [ffffc9002b98b100] osd_object_delete at ffffffffa0b64ecc [osd_zfs] openzfs#10 [ffffc9002b98b118] lu_object_free at ffffffffa06d6a74 [obdclass] openzfs#11 [ffffc9002b98b178] lu_site_purge_objects at ffffffffa06d7fc1 [obdclass] openzfs#12 [ffffc9002b98b220] lu_cache_shrink_scan at ffffffffa06d81b8 [obdclass] openzfs#13 [ffffc9002b98b278] shrink_slab at ffffffff811ca9d8 openzfs#14 [ffffc9002b98b338] shrink_node at ffffffff811cfd94 openzfs#15 [ffffc9002b98b3b8] do_try_to_free_pages at ffffffff811cfe63 openzfs#16 [ffffc9002b98b408] try_to_free_pages at ffffffff811d01c4 openzfs#17 [ffffc9002b98b488] __alloc_pages_slowpath at ffffffff811be7f2 openzfs#18 [ffffc9002b98b580] __alloc_pages_nodemask at ffffffff811bf3ed openzfs#19 [ffffc9002b98b5e0] new_slab at ffffffff81226304 openzfs#20 [ffffc9002b98b638] ___slab_alloc at ffffffff812272ab openzfs#21 [ffffc9002b98b6f8] __slab_alloc at ffffffff8122740c openzfs#22 [ffffc9002b98b708] kmem_cache_alloc at ffffffff81227578 openzfs#23 [ffffc9002b98b740] spl_kmem_cache_alloc at ffffffffa048a1fd [spl] openzfs#24 [ffffc9002b98b780] arc_buf_alloc_impl at ffffffffa0befba2 [zfs] openzfs#25 [ffffc9002b98b7b0] arc_read at ffffffffa0bf0924 [zfs] openzfs#26 [ffffc9002b98b858] dbuf_read at ffffffffa0bf9083 [zfs] openzfs#27 [ffffc9002b98b900] dmu_buf_hold_by_dnode at ffffffffa0c04869 [zfs] Signed-off-by: Mark Roper <markroper@gmail.com>
allanjude
pushed a commit
to KlaraSystems/zfs
that referenced
this issue
Apr 28, 2020
Evict refcount=1 entries when DDT large
12 tasks
problame
added a commit
to problame/zfs
that referenced
this issue
Oct 10, 2020
This is a fixup of commit 0fdd610 See added test case for a reproducer. Stack trace: panic: VERIFY3(nvlist_next_nvpair(redactnvl, pair) == NULL) failed (0xfffff80003ce5d18x == 0x) cpuid = 7 time = 1602212370 KDB: stack backtrace: #0 0xffffffff80c1d297 at kdb_backtrace+0x67 openzfs#1 0xffffffff80bd05cd at vpanic+0x19d openzfs#2 0xffffffff828446fa at spl_panic+0x3a openzfs#3 0xffffffff828af85d at dmu_redact_snap+0x39d openzfs#4 0xffffffff829c0370 at zfs_ioc_redact+0xa0 openzfs#5 0xffffffff829bba44 at zfsdev_ioctl_common+0x4a4 openzfs#6 0xffffffff8284c3ed at zfsdev_ioctl+0x14d openzfs#7 0xffffffff80a85ead at devfs_ioctl+0xad openzfs#8 0xffffffff8122a46c at VOP_IOCTL_APV+0x7c openzfs#9 0xffffffff80cb0a3a at vn_ioctl+0x16a openzfs#10 0xffffffff80a8649f at devfs_ioctl_f+0x1f openzfs#11 0xffffffff80c3b55e at kern_ioctl+0x2be openzfs#12 0xffffffff80c3b22d at sys_ioctl+0x15d openzfs#13 0xffffffff810a88e4 at amd64_syscall+0x364 openzfs#14 0xffffffff81082330 at fast_syscall_common+0x101 Signed-off-by: Christian Schwarz <me@cschwarz.com>
Closed
sdimitro
pushed a commit
to sdimitro/zfs
that referenced
this issue
Nov 11, 2021
Closed
rob-wing
pushed a commit
to KlaraSystems/zfs
that referenced
this issue
Feb 17, 2023
Under certain loads, the following panic is hit: panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1) cpuid = 17 KDB: stack backtrace: #0 0xffffffff805e29c5 at kdb_backtrace+0x65 #1 0xffffffff8059620f at vpanic+0x17f #2 0xffffffff81a27f4a at spl_panic+0x3a #3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40 openzfs#4 0xffffffff8066fdee at vinactivef+0xde openzfs#5 0xffffffff80670b8a at vgonel+0x1ea openzfs#6 0xffffffff806711e1 at vgone+0x31 openzfs#7 0xffffffff8065fa0d at vfs_hash_insert+0x26d openzfs#8 0xffffffff81a39069 at sfs_vgetx+0x149 openzfs#9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4 openzfs#10 0xffffffff80661c2c at lookup+0x45c openzfs#11 0xffffffff80660e59 at namei+0x259 openzfs#12 0xffffffff8067e3d3 at kern_statat+0xf3 openzfs#13 0xffffffff8067eacf at sys_fstatat+0x2f openzfs#14 0xffffffff808b5ecc at amd64_syscall+0x10c openzfs#15 0xffffffff8088f07b at fast_syscall_common+0xf8 A race condition can occur when allocating a new vnode and adding that vnode to the vfs hash. If the newly created vnode loses the race when being inserted into the vfs hash, it will not be recycled as its usecount is greater than zero, hitting the above assertion. Fix this by dropping the assertion. FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700 Signed-off-by: Rob Wing <rob.wing@klarasystems.com> Sponsored-by: rsync.net Sponsored-by: Klara, Inc.
rob-wing
pushed a commit
to KlaraSystems/zfs
that referenced
this issue
Feb 17, 2023
Under certain loads, the following panic is hit: panic: page fault KDB: stack backtrace: #0 0xffffffff805db025 at kdb_backtrace+0x65 #1 0xffffffff8058e86f at vpanic+0x17f #2 0xffffffff8058e6e3 at panic+0x43 #3 0xffffffff808adc15 at trap_fatal+0x385 openzfs#4 0xffffffff808adc6f at trap_pfault+0x4f openzfs#5 0xffffffff80886da8 at calltrap+0x8 openzfs#6 0xffffffff80669186 at vgonel+0x186 openzfs#7 0xffffffff80669841 at vgone+0x31 openzfs#8 0xffffffff8065806d at vfs_hash_insert+0x26d openzfs#9 0xffffffff81a39069 at sfs_vgetx+0x149 openzfs#10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4 openzfs#11 0xffffffff8065a28c at lookup+0x45c openzfs#12 0xffffffff806594b9 at namei+0x259 openzfs#13 0xffffffff80676a33 at kern_statat+0xf3 openzfs#14 0xffffffff8067712f at sys_fstatat+0x2f openzfs#15 0xffffffff808ae50c at amd64_syscall+0x10c openzfs#16 0xffffffff808876bb at fast_syscall_common+0xf8 The page fault occurs because vgonel() will call VOP_CLOSE() for active vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While here, define vop_open for consistency. After adding the necessary vop, the bug progresses to the following panic: panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1) cpuid = 17 KDB: stack backtrace: #0 0xffffffff805e29c5 at kdb_backtrace+0x65 #1 0xffffffff8059620f at vpanic+0x17f #2 0xffffffff81a27f4a at spl_panic+0x3a #3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40 openzfs#4 0xffffffff8066fdee at vinactivef+0xde openzfs#5 0xffffffff80670b8a at vgonel+0x1ea openzfs#6 0xffffffff806711e1 at vgone+0x31 openzfs#7 0xffffffff8065fa0d at vfs_hash_insert+0x26d openzfs#8 0xffffffff81a39069 at sfs_vgetx+0x149 openzfs#9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4 openzfs#10 0xffffffff80661c2c at lookup+0x45c openzfs#11 0xffffffff80660e59 at namei+0x259 openzfs#12 0xffffffff8067e3d3 at kern_statat+0xf3 openzfs#13 0xffffffff8067eacf at sys_fstatat+0x2f openzfs#14 0xffffffff808b5ecc at amd64_syscall+0x10c openzfs#15 0xffffffff8088f07b at fast_syscall_common+0xf8 This is caused by a race condition that can occur when allocating a new vnode and adding that vnode to the vfs hash. If the newly created vnode loses the race when being inserted into the vfs hash, it will not be recycled as its usecount is greater than zero, hitting the above assertion. Fix this by dropping the assertion. FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700 Signed-off-by: Rob Wing <rob.wing@klarasystems.com> Submitted-by: Klara, Inc. Sponsored-by: rsync.net
behlendorf
pushed a commit
that referenced
this issue
Feb 22, 2023
Under certain loads, the following panic is hit: panic: page fault KDB: stack backtrace: #0 0xffffffff805db025 at kdb_backtrace+0x65 #1 0xffffffff8058e86f at vpanic+0x17f #2 0xffffffff8058e6e3 at panic+0x43 #3 0xffffffff808adc15 at trap_fatal+0x385 #4 0xffffffff808adc6f at trap_pfault+0x4f #5 0xffffffff80886da8 at calltrap+0x8 #6 0xffffffff80669186 at vgonel+0x186 #7 0xffffffff80669841 at vgone+0x31 #8 0xffffffff8065806d at vfs_hash_insert+0x26d #9 0xffffffff81a39069 at sfs_vgetx+0x149 #10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4 #11 0xffffffff8065a28c at lookup+0x45c #12 0xffffffff806594b9 at namei+0x259 #13 0xffffffff80676a33 at kern_statat+0xf3 #14 0xffffffff8067712f at sys_fstatat+0x2f #15 0xffffffff808ae50c at amd64_syscall+0x10c #16 0xffffffff808876bb at fast_syscall_common+0xf8 The page fault occurs because vgonel() will call VOP_CLOSE() for active vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While here, define vop_open for consistency. After adding the necessary vop, the bug progresses to the following panic: panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1) cpuid = 17 KDB: stack backtrace: #0 0xffffffff805e29c5 at kdb_backtrace+0x65 #1 0xffffffff8059620f at vpanic+0x17f #2 0xffffffff81a27f4a at spl_panic+0x3a #3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40 #4 0xffffffff8066fdee at vinactivef+0xde #5 0xffffffff80670b8a at vgonel+0x1ea #6 0xffffffff806711e1 at vgone+0x31 #7 0xffffffff8065fa0d at vfs_hash_insert+0x26d #8 0xffffffff81a39069 at sfs_vgetx+0x149 #9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4 #10 0xffffffff80661c2c at lookup+0x45c #11 0xffffffff80660e59 at namei+0x259 #12 0xffffffff8067e3d3 at kern_statat+0xf3 #13 0xffffffff8067eacf at sys_fstatat+0x2f #14 0xffffffff808b5ecc at amd64_syscall+0x10c #15 0xffffffff8088f07b at fast_syscall_common+0xf8 This is caused by a race condition that can occur when allocating a new vnode and adding that vnode to the vfs hash. If the newly created vnode loses the race when being inserted into the vfs hash, it will not be recycled as its usecount is greater than zero, hitting the above assertion. Fix this by dropping the assertion. FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700 Reviewed-by: Andriy Gapon <avg@FreeBSD.org> Reviewed-by: Mateusz Guzik <mjguzik@gmail.com> Reviewed-by: Alek Pinchuk <apinchuk@axcient.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Rob Wing <rob.wing@klarasystems.com> Co-authored-by: Rob Wing <rob.wing@klarasystems.com> Submitted-by: Klara, Inc. Sponsored-by: rsync.net Closes #14501
behlendorf
pushed a commit
to behlendorf/zfs
that referenced
this issue
May 28, 2023
Under certain loads, the following panic is hit: panic: page fault KDB: stack backtrace: #0 0xffffffff805db025 at kdb_backtrace+0x65 #1 0xffffffff8058e86f at vpanic+0x17f #2 0xffffffff8058e6e3 at panic+0x43 #3 0xffffffff808adc15 at trap_fatal+0x385 #4 0xffffffff808adc6f at trap_pfault+0x4f #5 0xffffffff80886da8 at calltrap+0x8 #6 0xffffffff80669186 at vgonel+0x186 openzfs#7 0xffffffff80669841 at vgone+0x31 openzfs#8 0xffffffff8065806d at vfs_hash_insert+0x26d openzfs#9 0xffffffff81a39069 at sfs_vgetx+0x149 openzfs#10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4 openzfs#11 0xffffffff8065a28c at lookup+0x45c openzfs#12 0xffffffff806594b9 at namei+0x259 openzfs#13 0xffffffff80676a33 at kern_statat+0xf3 openzfs#14 0xffffffff8067712f at sys_fstatat+0x2f openzfs#15 0xffffffff808ae50c at amd64_syscall+0x10c openzfs#16 0xffffffff808876bb at fast_syscall_common+0xf8 The page fault occurs because vgonel() will call VOP_CLOSE() for active vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While here, define vop_open for consistency. After adding the necessary vop, the bug progresses to the following panic: panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1) cpuid = 17 KDB: stack backtrace: #0 0xffffffff805e29c5 at kdb_backtrace+0x65 #1 0xffffffff8059620f at vpanic+0x17f #2 0xffffffff81a27f4a at spl_panic+0x3a #3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40 #4 0xffffffff8066fdee at vinactivef+0xde #5 0xffffffff80670b8a at vgonel+0x1ea #6 0xffffffff806711e1 at vgone+0x31 openzfs#7 0xffffffff8065fa0d at vfs_hash_insert+0x26d openzfs#8 0xffffffff81a39069 at sfs_vgetx+0x149 openzfs#9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4 openzfs#10 0xffffffff80661c2c at lookup+0x45c openzfs#11 0xffffffff80660e59 at namei+0x259 openzfs#12 0xffffffff8067e3d3 at kern_statat+0xf3 openzfs#13 0xffffffff8067eacf at sys_fstatat+0x2f openzfs#14 0xffffffff808b5ecc at amd64_syscall+0x10c openzfs#15 0xffffffff8088f07b at fast_syscall_common+0xf8 This is caused by a race condition that can occur when allocating a new vnode and adding that vnode to the vfs hash. If the newly created vnode loses the race when being inserted into the vfs hash, it will not be recycled as its usecount is greater than zero, hitting the above assertion. Fix this by dropping the assertion. FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700 Reviewed-by: Andriy Gapon <avg@FreeBSD.org> Reviewed-by: Mateusz Guzik <mjguzik@gmail.com> Reviewed-by: Alek Pinchuk <apinchuk@axcient.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Rob Wing <rob.wing@klarasystems.com> Co-authored-by: Rob Wing <rob.wing@klarasystems.com> Submitted-by: Klara, Inc. Sponsored-by: rsync.net Closes openzfs#14501
behlendorf
pushed a commit
that referenced
this issue
May 30, 2023
Under certain loads, the following panic is hit: panic: page fault KDB: stack backtrace: #0 0xffffffff805db025 at kdb_backtrace+0x65 #1 0xffffffff8058e86f at vpanic+0x17f #2 0xffffffff8058e6e3 at panic+0x43 #3 0xffffffff808adc15 at trap_fatal+0x385 #4 0xffffffff808adc6f at trap_pfault+0x4f #5 0xffffffff80886da8 at calltrap+0x8 #6 0xffffffff80669186 at vgonel+0x186 #7 0xffffffff80669841 at vgone+0x31 #8 0xffffffff8065806d at vfs_hash_insert+0x26d #9 0xffffffff81a39069 at sfs_vgetx+0x149 #10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4 #11 0xffffffff8065a28c at lookup+0x45c #12 0xffffffff806594b9 at namei+0x259 #13 0xffffffff80676a33 at kern_statat+0xf3 #14 0xffffffff8067712f at sys_fstatat+0x2f #15 0xffffffff808ae50c at amd64_syscall+0x10c #16 0xffffffff808876bb at fast_syscall_common+0xf8 The page fault occurs because vgonel() will call VOP_CLOSE() for active vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While here, define vop_open for consistency. After adding the necessary vop, the bug progresses to the following panic: panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1) cpuid = 17 KDB: stack backtrace: #0 0xffffffff805e29c5 at kdb_backtrace+0x65 #1 0xffffffff8059620f at vpanic+0x17f #2 0xffffffff81a27f4a at spl_panic+0x3a #3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40 #4 0xffffffff8066fdee at vinactivef+0xde #5 0xffffffff80670b8a at vgonel+0x1ea #6 0xffffffff806711e1 at vgone+0x31 #7 0xffffffff8065fa0d at vfs_hash_insert+0x26d #8 0xffffffff81a39069 at sfs_vgetx+0x149 #9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4 #10 0xffffffff80661c2c at lookup+0x45c #11 0xffffffff80660e59 at namei+0x259 #12 0xffffffff8067e3d3 at kern_statat+0xf3 #13 0xffffffff8067eacf at sys_fstatat+0x2f #14 0xffffffff808b5ecc at amd64_syscall+0x10c #15 0xffffffff8088f07b at fast_syscall_common+0xf8 This is caused by a race condition that can occur when allocating a new vnode and adding that vnode to the vfs hash. If the newly created vnode loses the race when being inserted into the vfs hash, it will not be recycled as its usecount is greater than zero, hitting the above assertion. Fix this by dropping the assertion. FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700 Reviewed-by: Andriy Gapon <avg@FreeBSD.org> Reviewed-by: Mateusz Guzik <mjguzik@gmail.com> Reviewed-by: Alek Pinchuk <apinchuk@axcient.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Rob Wing <rob.wing@klarasystems.com> Co-authored-by: Rob Wing <rob.wing@klarasystems.com> Submitted-by: Klara, Inc. Sponsored-by: rsync.net Closes #14501
EchterAgo
pushed a commit
to EchterAgo/zfs
that referenced
this issue
Aug 4, 2023
Under certain loads, the following panic is hit: panic: page fault KDB: stack backtrace: #0 0xffffffff805db025 at kdb_backtrace+0x65 openzfs#1 0xffffffff8058e86f at vpanic+0x17f openzfs#2 0xffffffff8058e6e3 at panic+0x43 openzfs#3 0xffffffff808adc15 at trap_fatal+0x385 openzfs#4 0xffffffff808adc6f at trap_pfault+0x4f openzfs#5 0xffffffff80886da8 at calltrap+0x8 openzfs#6 0xffffffff80669186 at vgonel+0x186 openzfs#7 0xffffffff80669841 at vgone+0x31 openzfs#8 0xffffffff8065806d at vfs_hash_insert+0x26d openzfs#9 0xffffffff81a39069 at sfs_vgetx+0x149 openzfs#10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4 openzfs#11 0xffffffff8065a28c at lookup+0x45c openzfs#12 0xffffffff806594b9 at namei+0x259 openzfs#13 0xffffffff80676a33 at kern_statat+0xf3 openzfs#14 0xffffffff8067712f at sys_fstatat+0x2f openzfs#15 0xffffffff808ae50c at amd64_syscall+0x10c openzfs#16 0xffffffff808876bb at fast_syscall_common+0xf8 The page fault occurs because vgonel() will call VOP_CLOSE() for active vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While here, define vop_open for consistency. After adding the necessary vop, the bug progresses to the following panic: panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1) cpuid = 17 KDB: stack backtrace: #0 0xffffffff805e29c5 at kdb_backtrace+0x65 openzfs#1 0xffffffff8059620f at vpanic+0x17f openzfs#2 0xffffffff81a27f4a at spl_panic+0x3a openzfs#3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40 openzfs#4 0xffffffff8066fdee at vinactivef+0xde openzfs#5 0xffffffff80670b8a at vgonel+0x1ea openzfs#6 0xffffffff806711e1 at vgone+0x31 openzfs#7 0xffffffff8065fa0d at vfs_hash_insert+0x26d openzfs#8 0xffffffff81a39069 at sfs_vgetx+0x149 openzfs#9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4 openzfs#10 0xffffffff80661c2c at lookup+0x45c openzfs#11 0xffffffff80660e59 at namei+0x259 openzfs#12 0xffffffff8067e3d3 at kern_statat+0xf3 openzfs#13 0xffffffff8067eacf at sys_fstatat+0x2f openzfs#14 0xffffffff808b5ecc at amd64_syscall+0x10c openzfs#15 0xffffffff8088f07b at fast_syscall_common+0xf8 This is caused by a race condition that can occur when allocating a new vnode and adding that vnode to the vfs hash. If the newly created vnode loses the race when being inserted into the vfs hash, it will not be recycled as its usecount is greater than zero, hitting the above assertion. Fix this by dropping the assertion. FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700 Reviewed-by: Andriy Gapon <avg@FreeBSD.org> Reviewed-by: Mateusz Guzik <mjguzik@gmail.com> Reviewed-by: Alek Pinchuk <apinchuk@axcient.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Rob Wing <rob.wing@klarasystems.com> Co-authored-by: Rob Wing <rob.wing@klarasystems.com> Submitted-by: Klara, Inc. Sponsored-by: rsync.net Closes openzfs#14501
This issue was closed.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The major component missing from this port is the ZPL. This is a huge amount of work given the differences between a vnode based VFS like Solaris and Linux's own unique VFS. But this needs to be done and it very achievable now that we have a solid port of the DMU and SPA for the Linux kernel.
The text was updated successfully, but these errors were encountered: