list a zfs directory hang #1930
Comments
This may be related to #1890. Sorry, I don't have a quick fix for you, although I suspect this only impacts a very small number of files. If you can identify them and quarantine them for now, you can avoid this issue. This isn't a new issue, just a rare one, so I doubt it was related to the system upgrade.
Thanks for your response. Now I did
Update. This issue happened again after did
@behlendorf do you know how to find those problematic directories? This is now happening more frequently and has left us unable to work. Thanks.
@lidaof My only quick suggestion is to try the following patch. It detects the error and, instead of making it fatal, returns EINVAL to the higher layers. This may allow you to get EINVAL errors just for the offending files instead of crashing the node.
diff --git a/module/zfs/sa.c b/module/zfs/sa.c
index 117d386..eaedb53 100644
--- a/module/zfs/sa.c
+++ b/module/zfs/sa.c
@@ -1300,7 +1300,11 @@ sa_build_index(sa_handle_t *hdl, sa_buf_type_t buftype)
     /* only check if not old znode */
     if (IS_SA_BONUSTYPE(bonustype) && sa_hdr_phys->sa_magic != SA_MAGIC &&
         sa_hdr_phys->sa_magic != 0) {
-        VERIFY(BSWAP_32(sa_hdr_phys->sa_magic) == SA_MAGIC);
+        if (BSWAP_32(sa_hdr_phys->sa_magic) != SA_MAGIC) {
+            mutex_exit(&sa->sa_lock);
+            return (EINVAL);
+        }
+
         sa_byteswap(hdl, buftype);
     }
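For readers who want to see the logic of the patch in isolation, here is a minimal user-space sketch of the same check. It is not the ZFS code: the names `SA_MAGIC_SKETCH`, `bswap32_sketch`, and `sa_check_magic_sketch` are hypothetical, and the magic value 0x2F505A is an assumption that should be checked against `sa_impl.h` in your tree. The idea is the one described above: a header whose magic matches neither the native nor the byte-swapped value is reported as EINVAL rather than triggering a fatal VERIFY.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value of SA_MAGIC; verify against sa_impl.h before relying on it. */
#define SA_MAGIC_SKETCH 0x2F505AU

/* Portable 32-bit byte swap, standing in for the kernel's BSWAP_32 macro. */
static uint32_t bswap32_sketch(uint32_t x)
{
    return ((x & 0x000000FFU) << 24) | ((x & 0x0000FF00U) << 8) |
           ((x & 0x00FF0000U) >> 8)  | ((x & 0xFF000000U) >> 24);
}

/*
 * Classify a header magic value.
 * Returns 0 and sets *needs_swap to 0 when the magic is native or zero,
 * returns 0 and sets *needs_swap to 1 when it matches after a byte swap,
 * and returns EINVAL (leaving *needs_swap at 0) when it matches neither.
 */
static int sa_check_magic_sketch(uint32_t magic, int *needs_swap)
{
    assert(needs_swap != NULL);
    *needs_swap = 0;

    /* Same skip condition as the patched code: native magic or zero. */
    if (magic == SA_MAGIC_SKETCH || magic == 0)
        return 0;

    /* Previously a fatal VERIFY; here a recoverable error for the caller. */
    if (bswap32_sketch(magic) != SA_MAGIC_SKETCH)
        return EINVAL;

    *needs_swap = 1;
    return 0;
}
```

A caller would invoke `sa_check_magic_sketch(magic, &swap)` and, on a nonzero return, propagate the error upward so that only the affected file fails. In the real patch the function also drops `sa->sa_lock` before returning, which this sketch does not model.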
@behlendorf Thank you very much.
There have been several fixes applied to master regarding corrupted SA which could have caused this issue. Since those problems have been resolved I'm closing this issue.
Hi All,
I recently did a system upgrade and seem to have run into a problem with my system.
I am unable to list some directories, and there are lots of errors in syslog, shown below.
I am using Ubuntu 12.04.3 LTS. Does anyone have suggestions for fixing it?
Please let me know if you need more information. Many thanks.
Dec 5 09:27:16 corona kernel: [ 956.594281] VERIFY(BSWAP_32(sa_hdr_phys->sa_magic) == SA_MAGIC) failed
Dec 5 09:27:16 corona kernel: [ 956.594720] SPLError: 3729:0:(sa.c:1303:sa_build_index()) SPL PANIC
Dec 5 09:27:16 corona kernel: [ 956.595076] SPL: Showing stack for process 3729
Dec 5 09:27:16 corona kernel: [ 956.595081] Pid: 3729, comm: ls Tainted: P O 3.2.0-23-generic #36-Ubuntu
Dec 5 09:27:16 corona kernel: [ 956.595083] Call Trace:
Dec 5 09:27:16 corona kernel: [ 956.595103] [] spl_debug_dumpstack+0x27/0x40 [spl]
Dec 5 09:27:16 corona kernel: [ 956.595111] [] spl_debug_bug+0x82/0xe0 [spl]
Dec 5 09:27:16 corona kernel: [ 956.595153] [] sa_build_index+0x10e/0x110 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595187] [] sa_handle_get_from_db+0xda/0x120 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595224] [] zfs_znode_sa_init.isra.7+0x9f/0xd0 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595260] [] zfs_znode_alloc+0xdc/0x540 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595296] [] ? zio_wait+0x12d/0x1c0 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595317] [] ? dbuf_read+0x337/0x860 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595337] [] ? dbuf_create+0x325/0x370 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595345] [] ? mutex_lock+0x1d/0x50
Dec 5 09:27:16 corona kernel: [ 956.595352] [] ? default_spin_lock_flags+0x9/0x10
Dec 5 09:27:16 corona kernel: [ 956.595375] [] ? dmu_object_info_from_dnode+0x144/0x1b0 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595411] [] zfs_zget+0x168/0x200 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595447] [] ? zap_lookup_norm+0xd1/0x1c0 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595482] [] zfs_dirent_lock+0x4c3/0x5d0 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595518] [] zfs_dirlook+0x8b/0x300 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595554] [] ? zfs_zaccess+0x9d/0x430 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595565] [] ? tsd_exit+0x2a0/0x2d0 [spl]
Dec 5 09:27:16 corona kernel: [ 956.595601] [] zfs_lookup+0x2e1/0x330 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595636] [] zpl_lookup+0x78/0xf0 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595641] [] ? _raw_spin_lock+0xe/0x20
Dec 5 09:27:16 corona kernel: [ 956.595646] [] d_alloc_and_lookup+0x45/0x90
Dec 5 09:27:16 corona kernel: [ 956.595653] [] ? d_lookup+0x35/0x60
Dec 5 09:27:16 corona kernel: [ 956.595657] [] do_lookup+0x202/0x310
Dec 5 09:27:16 corona kernel: [ 956.595661] [] ? dput+0x1e6/0x290
Dec 5 09:27:16 corona kernel: [ 956.595665] [] path_lookupat+0x11c/0x750
Dec 5 09:27:16 corona kernel: [ 956.595673] [] ? __strncpy_from_user+0x27/0x60
Dec 5 09:27:16 corona kernel: [ 956.595677] [] do_path_lookup+0x31/0xc0
Dec 5 09:27:16 corona kernel: [ 956.595681] [] user_path_at_empty+0x59/0xa0
Dec 5 09:27:16 corona kernel: [ 956.595717] [] ? zfs_getattr_fast+0xd9/0x160 [zfs]
Dec 5 09:27:16 corona kernel: [ 956.595721] [] ? _raw_spin_lock+0xe/0x20
Dec 5 09:27:16 corona kernel: [ 956.595728] [] ? cp_new_stat+0xf8/0x110
Dec 5 09:27:16 corona kernel: [ 956.595732] [] user_path_at+0x11/0x20
Dec 5 09:27:16 corona kernel: [ 956.595736] [] vfs_fstatat+0x3a/0x70
Dec 5 09:27:16 corona kernel: [ 956.595740] [] vfs_lstat+0x1e/0x20
Dec 5 09:27:16 corona kernel: [ 956.595744] [] sys_newlstat+0x1a/0x40
Dec 5 09:27:16 corona kernel: [ 956.595750] [] system_call_fastpath+0x16/0x1b