Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ZFS_DEBUG_MODIFY assert in arc_buf_try_copy_decompressed_data #10943

Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 14 additions & 4 deletions module/zfs/arc.c
Original file line number Diff line number Diff line change
Expand Up @@ -1705,6 +1705,7 @@ arc_buf_try_copy_decompressed_data(arc_buf_t *buf)
{
arc_buf_hdr_t *hdr = buf->b_hdr;
boolean_t copied = B_FALSE;
int nbufs = 0;

ASSERT(HDR_HAS_L1HDR(hdr));
ASSERT3P(buf->b_data, !=, NULL);
Expand All @@ -1716,7 +1717,7 @@ arc_buf_try_copy_decompressed_data(arc_buf_t *buf)
if (from == buf) {
continue;
}

nbufs++;
if (!ARC_BUF_COMPRESSED(from)) {
bcopy(from->b_data, buf->b_data, arc_buf_size(buf));
copied = B_TRUE;
Expand All @@ -1728,9 +1729,18 @@ arc_buf_try_copy_decompressed_data(arc_buf_t *buf)
* There were no decompressed bufs, so there should not be a
* checksum on the hdr either.
*/
if (zfs_flags & ZFS_DEBUG_MODIFY)
EQUIV(!copied, hdr->b_l1hdr.b_freeze_cksum == NULL);

if (zfs_flags & ZFS_DEBUG_MODIFY) {
if (nbufs)
EQUIV(!copied, hdr->b_l1hdr.b_freeze_cksum == NULL);
else if (!ARC_BUF_COMPRESSED(buf) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!ARC_BUF_COMPRESSED(buf) is unnecessary since that is already asserted on line 1712.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like there is more to this problem than what is being addressed here. It appears that we've called arc_buf_try_copy_decompressed_data with a buffer which has been frozen and this function is designed to overwrite the data. I don't feel like we need to relax the assertion without understanding why we have a frozen arc_buf_t. It feels like the real fix should be to thaw the arc_buf_t before calling this function but we should understand where the real bug lives.

Any suggestions on diagnosing that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a stack trace from the assertion?

Copy link
Contributor Author

@mattmacy mattmacy Sep 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grwilson This is with latest master. Let me know if there's any other state you'd like to see. Thanks.

#0  __curthread () at /usr/home/matt/devel/freebsd/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=-2121375832) at /usr/home/matt/devel/freebsd/sys/kern/kern_shutdown.c:394
#2  0xffffffff8049fcbe in db_fncall_generic (addr=<optimized out>, rv=<optimized out>, nargs=0, 
    args=<optimized out>) at /usr/home/matt/devel/freebsd/sys/ddb/db_command.c:610
#3  db_fncall (dummy1=<optimized out>, dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>)
    at /usr/home/matt/devel/freebsd/sys/ddb/db_command.c:658
#4  0xffffffff8049f7f0 in db_command (last_cmdp=<optimized out>, cmd_table=<optimized out>, dopager=1)
    at /usr/home/matt/devel/freebsd/sys/ddb/db_command.c:482
#5  0xffffffff8049f54d in db_command_loop () at /usr/home/matt/devel/freebsd/sys/ddb/db_command.c:535
#6  0xffffffff804a2866 in db_trap (type=<optimized out>, code=<optimized out>)
    at /usr/home/matt/devel/freebsd/sys/ddb/db_main.c:270
#7  0xffffffff80c22a24 in kdb_trap (type=3, code=0, tf=<optimized out>)
    at /usr/home/matt/devel/freebsd/sys/kern/subr_kdb.c:699
#8  0xffffffff8101df9e in trap (frame=0xfffffe08730aee70)
    at /usr/home/matt/devel/freebsd/sys/amd64/amd64/trap.c:576
#9  <signal handler called>
#10 kdb_enter (why=0xffffffff811ec400 "panic", msg=<optimized out>)
    at /usr/home/matt/devel/freebsd/sys/kern/subr_kdb.c:486
#11 0xffffffff80bd728e in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /usr/home/matt/devel/freebsd/sys/kern/kern_shutdown.c:902
#12 0xffffffff8235c0ef in spl_panic (file=0xffffffff8275d461 "/home/matt/devel/ZoF/module/zfs/arc.c", 
    func=0xffffffff8279036c "arc_buf_try_copy_decompressed_data", line=1732, 
    fmt=0xffffffff82742983 "(!copied) is equivalent to (hdr->b_l1hdr.b_freeze_cksum == NULL)")
    at /home/matt/devel/ZoF/module/os/freebsd/spl/spl_misc.c:107
#13 0xffffffff823ef79d in arc_buf_try_copy_decompressed_data (buf=0xfffff8008df33f00)
    at /home/matt/devel/ZoF/module/zfs/arc.c:1732
#14 0xffffffff823d61aa in arc_buf_fill (buf=0xfffff8008df33f00, spa=0xfffffe0872fc8000, zb=0xfffffe08730af288, 
    flags=(unknown: 0)) at /home/matt/devel/ZoF/module/zfs/arc.c:2149
#15 0xffffffff823d55a5 in arc_untransform (buf=0xfffff8008df33f00, spa=0xfffffe0872fc8000, 
    zb=0xfffffe08730af288, in_place=0) at /home/matt/devel/ZoF/module/zfs/arc.c:2207
#16 0xffffffff82403fb4 in dbuf_read (db=0xfffff8008dac8990, zio=0x0, flags=10)
    at /home/matt/devel/ZoF/module/zfs/dbuf.c:1637
#17 0xffffffff824278fe in dmu_buf_hold (os=0xfffff8008df5e000, object=1, offset=0, tag=0xffffffff8277e210, 
    dbp=0xfffffe08730af3d8, flags=1) at /home/matt/devel/ZoF/module/zfs/dmu.c:243
#18 0xffffffff825f629f in zap_lockdir (os=0xfffff8008df5e000, obj=1, tx=0x0, lti=RW_READER, fatreader=1, 
    adding=0, tag=0xffffffff8277e210, zapp=0xfffffe08730af478)
    at /home/matt/devel/ZoF/module/zfs/zap_micro.c:613
#19 0xffffffff825f7c96 in zap_lookup_norm (os=0xfffff8008df5e000, zapobj=1, 
    name=0xffffffff827596ca "normalization", integer_size=8, num_integers=1, buf=0xfffffe08730af5d8, 
    mt=(unknown: 0), realname=0x0, rn_len=0, ncp=0x0) at /home/matt/devel/ZoF/module/zfs/zap_micro.c:1015
#20 0xffffffff825f7beb in zap_lookup (os=0xfffff8008df5e000, zapobj=1, name=0xffffffff827596ca "normalization", 
    integer_size=8, num_integers=1, buf=0xfffffe08730af5d8) at /home/matt/devel/ZoF/module/zfs/zap_micro.c:962
#21 0xffffffff8238ac49 in zfs_get_zplprop (os=0xfffff8008df5e000, prop=ZFS_PROP_NORMALIZE, 
    value=0xfffffe08730af5d8) at /home/matt/devel/ZoF/module/os/freebsd/zfs/zfs_vfsops.c:2203
#22 0xffffffff82622923 in nvl_add_zplprop (os=0xfffff8008df5e000, props=0xfffff80008d98f60, 
    prop=ZFS_PROP_NORMALIZE) at /home/matt/devel/ZoF/module/zfs/zfs_ioctl.c:2136
#23 0xffffffff8261c9f5 in zfs_ioc_objset_zplprops (zc=0xfffffe0876608000)
    at /home/matt/devel/ZoF/module/zfs/zfs_ioctl.c:2175
#24 0xffffffff82613228 in zfsdev_ioctl_common (vecnum=19, zc=0xfffffe0876608000, flag=0)
    at /home/matt/devel/ZoF/module/zfs/zfs_ioctl.c:7547
#25 0xffffffff8236b1d4 in zfsdev_ioctl (dev=0xfffff8008df8dc00, zcmd=3222821395, arg=0xfffffe08730afa20 "\017", 
    flag=3, td=0xfffffe0872d79000) at /home/matt/devel/ZoF/module/os/freebsd/zfs/kmod_core.c:170
#26 0xffffffff80a7b41c in devfs_ioctl (ap=0xfffffe08730af878)
    at /usr/home/matt/devel/freebsd/sys/fs/devfs/devfs_vnops.c:944
#27 0xffffffff80ccc432 in vn_ioctl (fp=0xfffff8000c0066e0, com=<optimized out>, data=0xfffffe08730afa20, 
    active_cred=0xfffff800ad0f4c00, td=0x304f) at /usr/home/matt/devel/freebsd/sys/kern/vfs_vnops.c:1602
#28 0xffffffff80a7ba8e in devfs_ioctl_f (fp=0xffffffff81c8cbc8 <cnputs_mtx>, com=128, data=0xfffffe08730aee30, 
    cred=0xffffffff811d0a35, td=0xfffffe0872d79000)
    at /usr/home/matt/devel/freebsd/sys/fs/devfs/devfs_vnops.c:875
#29 0xffffffff80c4a01e in fo_ioctl (fp=<optimized out>, com=3222821395, data=0x1c200020, 
    active_cred=0xffffffff811d0a35, td=0xfffffe0872d79000) at /usr/home/matt/devel/freebsd/sys/sys/file.h:340
#30 kern_ioctl (td=<optimized out>, fd=<optimized out>, com=<optimized out>, 
    data=0x1c200020 <error: Cannot access memory at address 0x1c200020>)
    at /usr/home/matt/devel/freebsd/sys/kern/sys_generic.c:802
#31 0xffffffff80c49ce7 in sys_ioctl (td=0xfffffe0872d79000, uap=0xfffffe0872d793e8)
    at /usr/home/matt/devel/freebsd/sys/kern/sys_generic.c:710
#32 0xffffffff8101ede0 in syscallenter (td=<optimized out>)
    at /usr/home/matt/devel/freebsd/sys/amd64/amd64/../../kern/subr_syscall.c:162
#33 amd64_syscall (td=0xfffffe0872d79000, traced=0) at /usr/home/matt/devel/freebsd/sys/amd64/amd64/trap.c:1156
#34 <signal handler called>
#35 0x000000080084ccba in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffffffc688
#14 0xffffffff823d61aa in arc_buf_fill (buf=0xfffff8008df33f00, spa=0xfffffe0872fc8000, zb=0xfffffe08730af288, 
    flags=(unknown: 0)) at /home/matt/devel/ZoF/module/zfs/arc.c:2149
2149			if (arc_buf_try_copy_decompressed_data(buf)) {
2144			/*
2145			 * Try copying the data from another buf which already has a
2146			 * decompressed version. If that's not possible, it's time to
2147			 * bite the bullet and decompress the data from the hdr.
2148			 */
2149			if (arc_buf_try_copy_decompressed_data(buf)) {
2150				/* Skip byteswapping and checksumming (already done) */
2151				return (0);
2152			} else {
2153				error = zio_decompress_data(HDR_GET_COMPRESS(hdr),
#13 0xffffffff823ef79d in arc_buf_try_copy_decompressed_data (buf=0xfffff8008df33f00)
    at /home/matt/devel/ZoF/module/zfs/arc.c:1732
1732			EQUIV(!copied, hdr->b_l1hdr.b_freeze_cksum == NULL);
1722				copied = B_TRUE;
1723				break;
1724			}
1725		}
1726	
1727		/*
1728		 * There were no decompressed bufs, so there should not be a
1729		 * checksum on the hdr either.
1730		 */
1731		if (zfs_flags & ZFS_DEBUG_MODIFY)
1732			EQUIV(!copied, hdr->b_l1hdr.b_freeze_cksum == NULL);
1733	
1734		return (copied);
1735	}
1736	
1737	/*
1738	 * Allocates an ARC buf header that's in an evicted & L2-cached state.
1739	 * This is used during l2arc reconstruction to make empty ARC buffers
1740	 * which circumvent the regular disk->arc->l2arc path and instead come
1741	 * into being in the reverse order, i.e. l2arc->arc.
$1 = 0
$2 = (zio_cksum_t *) 0xfffff80007617520
quit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grwilson poke

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check to see if the dbuf that is passed into arc_untransform is the same as dbuf->db_dnode_handle->dnh_dnode->dn_dbuf?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#17 0xffffffff824278fe in dmu_buf_hold (os=0xfffff8000d530800, object=1, offset=0, tag=0xffffffff8277e1f6, 
    dbp=0xfffffe08721ab3d8, flags=1) at /home/matt/devel/ZoF/module/zfs/dmu.c:243
243			err = dbuf_read(db, NULL, db_flags);
(kgdb) down
#16 0xffffffff82403fb4 in dbuf_read (db=0xfffff80187b6a4c8, zio=0x0, flags=10)
    at /home/matt/devel/ZoF/module/zfs/dbuf.c:1637
1637				err = arc_untransform(db->db_buf, spa, &zb, B_FALSE);
(kgdb) list
1632				zbookmark_phys_t zb;
1633	
1634				SET_BOOKMARK(&zb, dmu_objset_id(db->db_objset),
1635				    db->db.db_object, db->db_level, db->db_blkid);
1636				dbuf_fix_old_data(db, spa_syncing_txg(spa));
1637				err = arc_untransform(db->db_buf, spa, &zb, B_FALSE);
1638				dbuf_set_data(db, db->db_buf);
1639			}
1640			mutex_exit(&db->db_mtx);
1641			if (err == 0 && prefetch) {
(kgdb) p db->db_dnode_handle->dnh_dnode->dn_dbuf
$3 = (struct dmu_buf_impl *) 0xfffff80187b6a660
(kgdb) p db
$4 = (dmu_buf_impl_t *) 0xfffff80187b6a4c8

hdr->b_l1hdr.b_freeze_cksum != NULL) {
zio_cksum_t cksum;
fletcher_2_native(buf->b_data, arc_buf_size(buf), NULL,
&cksum);
ASSERT0(memcmp(&cksum, hdr->b_l1hdr.b_freeze_cksum,
sizeof (zio_cksum_t)));
}
}
return (copied);
}

Expand Down
3 changes: 2 additions & 1 deletion module/zfs/dbuf_stats.c
Original file line number Diff line number Diff line change
Expand Up @@ -134,7 +134,8 @@ dbuf_stats_hash_table_data(char *buf, size_t size, void *data)

ASSERT3S(dsh->idx, >=, 0);
ASSERT3S(dsh->idx, <=, h->hash_table_mask);
memset(buf, 0, size);
if (size)
buf[0] = 0;

mutex_enter(DBUF_HASH_MUTEX(h, dsh->idx));
for (db = h->hash_table[dsh->idx]; db != NULL; db = db->db_hash_next) {
Expand Down