Make ZIL operations on zvols use _by_dnode routines #6058
Conversation
@ryao, thanks for your PR! By analyzing the history of the files in this pull request, we identified @behlendorf, @ahrens and @tuxoko to be potential reviewers.
I am working on bigger changes, but this should help #4880.
Force-pushed from f489809 to 33c316d
@behlendorf The failure is #5195, a pre-existing issue unrelated to the patch here. Also, preliminary testing at Prophetstor with a SLOG device shows a 30% improvement in IOPS from this patch alone. Edit: I had overlooked that I also backported the changes that use _by_dnode for zvol_write and zvol_read. They made little difference in my initial tests (without a SLOG), but I kept them in my local branch. The dnode_hold in dmu_write_bio definitely would have contended with the dnode_hold in the ZIL path. I no longer have access to the SLOG machine used at work, so I cannot isolate this change from those, but I imagine the actual improvement is somewhat less than 30%. Nevertheless, this is a step in the right direction. I believe the improvement comes mainly from letting ZIL batch processing iterate faster.
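In sketch form, the contention being described looks like this, assuming the DMU interfaces of that era (dmu_read_impl here is a hypothetical common helper; the real code may be factored differently). dmu_read() takes and drops a dnode hold on every call, while the _by_dnode variant lets a caller that already holds the dnode, as a zvol does for its whole lifetime, skip the lookup:

/* Per-call path: every read hashes into the dnode and takes a hold. */
int
dmu_read(objset_t *os, uint64_t object, uint64_t offset, uint64_t size,
    void *buf, uint32_t flags)
{
	dnode_t *dn;
	int err;

	err = dnode_hold(os, object, FTAG, &dn);	/* contended point */
	if (err != 0)
		return (err);
	err = dmu_read_impl(dn, offset, size, buf, flags);
	dnode_rele(dn, FTAG);
	return (err);
}

/* Held path: the zvol passes its long-lived dnode straight through. */
int
dmu_read_by_dnode(dnode_t *dn, uint64_t offset, uint64_t size, void *buf,
    uint32_t flags)
{
	return (dmu_read_impl(dn, offset, size, buf, flags));
}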
@sempervictus If you have a synchronous IO intensive workload, this might give you a small boost. You might want to try testing it.
Oh yeah, going into the weekend stack for sure. Thank you sir :-)
@ryao definitely progress in the right direction! There was discussion in PR4802 of taking this further and replacing the dbuf_t with a dnode_t in the zvol_state_t. This would remove the need for the _by_dbuf wrappers, and it might buy you a little more performance since you'll be able to eliminate the DB_DNODE_ENTER/EXIT.
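A sketch of the pattern that would go away (the function names here are hypothetical, and the macro semantics are paraphrased from dbuf.h as commonly understood). Going through the dbuf costs two zrlock operations on the dnode handle per I/O:

static int
zvol_read_via_dbuf(zvol_state_t *zv, uio_t *uio, uint64_t size)
{
	dmu_buf_impl_t *db = (dmu_buf_impl_t *)zv->zv_dbuf;
	dnode_t *dn;
	int err;

	DB_DNODE_ENTER(db);	/* zrl_add() on the dnode handle */
	dn = DB_DNODE(db);	/* dereference the handle */
	err = dmu_read_uio_dnode(dn, uio, size);
	DB_DNODE_EXIT(db);	/* zrl_remove() on the dnode handle */
	return (err);
}

/* With a dnode_t cached in zvol_state_t, the wrapper collapses: */
static int
zvol_read_via_dnode(zvol_state_t *zv, uio_t *uio, uint64_t size)
{
	return (dmu_read_uio_dnode(zv->zv_dn, uio, size));
}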
@ryao let's do this by switching the dbuf_t to a dnode_t in the zvol_state_t and converting all the existing holds.
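In outline, that conversion would look something like the following (the field and tag names are assumptions for illustration, not the final code): swap the struct member, hold the dnode once at open, and drop it at close.

typedef struct zvol_state {
	/* ... */
	dnode_t		*zv_dn;		/* was: dmu_buf_t *zv_dbuf */
	/* ... */
} zvol_state_t;

/* On first open: one long-lived hold instead of a bonus-buffer hold. */
error = dnode_hold(os, ZVOL_OBJ, zv, &zv->zv_dn);

/* Per I/O: hand the held dnode to the _by_dnode routines directly. */
error = dmu_read_by_dnode(zv->zv_dn, offset, bytes, buf, 0);

/* On last close: drop the hold. */
dnode_rele(zv->zv_dn, zv);
zv->zv_dn = NULL;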
@ryao if you want these improvements to make 0.7.0, can you address the review feedback? We're trying to wrap up the release.
@behlendorf I like the idea of switching to the _by_dnode routines. This is now refreshed.
Force-pushed from 0efd61c to 3f8d153
Thanks, looks good. This ends up being a nice simplification too.
@ryao one last rebase to resolve a merge conflict and this should be good to go.
@behlendorf It is rebased.
@ryao thanks, I'll get this merged after the buildbot finishes its work.
Those test failures don't look too good, crash seems to be coming from:
which seems related to this :)
@sempervictus I can see why you thought this might be related since it panicked in
Yeah... that ZIL PR should probably be backed out or fixed ASAP. My builders are all OK, and the first ztest is usually fine, but zloops are all hanging with no stack trace, often on the fletcher inc test according to the log (well, the last output is fletcher; the next one would be fletcher inc). Something in there is not playing well on Linux in ways more subtle than initial automated testing could detect. It also conflicts in what appear to be key areas with @ryao's ZIL PR, which as I understand it has been beaten up on Linux pretty heavily. Maybe we switch em out and do the conflict resolution from upstream to ZoL, as opposed to making @ryao change his code to match something that's broken?
@sempervictus are you able to reproduce those
module/zfs/dmu.c (outdated)
@@ -965,6 +965,9 @@ int
 dmu_read_by_dnode(dnode_t *dn, uint64_t offset, uint64_t size, void *buf,
     uint32_t flags)
 {
+	if (size == 0)
+		return (0);
Why was this added? I don't see why it should be needed here or in the write path.
The previous dmu_read_uio_dbuf() function did it. Upon review, it turns out to be unnecessary. I'll remove it.
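For context, the guard being referred to sat in the old wrapper roughly like this (the body is reconstructed from memory, so treat the exact shape as an assumption). It existed to skip the DB_DNODE_ENTER/EXIT cost for zero-length I/O; since the underlying copy loop already does nothing when size is zero, a _by_dnode entry point doesn't need it:

int
dmu_read_uio_dbuf(dmu_buf_t *zdb, uio_t *uio, uint64_t size)
{
	dmu_buf_impl_t *db = (dmu_buf_impl_t *)zdb;
	dnode_t *dn;
	int err;

	if (size == 0)		/* skip DB_DNODE_ENTER/EXIT for no-op I/O */
		return (0);

	DB_DNODE_ENTER(db);
	dn = DB_DNODE(db);
	err = dmu_read_uio_dnode(dn, uio, size);
	DB_DNODE_EXIT(db);
	return (err);
}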
This continues what was started in 0eef1bd by fully converting zvols to avoid unnecessary dnode_hold() calls. This saves a small amount of CPU time and slightly improves latencies of operations on zvols.
Signed-off-by: Richard Yao <richard.yao@prophetstor.com>
@behlendorf The failures seem to have disappeared. Why they appeared when I pushed this initially is something of a mystery, but we are passing now. I have made all of the changes you requested. It should be fine to merge now.
Thanks @ryao, I'll get this merged. It turns out the original failure reported by @sempervictus wasn't introduced by this PR; I've seen it one other time against master. I believe it was introduced by either commit 1b7c1e5 or possibly 38240eb. As soon as we can determine which commit actually introduced it, we're going to want to either revert that change or, if it's straightforward, fix it. I also resubmitted the two failed test runs for the updated version of this patch since they hit unrelated issues; I just wanted to confirm that before merging.
Description
This continues what was started in 0eef1bd by fully converting zvols to avoid unnecessary dnode_hold() calls. This saves a small amount of CPU time and slightly improves latencies of synchronous operations on zvols.
Motivation and Context
I am working on zvol performance at work. While working on larger changes, I noticed unnecessary overhead that is easy to eliminate. Eliminating it yields small latency improvements.
How Has This Been Tested?
It has been tested against Prophetstor's internal 0.6.5.9 branch using fio:
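The job file itself isn't reproduced in the thread; a synchronous random-write job along these lines (the device path, block size, and queue depths are placeholders, not the original settings) is the kind of workload that exercises the ZIL path this patch touches:

[zvol-sync-randwrite]
; hypothetical stand-in for the unpublished job file
; placeholder zvol device node
filename=/dev/zd0
rw=randwrite
bs=4k
ioengine=libaio
iodepth=32
numjobs=4
direct=1
; O_SYNC: every write commits through the ZIL
sync=1
runtime=60
time_based=1
group_reporting=1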
The test hardware is 12x M630-DC SSDs, 256GB of RAM, and dual Intel(R) Xeon(R) E5-2630 v4 CPUs (20 cores total). I tested on a 500GB zvol. I was careful to wait until the cache was hot enough that we were no longer reading metadata before evaluating the impact.
The branch uses a variation of #5824 (one that predates my first day at my new employer last week), so the numbers are not directly applicable to head, but average latencies dropped by ~15%.