use _by_dnode() routines for additional performance wins #4802
Add *_by_dnode() routines for accessing objects given their dnode_t *, this is more efficient than accessing the object by (objset_t *, uint64_t object). This change converts some but not all of the existing consumers. As performance-sensitive code paths are discovered they should be converted to use these routines.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Closes #5534
Issue #4802
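To make the difference between the two calling conventions concrete, here is a minimal sketch in plain C. It is not the DMU code: every `toy_*` name, the fixed-size object array, and the lookup counter are assumptions made up for illustration. It only shows the shape of the idea, where the by-object entry point pays for a lookup on every call and the by-dnode entry point does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for the real ZFS types (hypothetical, illustration only). */
typedef struct toy_dnode {
	uint64_t dn_object;	/* object number */
	uint64_t dn_size;	/* some per-object state touched by a write */
} toy_dnode_t;

#define	TOY_NOBJ 8

typedef struct toy_objset {
	toy_dnode_t os_dnodes[TOY_NOBJ];
	unsigned long os_lookups;	/* number of (objset, object) lookups */
} toy_objset_t;

/* Populate the toy objset with objects numbered 1..TOY_NOBJ. */
static void
toy_objset_init(toy_objset_t *os)
{
	os->os_lookups = 0;
	for (size_t i = 0; i < TOY_NOBJ; i++) {
		os->os_dnodes[i].dn_object = i + 1;
		os->os_dnodes[i].dn_size = 0;
	}
}

/* Resolve an object number to its dnode; this is the cost we want to avoid. */
static toy_dnode_t *
toy_dnode_lookup(toy_objset_t *os, uint64_t object)
{
	os->os_lookups++;
	for (size_t i = 0; i < TOY_NOBJ; i++) {
		if (os->os_dnodes[i].dn_object == object)
			return (&os->os_dnodes[i]);
	}
	return (NULL);
}

/* By-dnode variant: the caller already holds the dnode, so no lookup. */
static void
toy_write_by_dnode(toy_dnode_t *dn, uint64_t len)
{
	dn->dn_size += len;
}

/* By-object variant: a thin wrapper that looks up, then calls by-dnode. */
static int
toy_write(toy_objset_t *os, uint64_t object, uint64_t len)
{
	toy_dnode_t *dn = toy_dnode_lookup(os, object);

	if (dn == NULL)
		return (-1);
	toy_write_by_dnode(dn, len);
	return (0);
}
```

A caller that issues many writes to the same object can do one `toy_dnode_lookup()` up front and then call `toy_write_by_dnode()` in the loop, which is the pattern the commit message asks performance-sensitive consumers to adopt.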
@ahrens and @behlendorf, could we use "dmu_tx_hold_write_by_dnode" to replace "dmu_tx_hold_write" in zvol_write?
@wangdbang we can. We can also make the same change to |
@behlendorf, thank you for the reply. Another question: could the tx (dmu_tx_t) be a member of the zvol, to avoid calling dmu_tx_create on every zvol_write? In the same way, if we use "dmu_tx_hold_write_by_dnode" to replace "dmu_tx_hold_write", could the dnode be a member of the zvol? Is that possible?
@wangdbang we should be able to safely add the dnode_t for the DMU_OST_ZVOL object to zvol_state_t as long as we keep a hold on it. A similar thing is already done for the bonus dbuf, zv->zv_dbuf. The tx is another story. We need to create a new one for every write for the ZIL to function as designed. By chance have you done any profiling of how expensive the dnode lookup is here?
dmu_tx_hold_write_by_dnode() from zvol should be possible. dmu_tx_create/assign/commit() for each write is fundamental (not just for the ZIL).
@behlendorf, thanks. I did not evaluate the cost of the dnode lookup; it was just a suggestion, from reading the code, to add the dnode to zvol_state_t as a member and avoid a dnode lookup on every zvol_write. I'm a new user of ZoL and am studying the code as a beginner. Thanks again.
@ahrens and @behlendorf, if you have a patch, I would like to verify it.
@ahrens and @behlendorf, to evaluate zvol_write performance, my rough idea is this: add a dnode pointer variable zv_dn to zvol_state, assign it through dnode_hold on the first call to zvol_write, and then call dmu_tx_hold_write_by_dnode in place of dmu_tx_hold_write. Is that possible? If it is, I'll try it.
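The proposal above (hold the dnode once, keep it in the zvol state, release it when the zvol goes away) can be sketched as a small self-contained model. Nothing here is the real zvol code: the `toy_*` types and functions, the counters, and the close routine are all hypothetical, and the lazy initialization is shown without the locking a real multi-threaded write path would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a dnode with a simple hold count. */
typedef struct toy_dnode {
	uint64_t dn_object;
	int dn_holds;
} toy_dnode_t;

/* Hypothetical stand-in for zvol_state_t with a cached dnode pointer. */
typedef struct toy_zvol_state {
	toy_dnode_t *zv_backing;	/* what a lookup would resolve to */
	toy_dnode_t *zv_dn;		/* cached, held dnode; NULL until first write */
	unsigned long zv_lookups;	/* how many times we had to look it up */
	unsigned long zv_writes;
} toy_zvol_state_t;

/* Models the expensive lookup-and-hold step. */
static toy_dnode_t *
toy_dnode_hold(toy_zvol_state_t *zv)
{
	zv->zv_lookups++;
	zv->zv_backing->dn_holds++;
	return (zv->zv_backing);
}

/* Drop a hold taken by toy_dnode_hold(). */
static void
toy_dnode_rele(toy_dnode_t *dn)
{
	assert(dn->dn_holds > 0);
	dn->dn_holds--;
}

/*
 * Write path: take the hold on first use only (no locking in this sketch).
 * A real write would still create, assign and commit a fresh transaction
 * on every call; only the dnode lookup is amortized.
 */
static void
toy_zvol_write(toy_zvol_state_t *zv)
{
	if (zv->zv_dn == NULL)
		zv->zv_dn = toy_dnode_hold(zv);
	zv->zv_writes++;
}

/* Teardown: the cached pointer is only valid while the hold is kept. */
static void
toy_zvol_close(toy_zvol_state_t *zv)
{
	if (zv->zv_dn != NULL) {
		toy_dnode_rele(zv->zv_dn);
		zv->zv_dn = NULL;
	}
}
```

Taking the hold at open time instead of lazily in the write path would avoid the first-call check entirely; the lazy form is shown only because it matches the proposal as written.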
@wangdbang adding When benchmarking the writes I'd suggest using a small block size since that's the workload we'd expect to benefit most from this optimization. Thanks for looking at this! |
@behlendorf, thanks for your suggestion. I'll try it and share the results if I make any progress.
@behlendorf and @ahrens, here is the rough patch, based on v0.7.0-rc3.
Here are the 8k test results with Orion.
2nd run: 3rd run:
With the patch applied (dnode added to zvol_state to avoid calling dnode_hold on every zvol_write):
2nd run: 3rd run:
@ahrens and @behlendorf, it seems that performance dropped slightly. Do you have any suggestions? Thanks.
@wli5, should I use the code that you linked?
@wangdbang I didn't contribute to that code.
The zvol numbers look close enough that the difference is within the margin of error. One thing to note is that we already use zv_dbuf to hold the bonus buffer for a similar reason, so if we switch to zv_dn, we should get rid of zv_dbuf.
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
#4641 recommends adding *_by_dnode() routines for accessing objects given their dnode_t*, which is more efficient than accessing the object by (objset_t*, uint64_t object). We should convert additional performance-sensitive code paths to use these routines as we discover them.

For the concurrent file creation microbenchmark, the following additional code paths spend a lot of time in dnode_hold(), and could benefit from conversion:
It may be inconvenient to keep track of the dnode_t to supply in some of these code paths (or others). In that case we should consider implementing a "dnode cache": a global hashtable which maps from <objset_t*, objectID> to <dnode_t*>. This would avoid contention on the dn_mtx for dnodes that have been recently accessed. In terms of cost/reward, the cost of fixing this is medium, and the benefit is medium.

Analysis sponsored by Intel Corp.
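As a rough illustration of the "dnode cache" idea, the following sketch maps an (objset pointer, object ID) pair to a dnode pointer using a small direct-mapped table. It is an assumption-laden toy, not a proposed implementation: the `dnc_*` names, the table size, and the hash mixing are invented here, and it leaves out the locking, hold management, and eviction policy that a real global cache would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for objset_t and dnode_t. */
typedef struct toy_objset { int os_id; } toy_objset_t;
typedef struct toy_dnode { uint64_t dn_object; } toy_dnode_t;

#define	DNC_BUCKETS 64	/* must be a power of two */

typedef struct dnc_entry {
	const toy_objset_t *de_os;	/* key, part 1 */
	uint64_t de_object;		/* key, part 2 */
	toy_dnode_t *de_dn;		/* value; NULL means the slot is empty */
} dnc_entry_t;

typedef struct dnode_cache {
	dnc_entry_t dc_table[DNC_BUCKETS];
} dnode_cache_t;

/* Start with every slot empty. */
static void
dnc_init(dnode_cache_t *dc)
{
	memset(dc, 0, sizeof (*dc));
}

/* Mix the objset pointer and object ID into a slot index. */
static size_t
dnc_hash(const toy_objset_t *os, uint64_t object)
{
	uint64_t h = (uint64_t)(uintptr_t)os;

	h ^= object + 0x9e3779b97f4a7c15ULL + (h << 6) + (h >> 2);
	return ((size_t)(h & (DNC_BUCKETS - 1)));
}

/* Direct-mapped insert: overwrites whatever occupied the slot before. */
static void
dnc_insert(dnode_cache_t *dc, const toy_objset_t *os, uint64_t object,
    toy_dnode_t *dn)
{
	dnc_entry_t *de = &dc->dc_table[dnc_hash(os, object)];

	de->de_os = os;
	de->de_object = object;
	de->de_dn = dn;
}

/* Returns the cached dnode, or NULL on a miss (caller falls back to a hold). */
static toy_dnode_t *
dnc_lookup(const dnode_cache_t *dc, const toy_objset_t *os, uint64_t object)
{
	const dnc_entry_t *de = &dc->dc_table[dnc_hash(os, object)];

	if (de->de_dn != NULL && de->de_os == os && de->de_object == object)
		return (de->de_dn);
	return (NULL);
}
```

A direct-mapped table keeps the sketch short and makes a miss cheap, at the cost of evicting on every collision; a chained or set-associative table would be the obvious alternative if recently used dnodes collide often.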