
CastTable perf tweaks. #34427

Merged
merged 2 commits into dotnet:master from castPerf
Apr 20, 2020

Conversation

VSadov
Member

@VSadov VSadov commented Apr 1, 2020

A few tweaks to the cast table implementation.

Nothing changed algorithmically. These are small tweaks to help compilers emit better code.

  • use a small sentinel table that never contains elements, both for the initial table and for flushing (eliminates the null check in Get)
  • reduce indirections and register pressure by operating on a ref to the tableData (the first element) instead of the whole table when iterating.
  • the above also ensures that hashShift is stored at offset 0 from the tableData (simpler address math in the most common path).
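The first two tweaks can be sketched together. The snippet below is an illustrative simplification, not the actual runtime code: the names `TableData`, `TryGet`, and the inline entry storage are assumptions, and it shows only why a sentinel table removes the null check from the lookup path.

```cpp
#include <cassert>
#include <cstdint>
#include <cstddef>

// Illustrative sketch (not the actual runtime code) of the sentinel-table
// trick: the table pointer always refers to a valid table, so the lookup
// path needs no null check. The initial table (and the one installed on
// flush) is a sentinel that can never contain elements.
struct TableData
{
    uint32_t hashShift;     // at offset 0: simplest address math on the hot path
    uint32_t size;          // number of buckets; 0 for the sentinel
    intptr_t entries[4];    // inline key storage (heavily simplified)
};

static TableData  s_sentinel = {};          // zero buckets: every probe misses
static TableData* s_table    = &s_sentinel;

static bool TryGet(intptr_t key)
{
    // Operate on a ref to the table data rather than re-reading s_table:
    // fewer indirections and less register pressure while probing.
    TableData& data = *s_table;
    if (data.size == 0)
        return false;       // sentinel: guaranteed miss, no null check needed
    size_t index = ((size_t)key >> data.hashShift) & (data.size - 1);
    return data.entries[index] == key;
}
```

Flushing the cache then reduces to storing `&s_sentinel` into `s_table`; readers never see a null pointer.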

I see better codegen in both managed and C++ code. This results in a 12% improvement on directed microbenchmarks, such as invoking a method that casts List<string> to IReadOnlyCollection<object> in a loop.

(Ex: 200,000,000 iterations go from 894ms to 794ms.)

This is not enough to switch ordinary interface and class casts to cache lookup. A linear scan of the interfaces is still faster, at least for common cases involving fewer than 4-6 interfaces. The same goes for looking through base classes.
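As a hedged illustration of the alternative that still wins: for a type implementing only a few interfaces, scanning its interface map is branch-predictable and typically touches a single cache line, which tends to beat even a fast hash probe. The names below are made up for the sketch and do not reflect the runtime's real type layout.

```cpp
#include <cassert>

// Illustrative sketch: an interface cast answered by a linear scan of the
// type's interface map. For the common case (< 4-6 interfaces) this short
// loop is cheaper than a cache lookup.
struct TypeSketch
{
    int         numInterfaces;
    const void* interfaceMap[8];    // opaque pointers identifying each interface
};

static bool ImplementsInterface(const TypeSketch& t, const void* iface)
{
    for (int i = 0; i < t.numInterfaces; i++)   // typically just a few iterations
        if (t.interfaceMap[i] == iface)
            return true;
    return false;
}
```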

This is still an improvement.

Other things tried (unsuccessfully):

  • using simpler hash functions
    None provided a noticeable gain, while collisions generally increased. The current hash seems to be very good for the data in use.
  • using a fixed-size table
    Provided some gains (5% or so, depending on scenario). But a fixed size means committing to the maximum size right from the start. The gains are not big enough, IMO, to justify that.
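For context on the hash-function experiments, the sketch below shows a Fibonacci-style multiplicative hash of the kind a cast cache might use; this is an assumed stand-in, not the runtime's actual hash function. It illustrates why "simpler" hashes tend to collide more on this workload: the keys are pointer-like values that share bit patterns, and multiplying by a large odd constant mixes all input bits into the top bits, which `hashShift` then selects.

```cpp
#include <cassert>
#include <cstdint>

// Assumed stand-in for a cast-cache hash (not the runtime's real function):
// a Fibonacci-style multiplicative hash where hashShift picks the top bits
// of the product as the bucket index.
static uint32_t HashPointers(uint64_t source, uint64_t target, uint32_t hashShift)
{
    uint64_t key = source ^ (target << 1);          // combine the two handles
    uint64_t h   = key * 11400714819323198485ull;   // ~2^64 / golden ratio (odd)
    return (uint32_t)(h >> hashShift);              // top bits index the table
}
```

A larger `hashShift` yields fewer index bits, so the same function serves every table size without rehashing logic.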

Considered:

  • using "colored" pointers for the source and destination handles (the version is embedded in the upper bits of the source/destination values).
    This would make the table 1.5x denser, since there is no need for an extra version field, and it would eliminate any need for synchronization. It is, however, only feasible on 64-bit; 32-bit would need a separate implementation. I doubt the gains would be big enough to justify the trouble of dual implementations.
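The colored-pointer idea can be sketched as follows. The bit widths and names are assumptions for illustration only; the point is that on 64-bit, the upper bits of a handle are unused, so a version can ride along in the value itself, making a separate version field (and synchronization around it) unnecessary.

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the "colored pointer" variant considered (and rejected) above.
// kVersionBits is an assumed width, chosen only for illustration.
constexpr int      kVersionBits = 16;
constexpr int      kValueBits   = 64 - kVersionBits;
constexpr uint64_t kValueMask   = (uint64_t(1) << kValueBits) - 1;

static uint64_t Color(uint64_t handle, uint64_t version)
{
    // Pack the version into the otherwise-unused upper bits of the handle.
    return (handle & kValueMask) | (version << kValueBits);
}

static uint64_t HandleOf(uint64_t colored)  { return colored & kValueMask; }
static uint64_t VersionOf(uint64_t colored) { return colored >> kValueBits; }
```

On 32-bit there are no spare upper bits, which is why this scheme would force a second, separate implementation.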

@VSadov VSadov added NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) NO-REVIEW Experimental/testing PR, do NOT review it area-VM-coreclr labels Apr 1, 2020
@dotnet dotnet deleted a comment from Dotnet-GitSync-Bot Apr 3, 2020
@VSadov VSadov changed the title [WIP] CastTable perf tweaks. CastTable perf tweaks. Apr 20, 2020
@VSadov VSadov removed NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) NO-REVIEW Experimental/testing PR, do NOT review it labels Apr 20, 2020
@VSadov VSadov marked this pull request as ready for review April 20, 2020 02:11
@VSadov VSadov requested a review from jkotas April 20, 2020 02:13
@VSadov
Member Author

VSadov commented Apr 20, 2020

Thanks!!

@VSadov VSadov merged commit aa5b204 into dotnet:master Apr 20, 2020
@VSadov VSadov deleted the castPerf branch April 20, 2020 02:49
@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2020