Skip to content

Commit

Permalink
Make til::color's COLORREF conversion more optimal (#7619)
Browse files Browse the repository at this point in the history
Clang (10) has no trouble optimizing the COLORREF conversion operator to
a simple 32-bit load with mask (!) even though it's a series of bit
shifts across multiple struct members.

MSVC (19.24) doesn't make the same optimization decision, and it emits
three 8-bit loads and some shifting.

In any case, the optimization only applies at -O2 (clang) and above.

In this commit, we leverage the spec-legality of using unions for type
conversions and the overlap of four uint8_ts and a uint32_t to make the
conversion very obvious to both compilers.

x86_64 msvc | O0 | O1 | O2
------------|----|----|--------------------
shifts      | 12 | 11 | 11 (fully inlined)
union       |  5 |  1 |  1 (fully inlined)

x86_64 clang | O0 | O1 | O2 + O3
-------------|----|----|--------------------
shifts       | 14 |  5 |  1 (fully inlined)
union        |  9 |  3 |  1 (fully inlined)

j4james brought up some concerns about til::color's minor wastefulness
in #7578 (comment).

This is a clear, simple transformation that saves us a few instructions
in a relatively common case, so I'm accepting a micro-optimization even
though we don't have data showing this to be a hot spot.
  • Loading branch information
DHowett authored Sep 14, 2020
1 parent 88d1527 commit c17f448
Show file tree
Hide file tree
Showing 2 changed files with 27 additions and 7 deletions.
1 change: 1 addition & 0 deletions .github/actions/spell-check/expect/expect.txt
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ ABANDONFONT
ABCDEFGHIJKLMNO
ABCG
abf
abgr
abi
ACCESSTOKEN
acec
Expand Down
33 changes: 26 additions & 7 deletions src/inc/til/color.h
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,27 @@ namespace til // Terminal Implementation Library. Also: "Today I Learned"
#pragma warning(disable : 26472)
struct color
{
uint8_t r, g, b, a;
// Clang (10) has no trouble optimizing the COLORREF conversion operator, below, to a
// simple 32-bit load with mask (!) even though it's a series of bit shifts across
// multiple struct members.
// CL (19.24) doesn't make the same optimization decision, and it emits three 8-bit loads
// and some shifting.
// In any case, the optimization only applies at -O2 (clang) and above.
// Here, we leverage the spec-legality of using unions for type conversions and the
// overlap of four uint8_ts and a uint32_t to make the conversion very obvious to
// both compilers.
union
{
struct
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ // Clang, GCC
uint8_t a, b, g, r;
#else
uint8_t r, g, b, a;
#endif
};
uint32_t abgr;
};

constexpr color() noexcept :
r{ 0 },
Expand Down Expand Up @@ -40,16 +60,13 @@ namespace til // Terminal Implementation Library. Also: "Today I Learned"

#ifdef _WINDEF_
constexpr color(COLORREF c) :
r{ static_cast<uint8_t>(c & 0xFF) },
g{ static_cast<uint8_t>((c & 0xFF00) >> 8) },
b{ static_cast<uint8_t>((c & 0xFF0000) >> 16) },
a{ 255 }
abgr{ static_cast<uint32_t>(c | 0xFF000000u) }
{
}

operator COLORREF() const noexcept
{
return static_cast<COLORREF>(r) | (static_cast<COLORREF>(g) << 8) | (static_cast<COLORREF>(b) << 16);
return static_cast<COLORREF>(abgr & 0x00FFFFFFu);
}
#endif

Expand Down Expand Up @@ -137,7 +154,7 @@ namespace til // Terminal Implementation Library. Also: "Today I Learned"

constexpr bool operator==(const til::color& other) const
{
return r == other.r && g == other.g && b == other.b && a == other.a;
return abgr == other.abgr;
}

constexpr bool operator!=(const til::color& other) const
Expand Down Expand Up @@ -171,6 +188,8 @@ namespace til // Terminal Implementation Library. Also: "Today I Learned"
#pragma warning(pop)
}

static_assert(sizeof(til::color) == sizeof(uint32_t));

#ifdef __WEX_COMMON_H__
namespace WEX::TestExecution
{
Expand Down

0 comments on commit c17f448

Please sign in to comment.