Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: Better support for pre-SSE4.2 CPUs #15589

Closed
ansarizafar opened this issue May 2, 2017 · 7 comments
Closed

build: Better support for pre-SSE4.2 CPUs #15589

ansarizafar opened this issue May 2, 2017 · 7 comments
Assignees
Milestone

Comments

@ansarizafar
Copy link

I am still getting same error which was mentioned in issue #14443
image

@tamird
Copy link
Contributor

tamird commented May 2, 2017

@ansarizafar sorry to hear you're still hitting this error! As mentioned in #14443, the root cause is that our release binaries include RocksDB compiled with SSE4.2, which your machine seems not to support.

I'm going to investigate making our default behaviour better on older hardware. In the meantime, building from source should produce a binary that will run on your machine. Could you try that?

@tamird tamird self-assigned this May 2, 2017
@tamird tamird added this to the 1.1 milestone May 2, 2017
@bdarnell bdarnell changed the title Error: exit status 2, Failed running "start" build: Better support for pre-SSE4.2 CPUs Jun 22, 2017
@petermattis
Copy link
Collaborator

I think there is something relatively straightforward that can be done here: very early in the lifecycle of the process we can detect if the CPU supports SSE4.2 and exit with a detailed error message if it does not. The code to detect whether the CPU supports SSE4.2 already exists in RocksDB util/crc32c.cc.

@benesch
Copy link
Contributor

benesch commented Jul 5, 2017

FWIW, if that RocksDB code worked as intended, we wouldn't have this problem, since we'd gracefully fall back to the slow, non-SSE4.2 CRC32 implementation. @tamird and I discussed at some point, and the conclusion was that compiling all of RocksDB with -msse4.2 causes the autovectorizer to kick in and use SSE4.2 instructions outside of the fast CRC32 implementation, and thus outside of the cpuid check. I believe we'd need to move the fast implementations to their own translation unit, and compile only that file with -msse4.2.

Would we prefer to run on these old CPUs, albeit with a slow checksum? If so, we could upstream the above fix, rather than failing fast.

@petermattis
Copy link
Collaborator

Would we prefer to run on these old CPUs, albeit with a slow checksum? If so, we could upstream the above fix, rather than failing fast.

Yes, running on old CPUs, albeit with a slow checksum would be preferable. I wonder if there is a performance hit to not compiling the rest of RocksDB with -msse4.2.

@bdarnell
Copy link
Contributor

bdarnell commented Jul 5, 2017

See also facebook/rocksdb#2488 for an upstream issue.

GCC has a -mcrc32 flag, which looks like it would enable the CRC32 functions for explicit use (guarded by the runtime check) without turning on auto-vectorization throughout the codebase.

The compiler is using SSE4.2 instructions elsewhere, so it must expect some benefit. Besides the CRC32 function, the rest of SSE4.2 appears to be optimized string-comparison instructions. It's unclear how much difference this makes (I also assume that Go is not doing this for the Go parts of our codebase). We could also try -msse3 to still get some auto-vectorization while setting the minimum bar lower.

@benesch
Copy link
Contributor

benesch commented Jul 8, 2017

Clang doesn't support -mcrc32, which is rather unfortunate. Not a dealbreaker, just something to be aware of.

@benesch
Copy link
Contributor

benesch commented Jul 8, 2017

What I think we want is to apply the target("sse4.2") function attribute to only the FastCRC function.

benesch added a commit to benesch/cockroach that referenced this issue Jul 10, 2017
Previously, RocksDB was compiled with `-msse4.2` across the board. The
compiler, unsurprisingly took the liberty of actually emitting SSE4.2
instructions, which could cause `SIGILL` crashes on pre-SSE4.2 CPUs
(i.e, CPUs released before ca. November 2008).

The only part of RocksDB that really benefits from SSE4.2 instructions,
however, is the CRC32C checksum code, which can use the hardware CRC32C
instructions provided by SSE4.2. This commit simply removes `-msse4.2`
from the RocksDB compile flags, and renables it for the FastCRC32
function only via a GCC attribute. Since FastCRC32 is already properly
guarded by a CPUID check, which falls back to a slower implementation at
runtime if the CPU does not support SSE4.2, this should get us nearly
all of the performance benefits of compiling with `-msse4.2` while still
supporting pre-SSE4.2 CPUs.

Fixes cockroachdb#15589.
@tamird tamird assigned benesch and unassigned tamird Jul 11, 2017
benesch added a commit to benesch/cockroach that referenced this issue Jul 11, 2017
Previously, RocksDB was compiled with `-msse4.2` across the board. The
compiler, unsurprisingly took the liberty of actually emitting SSE4.2
instructions, which could cause `SIGILL` crashes on pre-SSE4.2 CPUs
(i.e, CPUs released before ca. November 2008).

The only part of RocksDB that really benefits from SSE4.2 instructions,
however, is the CRC32C checksum code, which can use the hardware CRC32C
instructions provided by SSE4.2. This commit simply removes `-msse4.2`
from the RocksDB compile flags, and renables it for the FastCRC32
function only via a GCC attribute. Since FastCRC32 is already properly
guarded by a CPUID check, which falls back to a slower implementation at
runtime if the CPU does not support SSE4.2, this should get us nearly
all of the performance benefits of compiling with `-msse4.2` while still
supporting pre-SSE4.2 CPUs.

To ensure we don't lose the benefits of vectorization provided by SSE,
SSE2, and SSE3, we compile all of RocksDB with `-msse3`, which is
supported by most x86 CPUs released since 2004.

Fixes cockroachdb#15589.
rail added a commit to rail/cockroach that referenced this issue Jan 10, 2023
Previously,  the bincheck test started failing after upgrading the linux
build system to a newer GLIBC and after we changed the way we stop the
server.

The linux bincheck test was relying on running the cockroach binary in a
qemu-based VM, using a custom kernel.

Now that, we don't use rocksdb and don't have SSE issues (see cockroachdb#15589),
we can simplify the test and run the binary without the VM layer.

This PR also fixes the issue, where the busybox `kill` command can only
handle positional arguments.

We try our best to kill the server, but in some cases the PID changes
and the whole test fail. Ignore the kill exit code and let the build
agent to kill the process.

Epic: none
Fixes: RE-271
Release note: None
craig bot pushed a commit that referenced this issue Jan 11, 2023
94975: release: fix bincheck failures r=rail a=rail

Previously,  the bincheck test started failing after upgrading the linux build system to a newer GLIBC and after we changed the way we stop the server.

The linux bincheck test was relying on running the cockroach binary in a qemu-based VM, using a custom kernel.

Now that, we don't use rocksdb and don't have SSE issues (see #15589), we can simplify the test and run the binary without the VM layer.

This PR also fixes the issue, where the busybox `kill` command can only handle positional arguments.

We try our best to kill the server, but in some cases the PID changes and the whole test fail. Ignore the kill exit code and let the build agent to kill the process.

Epic: none
Fixes: RE-271
Release note: None

Co-authored-by: Rail Aliiev <rail@iqchoice.com>
rail added a commit to rail/cockroach that referenced this issue Jan 11, 2023
Previously,  the bincheck test started failing after upgrading the linux
build system to a newer GLIBC and after we changed the way we stop the
server.

The linux bincheck test was relying on running the cockroach binary in a
qemu-based VM, using a custom kernel.

Now that, we don't use rocksdb and don't have SSE issues (see cockroachdb#15589),
we can simplify the test and run the binary without the VM layer.

This PR also fixes the issue, where the busybox `kill` command can only
handle positional arguments.

We try our best to kill the server, but in some cases the PID changes
and the whole test fail. Ignore the kill exit code and let the build
agent to kill the process.

Epic: none
Fixes: RE-271
Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants