Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spurious illegal instructioin on x86_64-unkown-linux-musl #955

Closed
1 task done
NobodyXu opened this issue Aug 1, 2023 · 16 comments
Closed
1 task done

Spurious illegal instructioin on x86_64-unkown-linux-musl #955

NobodyXu opened this issue Aug 1, 2023 · 16 comments

Comments

@NobodyXu
Copy link
Contributor

NobodyXu commented Aug 1, 2023

Duplicates

  • I have searched the existing issues

Current behavior 😯

cargo-binstall, which uses gix v0.50.1, sometimes get killed by illegal instructions on this CI and also this one:

+ ./cargo-binstall binstall --force --git file:///tmp/tmp.iyk2axDWcq --no-confirm cargo-binstall
 INFO resolve: Resolving package: 'cargo-binstall'
 INFO Cloning::receiving pack: Enumerating objects → 3.0 objects
git.sh: line 39: 24033 Illegal instruction     (core dumped) "./$1" binstall --force --git "file://$GIT" --no-confirm cargo-binstall

I suspect this has something to do with max-performance, it could be that the use of assemblies in sha1-asm or libz-ng caused this problem.

It always happens when cloning from a local git repository using file:///, so I suppose it also has something to do with it.

Expected behavior 🤔

No response

Steps to reproduce 🕹

No response

@NobodyXu
Copy link
Contributor Author

NobodyXu commented Aug 1, 2023

Oops it just happens again

@NobodyXu
Copy link
Contributor Author

NobodyXu commented Aug 1, 2023

It also happens on x86_64-unknown-linux-gnu.

@Byron
Copy link
Member

Byron commented Aug 1, 2023

That's interesting as it doesn't reproduce (or is nothing I encountered) in this CI yet. Thus another possible cause may be be miscompilation, in addition to assembly as potential cause.

If the issue is miscompilation, then adding rustflags = ["-Ctarget-cpu=native"] (or removing it) might affect codegen and maybe resolves the issue that way.

If assembly is the issue, you could try to disable one or the other performance option, using either sha1-asm only or only libz-ng (or another zlib implementation for that matter). To do that, you would go with max-performance-safe, add a dependency to gix-features and add --features = ["fast-sha1"] or --features = ["zlib-ng"] (or another backend) to see issue is fixed that way.

Please note that I think for performance, a great zlib implementation is more important than a great hash implementation, even though I didn't yet run tests to validate my hunch (as a disclaimer).

Thanks for letting me know how it goes.

PS: Maybe another possible issue is some strange interaction with the VM this runs on. After all, illegal instructions should always be run in tight algorithms like SHA1 or zlib-ng, so it should consistently crash. But it doesn't, so might be related to something else entirely. In any case, changing the instructions in the binary might help to fix it merely by not using the maybe special instructions that come with assembly or certain super-optimized C code in zlib-ng.

@NobodyXu
Copy link
Contributor Author

NobodyXu commented Aug 1, 2023

Thanks for the swift response!

I will disable sha1-asm and leaving only zlib-ng, since IMO zlib-ng is more battle-tested and it provides more speedup than sha1-asm.

@NobodyXu
Copy link
Contributor Author

NobodyXu commented Aug 1, 2023

@Byron I tried disabling sha1-asm but the failure persists

@Byron
Copy link
Member

Byron commented Aug 1, 2023

That's great, as one possibility has been removed. The next to try would be zlib-ng then (while leaving sha1-asm disabled)? Maybe change it to the normal backend to retain some performance. If that still doesn't work, try with max-performance-safe alone, that is pure Rust then.

@NobodyXu
Copy link
Contributor Author

NobodyXu commented Aug 1, 2023

Yeah, after I disabled zlib-ng for x86_64-unknown-linux-{gnu, musl}, it no longer happens.

I guess we should report this to upstream.

@NobodyXu
Copy link
Contributor Author

NobodyXu commented Aug 1, 2023

After zlib-ng and sha1-asm is disabled, the time for cloning crates.io git-index increases a lot but is still reasonable since we no longer checkout the repository now.

@Byron
Copy link
Member

Byron commented Aug 1, 2023

Great to hear we pin-pointed it! I recommend turning sha1-asm back on to get a little performance back. If the issue reoccurs than we'd also learn something, i.e. it would then be some combination of libraries or some strangeness with codegen maybe.

For now I am tentatively closing this issue as there is nothing we can do here, and follow the related issue instead.
If something changes, this issue should be reopened though.

@NobodyXu
Copy link
Contributor Author

NobodyXu commented Aug 2, 2023

Great to hear we pin-pointed it! I recommend turning sha1-asm back on to get a little performance back.

I think the current performance is already good enough, in fact I believe that as long as we don't do a checkout, it will be ok for binstall to even switch back to -Oz.

If the performance becomes a problem again, binstall can employ the same caching schema used in .cargo/registry/index, although I'm not sure how cargo determines the name of the directory and prevent concurrent access to it.

@NobodyXu NobodyXu closed this as not planned Won't fix, can't repro, duplicate, stale Aug 2, 2023
@Byron
Copy link
Member

Byron commented Aug 2, 2023

If the performance becomes a problem again, binstall can employ the same caching schema used in .cargo/registry/index, although I'm not sure how cargo determines the name of the directory and prevent concurrent access to it.

You might be interested in tame-index or crates-index - they make it possible to use the local cargo caches, even though I think tame-index might be better for that as it's better organized.

@NobodyXu
Copy link
Contributor Author

NobodyXu commented Aug 2, 2023

You might be interested in tame-index or crates-index - they make it possible to use the local cargo caches, even though I think tame-index might be better for that as it's better organized.

Thanks!

Although I am still a bit hesitant on pulling them, since binstall used to depend on it but was explicitly removed because its dependence on curl and openssl making the final binary much harder to redistribute.

If binstall needs this, I might port this code from crates-index to our crate, although I'm not sure how to use gix to update an existing repository shallowly.

@Byron
Copy link
Member

Byron commented Aug 2, 2023

Although I am still a bit hesitant on pulling them, since binstall used to depend on it but was explicitly removed because its dependence on curl and openssl making the final binary much harder to redistribute.

crates-index and tame-index both use gix, so it's possible to configure it without. tame-index in particular makes it easy to avoid dependencies, I think, and use the cache directly or write it. That would be my first choice, and I am saying that as the (albeit freshly made) maintainer of crates-index.

[..] although I'm not sure how to use gix to update an existing repository shallowly.

gix does the right thing by default and no configuration is needed to maintain shallowness. Otherwise it can be controlled here.

@NobodyXu
Copy link
Contributor Author

NobodyXu commented Aug 2, 2023

crates-index and tame-index both use gix, so it's possible to configure it without. tame-index in particular makes it easy to avoid dependencies, I think, and use the cache directly or write it. That would be my first choice, and I am saying that as the (albeit freshly made) maintainer of crates-index.

Thanks, I would take note of this in case we need this feature in binstall.

gix does the right thing by default and no configuration is needed to maintain shallowness. Otherwise it can be controlled here.

I actually don't know how to update a cloned repository in gix, although based on your reply it seems like I just need to provide it with the existing paths and it would work.

@Byron
Copy link
Member

Byron commented Aug 2, 2023

I actually don't know how to update a cloned repository in gix, although based on your reply it seems like I just need to provide it with the existing paths and it would work.

Fetching works by using a remote and is quite straightforward.

@NobodyXu
Copy link
Contributor Author

NobodyXu commented Aug 2, 2023

Fetching works by using a remote and is quite straightforward.

Thanks!

I will also take note of this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants