-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Mac M* CPUs #58
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As sse2neon.h
is a third-party file, can you move it from src/sse2neon.h
to src/extern/sse2neon.h
?
This is an excellent PR. Thank you! Although I haven't tested it because I don't have an ARM computer now, I'll merge this anyway after you resolve the above comment. I've heard that ARM runners will be available next year. After that, it would be great if you could write a CI to test the code on ARM runners. |
Moved the I should be able to do the corresponding CI work once GitHub makes that available. In the meantime, if there's any manual tests other than |
Thanks again! It's merged. But I faced another problem uploading the files to pypi... And I cannot update the version of the library. #62 If you know the solution, please let me know. If not, I'll solve someday.. (Note that this is not ARM-related problem) |
@timzag Hi! I just updated the pypi version to v0.2.11, which includes this PR. Can you please run |
@matsui528 Sorry about the delay, the installation works fine in my environment (M2 Max, Ventura 13.5.2, Python 3.11) and |
Perfect! |
I hope this is a reasonable fix, my C++ is not strong. If there's something wrong with this solution I'm happy to try another approach.
Newer Mac hardware does not use x86 processors, and as far as I can tell from the error message attached to #57,
<x86intrin.h>
isn't functional. https://github.com/DLTcollab/sse2neon seems to contain the necessary SSE intrinsics for Arm/Aarch64. It uses the MIT license.This PR adds the
sse2neon.h
header and only includes it when the current architecture is aarch64. Tested the fix on a Mac M2 (aarch64) and an AWS m5.xlarge EC2 instance (x86/64).Also fixed the attribution on a previous fix in the README, I mistakenly did that on the wrong account.
This PR resolves #57.