Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Mac M* CPUs #58

Merged
merged 3 commits into from
Nov 13, 2023
Merged

Add support for Mac M* CPUs #58

merged 3 commits into from
Nov 13, 2023

Conversation

timzag
Copy link
Contributor

@timzag timzag commented Nov 8, 2023

I hope this is a reasonable fix, my C++ is not strong. If there's something wrong with this solution I'm happy to try another approach.

Newer Mac hardware does not use x86 processors, and as far as I can tell from the error message attached to #57, <x86intrin.h> isn't functional. https://github.com/DLTcollab/sse2neon seems to contain the necessary SSE intrinsics for Arm/Aarch64. It uses the MIT license.

This PR adds the sse2neon.h header and only includes it when the current architecture is aarch64. Tested the fix on a Mac M2 (aarch64) and an AWS m5.xlarge EC2 instance (x86/64).

Also fixed the attribution on a previous fix in the README, I mistakenly did that on the wrong account.

This PR resolves #57.

Copy link
Owner

@matsui528 matsui528 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As sse2neon.h is a third-party file, can you move it from src/sse2neon.h to src/extern/sse2neon.h ?

@matsui528
Copy link
Owner

This is an excellent PR. Thank you! Although I haven't tested it because I don't have an ARM computer now, I'll merge this anyway after you resolve the above comment.

I've heard that ARM runners will be available next year. After that, it would be great if you could write a CI to test the code on ARM runners.

@timzag
Copy link
Contributor Author

timzag commented Nov 9, 2023

Moved the sse2neon.h file to src/extern/

I should be able to do the corresponding CI work once GitHub makes that available. In the meantime, if there's any manual tests other than make test you think would be appropriate let me know.

@matsui528 matsui528 merged commit c40c70d into matsui528:main Nov 13, 2023
@matsui528
Copy link
Owner

Thanks again! It's merged. But I faced another problem uploading the files to pypi... And I cannot update the version of the library. #62

If you know the solution, please let me know. If not, I'll solve someday..

(Note that this is not ARM-related problem)

@matsui528
Copy link
Owner

@timzag Hi! I just updated the pypi version to v0.2.11, which includes this PR. Can you please run pip install rii and test it on your ARM environment?

@timzag
Copy link
Contributor Author

timzag commented Dec 15, 2023

@matsui528 Sorry about the delay, the installation works fine in my environment (M2 Max, Ventura 13.5.2, Python 3.11) and tests/test_rii.pypasses as well. Looks like this is good to go!

@matsui528
Copy link
Owner

Perfect!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Compilation fails on Mac M2 CPU
2 participants