Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add IntelLLVM support #536

Merged
merged 34 commits into from
Jul 3, 2024

Conversation

AlexanderRichert-NOAA
Copy link
Contributor

@AlexanderRichert-NOAA AlexanderRichert-NOAA commented Nov 10, 2023

This PR adds IntelLLVM (OneAPI) support, including a test workflow. No further changes to Intel flags, etc. were needed except that one single unit test (test_c_interface_2) was segfaulting, and was resolved by adding some explicit "int" conversions in the c2f interface.

Fixes #538

@AlexanderRichert-NOAA AlexanderRichert-NOAA force-pushed the ci_updates_nov2023 branch 2 times, most recently from dd0e3ea to 1f7bb78 Compare November 10, 2023 01:14
@edwardhartnett
Copy link
Contributor

Can you add an issue for this please, assigned to the next release? And add it to the release notes for the next release...

edwardhartnett
edwardhartnett previously approved these changes Nov 10, 2023
@AlexanderRichert-NOAA
Copy link
Contributor Author

AlexanderRichert-NOAA commented Nov 13, 2023

@jbathegit @jack-woollen I'm going to see if I can also take care of the MacOS/Python issue in this PR, but meanwhile-- In order to fix segfaults in one of the CI tests for Intel OneAPI (all the others were fine out of the gate), I had to modify src/bufr_c2f_interface.F90. Specifically, it for some reason required explicitly converting some of the c_int-type intent=in params to regular fortran int's for a number of the subroutine calls. I checked and there was no size difference between the two types (both gave SIZEOF()=4), that's as far as my investigation went though. I'm not sure if this reflects a compiler bug or an issue with the code, but in any case let me know if this looks sensible.

Edit: I'll do the python/MacOS issue separately, as it appears that will be a bit of a rabbit hole.

@jbathegit
Copy link
Collaborator

jbathegit commented Nov 22, 2023

@AlexanderRichert-NOAA I really appreciate you working on this, since we are going to need to be able to support and test the library under the newer Intel compilers in the very near future! #539 has now been merged as you requested, but now the Intel tests are failing.

On a related note, I don't really understand why those int() wrappers should be needed in bufr_c2f_interface.F90, especially since:

  1. You apparently didn't also need them to wrap similar c_int value arguments for calls to iupb (see iupb_f) or istdesc (see istdesc_f) within the same source file. Those particular functions aren't called by test_c_interface_2, but they are called from other C codes in the library (i.e. from utils/xbfmg.c and src/restd.c, respectively), so if those wrappers are really needed then why didn't those codes similarly segfault in test/test_scripts/test_xbfmg.sh, etc.?
  2. It also doesn't make sense to me why you'd also need an int() wrapper around the argument to ibfms (see ibfms_f), because that argument is supposed to be a c_double, so I would think that downgrading that argument to an int would completely muck up that interface right from the get-go.

Or am I missing something here?

@jbathegit
Copy link
Collaborator

Also, and since this PR introduces separate new CI tests for Intel / Intel (CC=icx FC=ifx) and Intel / Intel (CC=icc FC=ifort), then why do we still have the separate old CI test for Intel, and which apparently is still stuck waiting to run?

@AlexanderRichert-NOAA
Copy link
Contributor Author

I got the Intel tests working, but the MacOS one is sporadically failing so I'm working on that (it appears to be running some steps multiple times which is... very confusing).

I'll have to look into the int() stuff some more, maybe see if it's some kind of issue with the compiler. I may try to come up with a test case and run it by the Intel OneAPI developers. As for question 2, I may have used the wrong function there so I'll look at that in the process.

@AlexanderRichert-NOAA
Copy link
Contributor Author

The "Intel" test is showing up in the list there because it's a required test therefore it's "expecting" it to run even though it's been removed from the CI. Once this PR is good to go, we will just need to go into the branch protection settings, remove the old test, and require the new Intel tests instead.

@jbathegit
Copy link
Collaborator

Hey @AlexanderRichert-NOAA, I can clearly see how much effort you've been putting into trying to get this resolved. Please know that it's much appreciated, and please let me know if I can be of any further help!

@AlexanderRichert-NOAA AlexanderRichert-NOAA mentioned this pull request Nov 30, 2023
@AlexanderRichert-NOAA
Copy link
Contributor Author

@jbathegit the issue with the C bindings/data sizes is a bug in the OneAPI Fortran compiler. The Intel folks weren't able to give a timeframe on a fix, so I'm leaning toward closing this for now and circling back when there's a working version of the compiler. On the other hand, if you think it would be useful to have this support sooner, or for that matter if we needed to transition to OneAPI before the compiler is fixed, then I can always restructure things in order to have minimal impact on the code as far as the other compilers (i.e., use a preprocessor def to only apply a fix when compiling with Intel OneAPI).

No word yet from the f2py developers about that bug, so for now we just have to avoid Python 3.12.

@jbathegit
Copy link
Collaborator

jbathegit commented Dec 6, 2023

Thanks Alex, and yes I'm fine with closing this PR and then just re-opening a new one to address #538 once we have a working version of the compiler.

@AlexanderRichert-NOAA
Copy link
Contributor Author

Closing while waiting for OneAPI and f2py fixes

@AlexanderRichert-NOAA
Copy link
Contributor Author

@jbathegit @edwardhartnett now that OneAPI 2024.2 is out, this PR is good to go on my end. If/when it looks good, please squash-and-merge so we don't put my 10 million debugging commits into develop :)

@jbathegit
Copy link
Collaborator

This looks fine @AlexanderRichert-NOAA, and thanks again for working on this!

Per the earlier discussion (see Nov 22, 2023 above) I was able to go in and remove the old "Intel" test from the develop branch protection settings, but for whatever reason I don't see how to now add "Intel / Intel (CC=icx FC=ifx)" and "Intel / Intel (CC=icc FC=ifort)" as new required tests. I tried doing a search on "Intel" in the status checks search bar to try and add them in that way, but for whatever reason it keeps popping me back out to the branch protection rules main page when I try to do that(?)

Either way, I can't merge this PR until this is resolved, because otherwise the merging is blocked. If I'm missing something here please let me know - thanks!

@jbathegit jbathegit merged commit f48d8c7 into NOAA-EMC:develop Jul 3, 2024
10 checks passed
@jbathegit
Copy link
Collaborator

OK once the system refreshed (after I removed "Intel" as a required status check), I could now do the squash+merge for this PR. However, I think we should still have the two new Intel checks as "Required" rather than just b/c we happen to have an Intel.yml in the repository under .github/workflows. But I still can't figure out how to do that - here's what I see on that page:

Capture

So how do I now add the two new "Intel / Intel (CC=icx FC=ifx)" and "Intel / Intel (CC=icc FC=ifort)" as "Required" checks?

@AlexanderRichert-NOAA
Copy link
Contributor Author

You should be able to search for them in the search box there, click on the test(s) you want to require, and hit save at the bottom of the page. If the tests aren't showing up let me know and I can take a crack at it.

@jbathegit
Copy link
Collaborator

Hmm, OK that's exactly what I was trying to do before (i.e. use the search box for "Intel"), but whenever I did that it kept popping me back out to the main page. But now it's working, so maybe I just had to wait for whatever system refresh also removed the earlier merge block(?)

Anyhoo, all good now - and thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add IntelLLVM (OneAPI) support
3 participants