Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add callbacks #113

Merged
merged 13 commits into from
Dec 20, 2023
Merged

Add callbacks #113

merged 13 commits into from
Dec 20, 2023

Conversation

nbelakovski
Copy link
Contributor

I'm re-opening #69, but since the original branch was closed I made a new branch on my own repo and copied the changes there. All credit of course goes to @jschueller as this is his work and not mine, I am merely shepherding it!

This is necessary for the upcoming work to integrate with SciPy, since the implementation of COBYLA that they have does provide a callback so obviously we don't want any regression in functionality upon switching to the PRIMA implementation of COBYLA.

Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check-spelling found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.

@nbelakovski
Copy link
Contributor Author

Also, @jschueller you're more than welcome to either make suggestions here or if you prefer to take over the work and make your own PR in favor of this one, that's fine too.

@jschueller
Copy link
Collaborator

jschueller commented Nov 27, 2023

great, I saw you rebased

while this PR was great to report progress,

my concern after that is that there is still no way to handle errors in the objective function callback itself (prima_obj/objcon) in order to handle a python exception for example, so I wondered if the objective function callback must return a "terminate" boolean too

what do you think ?

@nbelakovski
Copy link
Contributor Author

Welcome back @jschueller !

It sounds like you're thinking of the following scenario (using Python as an example but it could apply to other languages as well):

def my_cost_function(x):
  try:
    ...some computation...
  catch err:
     ... what do we do here?...

And further it sounds like you're saying that if we add a terminate output to the objective/cost function, then in the catch block we could set it, and then once we're back in the Fortran code we can gracefully terminate the routine instead of crashing.

Seems like a reasonable addition, I wonder what @zaikunzhang thinks?

@nbelakovski
Copy link
Contributor Author

Pushed updates now that the algo enum branch is merged.

c/bobyqa_c.f90 Fixed Show fixed Hide fixed
c/bobyqa_c.f90 Fixed Show fixed Hide fixed
c/bobyqa_c.f90 Fixed Show fixed Hide fixed
c/bobyqa_c.f90 Fixed Show fixed Hide fixed
c/cobyla_c.f90 Fixed Show fixed Hide fixed
@nbelakovski nbelakovski force-pushed the add_callback branch 2 times, most recently from 12d7779 to 50d9d29 Compare December 4, 2023 16:23
@nbelakovski
Copy link
Contributor Author

-"Fixed" the spelling issues
-Addressed @zaikunzhang 's comments from the previous PR that the callback should report the "current best" values as opposed to simply the values at the current iteration (if the user is very interested in those they can access them via the objective function)
-Added the callback at the end of the initialization
-Added a no-op callback so that we could properly pass the callback to the initialization. This flows better with the result of the optional arguments in the [algorith]a.f90 files

@nbelakovski
Copy link
Contributor Author

I spent most of today investing some of the build failures from CI. It's a long story, but the short version is that in my experiments with flang (the AMD optimizing C/C++ compiler, AOCC) I discovered that the only way to support this compiler is to make some rather ugly code changes. First of all, bobyqa.f90 and others cannot have the following code:

if (present(callback_fcn)) then
    callback_fcn_loc => callback_fcn
else
    callback_fcn_loc= => NO_OP_CALLBACK
end if

The problem is that while we can write call callback_fcn(...), we cannot copy the procedure pointer to another procedure pointer. Something gets lost in the copy and so call callback_fcn_loc(...) results in a segfault.

There is a workaround for this but it's not very pretty:

if (present(callback_fcn)) then
    call bobyqb(...., callback_fcn, ...)
else
    call bobyqb(..., NO_OP_CALLBACK, ....)
end if

The other problem is that we cannot call it from both bobyqa/initialize.f90 and then also from bobyqa/bobyqb.f90. I'm not sure why, something about too many levels. That one is solvable by simply not passing it to bobyqa/initialize.f90.

I don't know yet if these things will solve the issues with the other compilers as well (nvfortran and aflang are also having issues).

@zaikunzhang how would you feel about deciding that only gfortran and ifort will support interop with C? We can modify the CMake files so that they do not try to compile and run the C examples with flang/nvfortran/aflang, only with gfortran and ifort, and then we can keep the code a little bit cleaner. Does this make sense to you and does it sound sensible, or do you have thoughts on the topic you'd like to share?

@nbelakovski
Copy link
Contributor Author

my concern after that is that there is still no way to handle errors in the objective function callback itself (prima_obj/objcon) in order to handle a python exception for example, so I wondered if the objective function callback must return a "terminate" boolean too

what do you think ?

Discussed this with @zaikunzhang offline and we think that it doesn't make much sense for the objective function to offer a terminate functionality. If it’s truly necessary, the terminate functionality provided by the callback can be used. The objective function can be called within the callback, and the value of terminate can be set appropriately.

I’ve pushed an update. Via discussion with @zaikunzhang we decided to go with the proposed code above that works with the other compilers. I've also modified the manner in which @jschueller implemented both the callback and the call to the C objective function.

The previous setup looked like this (I use bobyqa as an example but this applied equally to all algorithms):

  1. bobyqc_c.f90 captures the pointer to the callback function in a closure.
  2. The closure is passed to bobyqa.f90 and then to bobyqb.f90
  3. bobyqb call the closure, which then checks if the pointer is valid, and if so it calls evalcallback and passes the pointer and the other arguments
  4. evalcallback then casts the pointer to a callable fortran procedure, casts the arguments to their C equivalents, and calls the function.

The new setup is as follows:

  1. bobyqa_c.f90 check if the callback function pointer is valid, and if so casts it to a callable fortran procedure. This procedure is captured in a closure.
  2. The closure is passed to bobyqa.f90 and then to bobyqb.f90 (same as before)
  3. bobyqb call the closure, which casts the arguments to their C equivalents and calls the functions.

So two things changed:

  1. The casting of the function pointer is done during the initial setup, as opposed to every time the function is called.
  2. evalcallback and the closure were combined, so there's fewer functions to follow (this was essentially the main motivation, as the code was difficult to follow as it bounced around between many functions)

This was then also applied to evalcobj.

It looks like there are still some failing builds, but the errors appear to be happening on main as well (at least the ones that I tested), so they don't appear to be related to the code in this PR.

@jschueller
Copy link
Collaborator

so how would you handle a function that cannot be evaluated (an exception for example) on a new point proposed by the algorithm ?

@nbelakovski
Copy link
Contributor Author

so how would you handle a function that cannot be evaluated (an exception for example) on a new point proposed by the algorithm ?

@zaikunzhang can speak in more detail on this, but from our conversations I understand that a common tactic in these situations is to return NaN. COBYLA and other algorithms are set up in such a way as to be able to recover from NaN by essentially going back and trying different points.

@jschueller
Copy link
Collaborator

ok, fine then

c/CMakeLists.txt Fixed Show fixed Hide fixed
c/tests/CMakeLists.txt Fixed Show fixed Hide fixed
fortran/CMakeLists.txt Fixed Show fixed Hide fixed
@nbelakovski nbelakovski force-pushed the add_callback branch 10 times, most recently from b657be1 to 460d0ba Compare December 14, 2023 07:53
@nbelakovski nbelakovski force-pushed the add_callback branch 5 times, most recently from 353f314 to 3552860 Compare December 19, 2023 14:56
c/CMakeLists.txt Fixed Show fixed Hide fixed
c/bobyqa_c.f90 Fixed Show fixed Hide fixed
c/cobyla_c.f90 Fixed Show fixed Hide fixed
c/lincoa_c.f90 Fixed Show fixed Hide fixed
c/newuoa_c.f90 Fixed Show fixed Hide fixed
c/uobyqa_c.f90 Fixed Show fixed Hide fixed
@nbelakovski nbelakovski force-pushed the add_callback branch 3 times, most recently from 53dddd3 to ce98064 Compare December 19, 2023 15:22
…segfault

Also disabled parallelism in cmake.yml so that the build log is more readable.
The extra time it takes is not a concern.
THe flang-derived compiler were beginning to have problem with the
empty array used for nlconstr for the non-COBYLA algorithms. By making
them optional arguments, we are able to handle them in a different way
which satisfies all the compilers.
It struggles with capturing a procedure pointer inside a closure,
although it appears capable of capturing a C function pointer. As a
result we move the C_F_PROCPOINTER call back inside the closure.
Now that we've moved the procedure pointer into the closure so that the new
intel compiler, ifx, works, the old intel compiler, ifort, has started to
have issues with that data_ptr that gets passed around.

The issue was observed while running data_c_exe. It passes a data pointer
and in the callback it checks to make sure that it gets back the same data
pointer. It started complaining.

Removing the value attribute seems to have fixed it, and this seems to work
for other compilers as well.
…cosmetic fixes

After reviewing the Fortran 2008, 2018, and 2023 standards, it looks like value is meant to
be an attribute for type(C_PTR) in the way that we are using it, so it is probably best
to leave it in and accept the fact that the classic Intel fortran compiler does not work in
this case. It's worth noting that the only case where there is an issue is when sending custom
data to the objective function, which is not a common use case, and hopefully those who need
that functionality are able to use a compiler other than the classic Intel one.

Minimal example: https://fortran-lang.discourse.group/t/problem-with-ifort-compiler-and-type-c-ptr/7011

Other various minor changes are made based on feedback from an in-person review session
with @zaikunzhang
c/bobyqa_c.f90 Fixed Show fixed Hide fixed
c/cobyla_c.f90 Fixed Show fixed Hide fixed
c/lincoa_c.f90 Fixed Show fixed Hide fixed
c/newuoa_c.f90 Fixed Show fixed Hide fixed
c/uobyqa_c.f90 Fixed Show fixed Hide fixed
…, etc. represent

-fixed bug in how cobyla reported nonlinear constraints
-removed 'early' from the termination string
-fixed comments in cobyla code related to the size of constraint arrays
@zaikunzhang zaikunzhang merged commit 8ce9aae into libprima:main Dec 20, 2023
34 checks passed
@nbelakovski nbelakovski deleted the add_callback branch December 20, 2023 11:58
@zaikunzhang
Copy link
Member

This is a milestone! Thank everyone!

@nbelakovski
Copy link
Contributor Author

Woo hoo!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants