Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: move log/exp variants to a separate package #46

Closed
tpapp opened this issue Jun 18, 2018 · 45 comments · Fixed by #108
Closed

RFC: move log/exp variants to a separate package #46

tpapp opened this issue Jun 18, 2018 · 45 comments · Fixed by #108

Comments

@tpapp
Copy link
Contributor

tpapp commented Jun 18, 2018

I am thinking of the functions in src/basicfuns.jl.

In contrast to distribution functions, these are all implemented in pure Julia, require no external dependencies, and thus have no license issues. In fact, while they are used in statistics, they are not even "statistical" functions since other fields make ample use of them. Cf #20; these functions are useful for the end user directly, without requiring an interface like Distributions.jl.

I propose LogExpFunctions.jl as a package name. If the discussion suggests this makes sense, I would be happy to do a PR.

@andreasnoack
Copy link
Member

We have previously discussed this and I think these could and should just be moved to StatsBase since StatsBase has no binary dependencies and is most likely already installed on all systems.

@nalimilan
Copy link
Member

These are not really related to statistics though, they could live in a separate stdlib module.

@andreasnoack
Copy link
Member

I think these are mainly used in the evaluation of functions used in statistics so until there is a demand outside statistics I think it is to avoid scattering functionality into more packages.

@ararslan
Copy link
Member

I agree with Andreas, I think StatsBase is a good home for them, or at least good enough.

@bdeonovic
Copy link

@tpapp I think this sounds good. Also if someone does move these over please check the https://github.com/JuliaStats/StatsFuns.jl/tree/promotions branch which has several updates to logpdfs/pdfs that I have been wondering if they would ever make it into master

@cossio
Copy link
Contributor

cossio commented Jun 7, 2019

As a user I would appreciate a stable home for these functions so that I can use them without worrying their location might change in the future. I like @tpapp suggestion because I think these functions pop up in many different areas. But on the other hand why not just leave them here in StatsFuns?

@tpapp
Copy link
Contributor Author

tpapp commented Sep 15, 2019

I think that a separate repository would make a lot of sense, as those who just want eg logaddexp would not need to depend on librmath. Also, these functions are used not only in statistics, but modeling, machine learning, etc.

I created https://github.com/tpapp/LogExpFunctions.jl and moved the log/exp functions there, with the following improvements:

  • docstring cleanup
  • set up generated docs
  • significantly improved coverage (now 100%, all branches of special functions are covered)

I am very happy to transfer this repository to JuliaStats, then after it is registered, make a PR removing this functionality from this package (after a deprecation period, during which one would reexport).

Comments are welcome.

@nalimilan
Copy link
Member

Fine with me. You'll probably want to port over #74.

@tpapp
Copy link
Contributor Author

tpapp commented Sep 16, 2019

Happy to do that.

How should I proceed about transferring that package to JuilaStats?

Should I register before, or after that?

@tpapp
Copy link
Contributor Author

tpapp commented Sep 17, 2019

I started registration: JuliaRegistries/General#3594

@tpapp
Copy link
Contributor Author

tpapp commented Sep 17, 2019

@nalimilan, if you have permissions can you please help with with making the transfer of the above package to JuliaStats?

@ararslan
Copy link
Member

If this is its own package, why does it need to live in JuliaStats?

@tpapp
Copy link
Contributor Author

tpapp commented Sep 17, 2019

Sorry, I don't fully understand the question here. My understanding was that JuliaStats organizes key packages for statistics in Julia. While I am happy to maintain LogExpFunctions, I would like others to have write access to the repo, and also I don't want to just copy the work of other people to create a new package without at least putting it in the same Github organization. Hope this makes sense.

@ararslan
Copy link
Member

Sorry, I missed the part where the other functions were moved to that package.

@ararslan
Copy link
Member

I understand the desire to decouple the functionality in StatsFuns from that provided by the binary dependency on rmath, but at the same time I think this is the most natural home for these functions. It seems unnecessary to have a separate repository just for them.

@xukai92
Copy link
Contributor

xukai92 commented Sep 17, 2019

Any possibility those pure Julia functions could be moved to Statistics?

@tpapp
Copy link
Contributor Author

tpapp commented Sep 18, 2019

@ararslan: I think that separate packages make a lot of sense, especially they allow decoupling native Julia code and wrappers for an external library, where the two sets of functions not closely related in the first place. Very importantly, they allow a more granular versioning, and also the cost is near-zero with Pkg3.

@xukai92: While these functions are used in statistics, they also have other applications (machine learning, modeling in natural and social sciences). Base.Math would also be a good home for some of them. But I am under the impression that the core devs are now quite reluctant to move things into Base or the standard libs unless there is a compelling reason. In any case, perhaps open an issue for that in the Julia repo if you want to propose it.

@xukai92
Copy link
Contributor

xukai92 commented Sep 18, 2019

Yeah I understand those functions are used outside statistics but so do functions like mean, right? I think the motivation to decouple the dependency is very nice, just personally feel a little bit inconvenient to load another package just for those functions though. But I might just be too lazy.

@tpapp
Copy link
Contributor Author

tpapp commented Sep 18, 2019

I am waiting for feedback from the maintainers here about what they would prefer: I can either

  1. make a PR deprecating all of these functions and direct users to LogExpFunctions,
  2. make a PR just reexporting LogExpFunctions,
  3. make no PR for this package.

@nalimilan
Copy link
Member

Maybe file an issue in Julia so that we can discuss what could be moved to Statistics or Base.Math. For example, xlogx could reasonably live in Base.Math. logistic and logit would also be nice additions to Statistics. Or maybe a package with a broader name (e.g. MathFuns) could be created, so that more things can be added to it later.

@cossio
Copy link
Contributor

cossio commented Sep 18, 2019

If the main reason for moving these functions away from this package into their own package is that StatsFuns has a binary dependence that the "logexp" functions don't need, one can also think that in the future it might make sense to get replace the binary dependence with pure Julia code, in which case this argument will not hold. I like more @ararslan idea of leaving these functions here for good.

@tpapp
Copy link
Contributor Author

tpapp commented Sep 18, 2019

it might make sense to get replace the binary dependence with pure Julia code

This has been proposed multiple times, but AFAIK never got off the ground. This would of course be ideal, but since this is a major undertaking that requires some expertise, I would not count on this happening in the medium run.

If someone is willing to undertake this, then we can discuss this as a realistic option, but otherwise I don't see this as a practical solution to the binary dependency problem. Conversely, if/when this happens, all numerical functions could be consolidated back into a single package.

Log/exp functions are a bit special since frequently one just requires an adjustment or a branch for large/small values, and then can use basic math and simple building blocks to code it natively (the only exception is _log1pmx_ker). Other special functions have more involved approximations which are more opaque to non-experts.

@cossio
Copy link
Contributor

cossio commented Mar 3, 2020

To avoid code duplication, can StatsFuns re-export the functions from https://github.com/tpapp/LogExpFunctions.jl? This is not a breaking change.

@tpapp
Copy link
Contributor Author

tpapp commented Mar 3, 2020

Just to add to this: I am happy to add any of the StatsFuns maintainers to that repo (just ask in an issue). The intention was, and still is, to avoid binary dependencies for these functions.

@cossio
Copy link
Contributor

cossio commented Mar 3, 2020

To avoid code duplication, can StatsFuns re-export the functions from https://github.com/tpapp/LogExpFunctions.jl? This is not a breaking change.

I can make the PR for this, if folks agree to merge.

@devmotion
Copy link
Member

I think it would be helpful to separate the log/exp functions from StatsFuns, also to address FluxML/NNlib.jl#252 and to be able to share logsumexp, softmax, logistic, softplus, etc. between StatsFuns and NNlib without having to depend on Rmath. It seems StatsFuns and LogExpFunctions diverged a bit over time (e.g., StatsFuns now uses a one-pass algorithm for logsumexp that fixed many open issues #97), so I am wondering what would be the best way forward here. Would it be easier to just extract the current state of the log/exp implementations to a separate package, maybe even preserving the git history, or should one try to update LogExpFunctions?

@nalimilan
Copy link
Member

@tpapp knows better, but maybe starting again from the state of StatsFuns is simpler. Then I would suggest having StatsFuns reexport these functions to avoid any breakage.

But in parallel I think someone should file an issue against Julia to discuss the possibility of moving all or some of these to Base or Statistics.

@tpapp
Copy link
Contributor Author

tpapp commented Dec 27, 2020

I would be happy to update LogExpFunctions with the changes in StatsFuns. However, the main purpose of that package was to demo the possibility of moving these out of StatsFuns, so it makes most sense if functionality is moved there (and possibly reexported by StatsFuns if necessary).

This requires a decision from StatsFun maintainers. If they are OK with this plan, I will of course move the LogExpFunctions repo to JuliaStats and do the updates.

@cossio
Copy link
Contributor

cossio commented Dec 28, 2020

It would be nice to make a decision here, since this is a breaking change and the later the decision is made the more breakage it can cause. I think many packages depend on these functions.

@devmotion
Copy link
Member

As far as I can see, this would not be breaking if StatsFuns would reexport these functions, as suggested above.

@devmotion
Copy link
Member

Probably it would still be nice to make a decision to avoid that NNlib, StatsFuns, and LogExpFunctions continue to diverge even more.

@nalimilan
Copy link
Member

I'd be fine with moving these functions to LogExpFunctions (and reexport them). But as I said above:

But in parallel I think someone should file an issue against Julia to discuss the possibility of moving all or some of these to Base or Statistics.

@devmotion
Copy link
Member

I don't think the implementation of, in particular, logsumexp and softmax is stable enough yet to be moved to Base or a stdlib. There were some major changes in the logsumexp implementation quite recently and there is still an open PR with a weighted logsumexp implementation, and the softmax function is missing some functionality (e.g., dims keyword arguments) that is implemented in NNlib. Also probably the implementation of some other of these basic functions should be synchronized with NNlib first to ensure that they work on GPUs and do not pose any problems for AD, if possible.

@nalimilan
Copy link
Member

Changing implementations in Base isn't a problem AFAIK. We recently radically changed the implementation of mean to avoid overflow in Statistics. Adding functionality progressively is also OK. Anyway, I'm merely suggesting that we discuss the principle of moving these to the stdlib, this doesn't have to happen immediately.

@cossio
Copy link
Contributor

cossio commented Jan 2, 2021

But for now let's just put them in their own package, which can be a stable home for them (even if later on they are also ported to the stdlib).

@cossio
Copy link
Contributor

cossio commented Jan 28, 2021

Can we consider registering LogExpFunctions?
JuliaStats/LogExpFunctions.jl#5

The name LogExpFunctions is pretty explicit, so I don't think it will lead to issues later on if a decision is finally made regarding where to port / re-export these functions.

@devmotion
Copy link
Member

Wouldn't it make sense to update LogExpFunctions (or create a new package with the current version of these functions in StatsFuns) first, as discussed above?

@tpapp
Copy link
Contributor Author

tpapp commented Jan 28, 2021

I am happy to register, but currently have no bandwidth for the updates. PRs welcome.

@cossio
Copy link
Contributor

cossio commented Jan 28, 2021

Wouldn't it make sense to update LogExpFunctions (or create a new package with the current version of these functions in StatsFuns) first, as discussed above?

If this update doesn't imply an API change, I vote to register LogExpFunctions now, and update later.

Even if there is an API change, we can just respect SemVer. Let's register?

@devmotion
Copy link
Member

At least you would want to configure CI, CompatHelper, TagBot etc. properly, no? In general, I don't see why it would be useful to rush this - who would want to use a package whose implementations are outdated and strictly inferior to the versions in StatsFuns? 😄

@tpapp
Copy link
Contributor Author

tpapp commented Jan 28, 2021

I am happy to fix CI etc if someone is willing to port #74.

Generally, I see no problem with this functionality being in its own lightweight package. I don't agree that it is used only in statistics, these functions are used in other areas.

OTOH, I also want to be a good citizen here and not just duplicate functionality from other packages without some consensus about the big picture. Again, LogExpFunctions was (is?) just a demonstration of what that lightweight package could look like.

@devmotion
Copy link
Member

#97 is a more recent PR that completely changed the implementation of logsumexp and fixed a bunch of open issues (and also #100 which someone should close...).

@devmotion
Copy link
Member

I'll put together a PR with #97 later today.

@devmotion
Copy link
Member

JuliaStats/LogExpFunctions.jl#6

@tpapp
Copy link
Contributor Author

tpapp commented Jan 29, 2021

Thanks to @devmotion, LogExpFunctions.jl is now up to date.

I initiated registration, and the transfer of the repo to JuliaStats.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants