Pure function notation #414
Comments
We have had a fair amount of discussion in the past about this. It's good to have this issue to discuss so that it remains on the priority list. |
@si14: do you mean the ability to annotate a function as pure or for the compiler to determine that the function is pure? |
Oh, nevermind. I just read the issue subject and it's clear. I once proposed |
Yeah, it's good to be able to declare functions pure (optional, unchecked declaration). Even C has that now. |
I guess the interface could be -viral On 20-Feb-2012, at 11:53 PM, JeffBezanson wrote:
|
If we can't be sure the procedure is indeed pure, we wouldn't be able to optimize it without risking (potentially really bad) undefined behavior. As for possible optimizations, something like C++ AMP comes to mind. |
Pure functions would only be allowed to call other pure functions; calling an impure function would be an error (probably run-time, but maybe compile-time). That way you can be sure a function is pure. Each intrinsic would need to be marked as pure or impure; likewise, one would want to be able to indicate whether each ccall is pure or impure. Of course one could lie or be mistaken there, in which case all hell could break loose, but that's what you get for lying. |
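A rough sketch of what the run-time variant of this rule could look like, written in Python purely for illustration. The `pure` and `impure` decorators, `PurityError`, and the depth counter are all hypothetical names, not part of any proposal in this thread. It only catches callees that were explicitly marked impure, which is the thread's point about intrinsics and ccalls needing a declaration.

```python
import functools

class PurityError(RuntimeError):
    """Raised when an impure function is called from inside a pure one."""

# Number of pure-declared calls currently on the stack (hypothetical bookkeeping).
_pure_depth = 0

def pure(fn):
    """Declare fn pure: while it runs, calls to @impure functions are errors."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        global _pure_depth
        _pure_depth += 1
        try:
            return fn(*args, **kwargs)
        finally:
            _pure_depth -= 1
    return wrapper

def impure(fn):
    """Mark fn impure, the way each intrinsic or ccall would have to be marked."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if _pure_depth > 0:
            raise PurityError(f"impure function {fn.__name__!r} called from a pure context")
        return fn(*args, **kwargs)
    return wrapper

log = []

@impure
def record(x):
    log.append(x)  # side effect: mutates global state
    return x

@pure
def square(x):
    return x * x

@pure
def square_and_record(x):
    # Declared pure but calls an impure function; this is caught at run time.
    return record(square(x))
```

Calling `square_and_record(2)` raises `PurityError` before `record` has a chance to touch `log`, while `square(3)` and a top-level `record(5)` behave normally.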
@StefanKarpinski |
Here is an example of what I mean: #249 — this "!" is in fact an "effect". |
And here is an example of the "deforestation" optimization: http://arma.sourceforge.net/. This library uses C++ templates to implement such optimizations, and as you can see at http://arma.sourceforge.net/speed.html, it's fast. |
Armadillo is really, really interesting. We've talked a lot about doing things like this to avoid creation of temporaries when executing linear algebra code. The main limiting factor actually has been a collective aversion to doing things that are not dead obvious and dead simple — I'd much rather give the programmer the ability to write this easily in an explicit way than to have the compiler do it automatically in a clever way that might not be obvious to the programmer. At some point (a while ago), I was arguing for the |
@StefanKarpinski, the thing you are talking about is somewhat different from doing IO, for example. There are different aspects of "purity" and it would be better to have a uniform way to denote different aspects of function behaviour. Let me show you an example in the form of pseudocode.
Let's think about this |
Annotating things in that much detail is a non-starter. It's way too much book-keeping to foist on the programmer. The language has to be usable by people who don't know and don't care what different kinds of functional impurity are. On the other hand, if the compiler can compute and track all of these things for the programmer, then it's cool. |
@StefanKarpinski Of course the compiler can infer effects, that's the whole point :) Moreover, the compiler should do it even if the programmer explicitly annotates a function in some way, to check whether the programmer was mistaken. Look, the rules for inferring effects are kind of easy (though different for different effects): some will simply propagate through functions (like the "allocation" one: you can't build a function that uses an allocating function and doesn't allocate itself), some will be a little harder to infer (like "mutates its arguments": the "outer" function will be "mutative" only if it passes its arguments to a "mutative" function), but it's definitely useful. |
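To make the two propagation rules above concrete, here is a small Python sketch of bottom-up effect inference over a call graph. The data model (`Func`, `Call`, the effect names "allocates" and "mutates_args") is an assumption made up for this illustration. A real compiler would work on its own IR and would need per-argument tracking rather than a single flag.

```python
from dataclasses import dataclass, field

@dataclass
class Call:
    callee: str
    passes_own_args: bool  # does the caller forward one of its own arguments?

@dataclass
class Func:
    name: str
    calls: list = field(default_factory=list)
    base_effects: frozenset = frozenset()  # declared for intrinsics / foreign calls

def infer_effects(funcs):
    """Return {name: set of effects}, iterating to a fixpoint so recursion is handled."""
    effects = {f.name: set(f.base_effects) for f in funcs}
    changed = True
    while changed:
        changed = False
        for f in funcs:
            for call in f.calls:
                callee_effects = effects[call.callee]
                new = set()
                # "allocates" always propagates to the caller.
                if "allocates" in callee_effects:
                    new.add("allocates")
                # "mutates_args" propagates only if the caller forwards its own arguments.
                if "mutates_args" in callee_effects and call.passes_own_args:
                    new.add("mutates_args")
                if not new <= effects[f.name]:
                    effects[f.name] |= new
                    changed = True
    return effects

program = [
    Func("alloc_array", base_effects=frozenset({"allocates"})),
    Func("fill_in_place", base_effects=frozenset({"mutates_args"})),
    # Makes a fresh array and fills it: allocates, but never touches its own arguments.
    Func("zeros", calls=[Call("alloc_array", False), Call("fill_in_place", False)]),
    # Forwards its argument to a mutating function, so it is mutating too.
    Func("reset", calls=[Call("fill_in_place", True)]),
    # Calls nothing with effects, so it is inferred to have none.
    Func("add_one"),
]
inferred = infer_effects(program)
```

Under these assumptions `zeros` comes out as allocating only, `reset` as mutating only, and `add_one` with no effects at all, without the programmer annotating any of the three.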
I'll hop on IRC at some point, but can't today. I'm on vacation in Buenos Aires... |
What about 'semipure' functions? For instance, memoized functions:
It isn't thread-safe, but |
Maybe a silly comment (sorry if it is), but... Maybe @StefanKarpinski, I hope you enjoyed your vacation here? :) |
This discussion is getting out of control. It's way more important to have an unchecked The ability to annotate pure functions is incredibly important in numerical applications in order to implement constant-folding optimizations, because constant expressions like Hence (For the same reason, I suspect that this needs to be a keyword recognized by the parser, not a macro, so that the pure annotation can be passed through to the compiler.) |
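A toy Python sketch of the constant-folding argument, assuming an expression tree made of nested tuples and a hypothetical `PURE` table standing in for unchecked purity declarations. The folder trusts the table and verifies nothing, which is the "unchecked" part.

```python
import math

# Unchecked declarations: the folder trusts this table and never verifies it.
PURE = {"sqrt": math.sqrt, "sin": math.sin, "mul": lambda a, b: a * b}

def fold(expr):
    """Replace calls to pure-declared functions on constant arguments by their value."""
    if not isinstance(expr, tuple):
        return expr                  # a literal constant
    if expr[0] == "var":
        return expr                  # unknown until run time
    _, name, *args = expr            # ("call", name, arg, ...)
    args = [fold(a) for a in args]
    all_const = all(not isinstance(a, tuple) for a in args)
    if name in PURE and all_const:
        return PURE[name](*args)     # evaluated once, at "compile time"
    return ("call", name, *args)

# sqrt(4.0) * x: the sqrt call folds, the multiplication by a variable does not.
expr = ("call", "mul", ("call", "sqrt", 4.0), ("var", "x"))
folded = fold(expr)
```

A call to a function that is missing from `PURE` is left in place even when all its arguments are constants, so without some way to declare purity the folder cannot touch it.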
(By the way, it seems to me that @si14's original example is not pure, because pseudo-random-number generation both depends on and modifies global state, assuming |
@stevengj you are definitely right about purity of generate_random_matrix :) |
I think we should learn from Rust's experience and just close this issue: http://thread.gmane.org/gmane.comp.lang.rust.devel/3674/focus=3855 If this couldn't be made to work in Rust, which is statically compiled and relatively much more willing to make its users jump through hoops (their target: professional programmers who write large, complex systems software; our target: scientists), then there's no way this will ever fly in Julia. |
Note that Rust apparently made everything pure by default and required a keyword for non-pure functions, according to that post, which proved unworkable for them. That would definitely not fly for our audience, I agree. Nor, as I argued above, should we worry about compiler-enforced purity (any more than Julia enforces But there have been plenty of working examples of optional |
It would be a shame for |
Impure-by-default with a |
@lindahua, +1 on a "safe" mode. (I would prefer "safe" to "debug".) |
Yes, that's kind of my point. When writing any code that's generic, you can very easily find yourself in a situation where you have to split the exact same definition into pure and impure versions based on very subtle differences. Nobody is realistically going to do that. At that point, purity is a property that's better off automatically inferred rather than manually annotated. The only thing standing in the way of pretty decent bottom-up inference of purity, at least in situations where we've figured out exactly what method gets called, is annotation of pure vs. impure ccalls. |
That is a good point; |
Doesn't |
That's the beauty of ===. It takes care of most of these issues if you just
|
I think Jeff's rule is sort of right and sort of wrong: it's basically what the compiler should pretend the "pure" declaration means, with the understanding that some functions declared pure will not actually obey the rule. The point is to enable optimizations, so when the programmer declares a function to be pure, it should mean that the compiler is free to evaluate it as many times as it wishes whenever it wishes for a particular argument, and at any time return any one of those evaluations in whatever arbitrary way, and the programmer asserts that the program won't break because of it. It should also be method-specific, not generic-function-specific (this is an optimization, so the narrow claim is safer and sometimes the wider claim won't even be possible). It is entirely reasonable to declare sin(::Array{Float64}) pure. The result won't always be a new copy depending on CSE optimizations, but then again, that's not anything new with array expressions. In e.g. Python with NumPy arrays, if you do A[1:3], you won't get a copy either, and people accept that for performance reasons, using A[1:3].copy() when they really want a copy. That's not in principle any different from sin(A) not returning a new unshared array, and you can still use sin(A).copy() if desired. I'm not saying that the base sin() function should do this, just that it is not unreasonable to declare the implementation of such a function pure. If the base version does not do that, it is easy enough to implement a module with pure-declared wrappers of functions when such semantics are desired. If the compiler simply accepts the pure declaration at face value without any checking, it leaves the users the ability to implement functions with that kind of API. Even Haskell has the unsafePerformIO escape hatch for when you want something to be declared pure but not really technically be so. |
There's a difference between a function like |
True, but still there is a common use case where you'd want a pure-declared sin(::Array{Float64}) to enable CSE and compile-time evaluation, and it is entirely safe. This happens when you write entirely pure mathematical functions, where by "pure" I mean pure-as-in-Haskell. You don't modify anything, so you don't care whether something is a reference or a copy. You could have a module with pure-declared versions of functions like sin() designed for that use case. If the final result of such a computation is an array (instead of e.g. a scalar) and you want to ensure that it is a new one, then you can do a copy as a final step to hide any optimization effects. Basically, what I'm trying to say is that the simplest implementation of "pure" is probably the most useful one: the programmer asserts that the compiler can do its optimizations with no checking, and that's it, much like it is with |
Haskell is a completely off-base model here since you can't semantically modify anything. In Julia – or any language with semantically mutable arrays – deciding arbitrarily whether to return a copy or the same array is totally nuts. |
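The hazard can be shown in a few lines of Python, with lists standing in for mutable arrays. `elementwise_sin` is a hypothetical function, and the "after CSE" half is hand-written to mimic what an optimizer might do if it trusted a pure declaration. It is not the output of any real compiler.

```python
import math

def elementwise_sin(xs):
    """Returns a fresh list on each call and does not modify xs."""
    return [math.sin(x) for x in xs]

data = [0.0, 1.0]

# As written by the programmer: two calls, two independent results.
a = elementwise_sin(data)
b = elementwise_sin(data)
a[0] = 99.0
independent = b[0]  # still sin(0.0)

# After a hypothetical CSE pass that treats the call as pure and reuses one result.
shared = elementwise_sin(data)
c = shared
d = shared
c[0] = 99.0
aliased = d[0]  # the mutation leaked through the shared result
```

The two halves are the same source program before and after the optimization, yet `independent` and `aliased` end up with different values. That is why merging calls is only safe when nothing downstream mutates the result.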
With #13555, there is now an (unchecked, unexported) It would also be nice to have a |
There are a lot of optimization opportunities that come from being able to distinguish pure functions from impure ones. The most important one (IMO) is deforestation: when you have something like
sum(transpose(transpose(generate_random_matrix(M, N)) * SCALAR))
you can optimize away an entire matrix construction if you know that sum consumes a matrix, generate_random_matrix builds one, and all functions that were applied to the generated matrix are pure. So it would be handy to be able to say that a particular function is pure.
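A hand-written Python sketch of what deforestation would buy for that expression. All names are hypothetical, and a deterministic `generate_matrix` stands in for `generate_random_matrix` so the two versions can be compared. The fused version is what a compiler could emit in principle, not something any existing compiler is claimed to produce.

```python
def generate_matrix(m, n):
    """Deterministic stand-in for generate_random_matrix (hypothetical)."""
    return [[float(i * n + j) for j in range(n)] for i in range(m)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def scale(a, s):
    return [[x * s for x in row] for row in a]

def total(a):
    return sum(x for row in a for x in row)

def naive(m, n, s):
    # Builds four full matrices: the generated one, two transposes, one scaled copy.
    return total(transpose(scale(transpose(generate_matrix(m, n)), s)))

def fused(m, n, s):
    # No matrix is ever materialized. This is valid only because transposing and
    # scaling are pure and the sum does not depend on element order
    # (up to floating-point rounding).
    return sum(float(i * n + j) * s for i in range(m) for j in range(n))
```

Both functions compute the same total, but `fused` runs in constant extra memory, while `naive` allocates one m-by-n matrix per intermediate step.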