Document @generated functions #10673

tomasaschan · 2015-03-29T20:07:29Z

stagedfunctions have been around for a while now, and we've successfully used them in Interpolations.jl, but they are sadly without documentation. I know @timholy wants to help with it, but he has more important stuff on his plate, so I figured I'd help out. I don't claim to be an expert in the inner workings of this black magic in any way, but as an "outsider" I might even be more suitable to write these docs than someone who knows all the gory details, since I can't go into them ;) However, this also means that these docs need to be extra carefully fact-checked; I've tested all the examples locally, but I can't guarantee that I've understood why they work the way they do.

This is still just WIP - as indicated toward the end, I want to add another section with a more advanced example that demonstrates stagedfunctions doing something in a way that's more convenient than doing it with macros and/or regular functions, but I couldn't come up with a good case that didn't require a lot of background. Any suggestions are welcome.

Also, I couldn't figure out how to run the doc tests locally. I know I knew how to do this once, but I've forgotten and couldn't find the info anywhere I looked (README.md in both project root and under doc, CONTRIBUTING.md and at docs.julialang.org). If it's not just me being illiterate, maybe that too needs to be documented...

tomasaschan · 2015-03-29T20:08:20Z

Also, I'm really hesitant to add this at the end of an already huge section in the manual, but I didn't know where else to put it. Any suggestions here are of course also welcome!

timholy · 2015-03-29T20:20:17Z

Very nice! And thanks very much.

timholy · 2015-03-29T20:27:12Z

doc/manual/metaprogramming.rst

+more advanced magic than just regular functions; enter *staged functions*.
+Staged functions have the capability to generate specialized code depending
+on the *types* of the arguments you give them, so that you can optimize or
+generalize your code in ways that aren't possible with ordinary functions.


A couple of points:

This description makes it sound as if stagedfunctions are a little less powerful than macros, but mostly they are just different (and there are a good number of things you can do with stagedfunctions that you can't do with macros). The key distinctions between macros and stagedfunctions are:

macros only work with expressions, and so don't know the types of the inputs

macros work at parsing-time, whereas stagedfunctions get expanded on-demand, possibly differently for each set of types

One way I like to conceptualize stagedfunctions is that they provide a flexible framework to move work from runtime to compile-time. I'll see if I can come up with a short example below.

I think everything is there now:

Stagedfunctions play a similar role as macros, but at a later stage between parsing and run-time. Staged functions give the capability to generate specialized code depending on the types of their arguments. Macros work with expressions at parsing-time and cannot access the types of their inputs. In contrast, a stagedfunction gets expanded at a time when the type of the arguments is known, but the function is not yet compiled.
Depending on the types of the input, a staged function returns a quoted expression which then forms the function body of the specialized function. Staged functions provide a flexible framework to move work from runtime to compile-time.
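The distinction drawn above can be sketched in a few lines. This is only an illustration, assuming a current Julia 1.x release, where `stagedfunction` is spelled `@generated function`; the names `describe_macro` and `describe_generated` are hypothetical and appear nowhere in the PR.

```julia
# A macro runs at parse time and only ever sees the expression it was given.
macro describe_macro(ex)
    # `ex` is a Symbol, a literal, or an Expr; no type information is available.
    return string("macro saw ", typeof(ex))
end

# A generated function is expanded once the argument types are known.
@generated function describe_generated(x)
    # In this body, `x` is bound to the *type* of the argument, not its value.
    msg = string("generated function saw ", x)
    # The returned value becomes the body of the method for this argument type.
    return :( $msg )
end
```

Called as `@describe_macro(a + 1)`, the macro reports that it received an `Expr`, whatever `a` happens to hold. Called as `describe_generated(a + 1)`, the generated function reports the concrete type of the evaluated argument.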

mbauman · 2015-03-29T20:43:43Z

This is a great start! Some random thoughts:

I definitely agree it belongs in the Meta-programming section. If it starts to feel unwieldy, we could maybe split it apart later (maybe between reflection and generation?)
Since it immediately follows Macros, I think it'd be good to highlight their differences more. Ah, I see Tim just made the same comment as I was typing this. I'd also add a note about why hygiene isn't a concern.
One of the trickiest things for me as I was learning how to use stagedfunctions was understanding how the arguments represent either the passed value or its type depending upon the context. Perhaps it'd be good to have an example that simply returns both the value and type, e.g., :(x, $x), as a first step.
It'd be good to assert that the code generation must be deterministic and type-stable to get well-defined behavior.
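The value-versus-type suggestion above could look roughly like this. It is a sketch in current Julia 1.x syntax (`@generated function` rather than `stagedfunction`), and the name `value_and_type` is made up for the illustration.

```julia
@generated function value_and_type(x)
    # While the generator runs, `x` is the argument's type, so `$x` splices
    # that type into the expression. The plain `x` inside the quote is only
    # resolved when the generated body runs, where it is the argument's value.
    return :( (x, $x) )
end

println(value_and_type(1))
println(value_and_type("abc"))
```

Each call returns a tuple whose first element is the value that was passed and whose second element is that value's concrete type.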

timholy · 2015-03-29T21:16:56Z

Perhaps sub2ind is a good all-round example? One could provide 3 implementations: the current loop-based one, one based on stagedfunctions (which would illustrate building up a single expression), and one based on recursion (#10337). This might illustrate the ideas, as well as point out that there are sometimes good alternatives to using stagedfunctions (which might make @JeffBezanson happy).

An earlier version of Jutho's PR was based on stagedfunctions; does anyone know how to find a commit prior to a force-push?

tomasaschan · 2015-03-30T07:07:44Z

Thanks for the comments, @timholy and @mbauman!

@timholy: Yes, I agree that first paragraph isn't ideal. It was the first thing I wrote, mainly just to get started, so I'll be happy to try to re-word it. sub2ind seems like a good example function, especially since there are several other implementations to compare with. I couldn't find a stagedfunction implementation among the ones you linked to - do you know if there is one lying around somewhere, or should I invent my own?

@mbauman: Yes, the distinction between when the argument x is a type and when it's a value was the hardest bit to grasp for me as well. I tried to illustrate how this works by means of a println(x) statement in the body, before returning an expression, to avoid adding interpolation to the mix. Did you think this wasn't clear enough?
Also, deterministic code generation is a good thing to mention. An example like

stagedfunction foo(x)
    if rand() < .5
        return :(x)
    else
        return :("boo!")
    end
end

could be used to illustrate both that the code in the body is only actually run once (we get the same result every time we execute this function, but we don't know until we've done it once if it's going to be x or "boo!"...) and that the returned expression is re-used for the same type after that.

Edit: sorry, managed to tag @mauro instead of @mbauman. Leaving this note here so you don't get confused over why you got a notification :)

mauro3 · 2015-03-30T08:55:19Z

doc/manual/metaprogramming.rst

+compiled. After that, the expression returned from the ``stagedfunction`` on the
+first invocation is re-used as the method body.
+
+We can utilize this to do slightly weirder things:


Maybe instead something like "The example staged function foo above did not do anything a normal function foo(x)=x*x could not do, except printing the type on the first invocation. However, the power of a staged function lies in its ability to compute different quoted expressions depending on the types passed to it:"
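A small sketch of what "different quoted expressions depending on the types" can mean in practice, assuming current Julia 1.x syntax. The function name `square_if_integer` is hypothetical and the example is not taken from the PR diff.

```julia
@generated function square_if_integer(x)
    # `x` is a type here, so the branch is chosen once per argument type,
    # not on every call.
    if x <: Integer
        return :(x ^ 2)   # body used for every Integer subtype
    else
        return :(x)       # body used for all other types
    end
end

println(square_if_integer(4))
println(square_if_integer("abc"))
```

Unlike the `rand()` example earlier in the thread, the choice here depends only on the argument type, so the generated code is deterministic.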

timholy · 2015-03-30T09:50:16Z

do you know if there is one lying around somewhere, or should I invent my own?

That PR originally used stagedfunctions, so somewhere there should be an old commit that is no longer on a named branch. But I don't know how you find it, and I suspect that (at least for me) it would take longer to figure that out than to rewrite it.

tomasaschan · 2015-03-31T08:44:09Z

There, I clearly killed the build with whitespace errors.

Is it documented somewhere what checks are run on documentation edits, and how to run them locally?

mauro3 · 2015-03-31T12:47:30Z

cd doc
make html

should make a doc/_build/html/index.html which you can open in the browser. You need sphinx installed. Although, afaik whitespace is not checked for, I think that is a Travis thing.

mauro3 · 2015-03-31T12:54:30Z

https://github.com/JuliaLang/julia/blob/master/doc/README.md

tomasaschan · 2015-04-01T06:04:37Z

@mauro3, I did that and it worked without error. I was hoping there was somewhere I could read more about all the checks run by Travis, to avoid pushing stuff that won't build...

mauro3 · 2015-04-01T06:40:57Z

Yes, sorry, I should read the question! It looks like Travis choked on

make check-whitespace

which you can run in the top directory of your Julia install. Does that give you errors? I think that is the only extra test run, apart from the Julia unit tests. At least that is how I interpret: https://github.com/JuliaLang/julia/blob/master/.travis.yml

tomasaschan · 2015-04-06T11:56:51Z

@timholy: I find it difficult to wrap my head around how to implement sub2ind as a stagedfunction. Any hints (or even a full implementation) would be most welcome - I can wrap it in explaining text, but I don't understand the algorithmic idea behind implementing it.

(Slightly OT below...)

I did, however, find a way to inspect all the "dangling" commits on my local git tree. I didn't have anything on there from the previous version of the PR (no surprise there) but I figured someone else might. This is what I did (in bash):

# find all the commit hashes of dangling commits
git fsck --lost-found | awk '{ print $3 }' > hashes.txt
# loop through them and open each in a browser window
for ((i=1; i<=$(wc -l hashes.txt | cut -f 1 -d ' '); i++)); do echo $i; chromium-browser "https://github.com/julialang/julia/commits/`head -$i hashes.txt | tail -1`"; done

On my machine this opened 52 new tabs, which took a while but by no means was a problem for my laptop. If you have many dangling commits, though, this might be too much; check how many you have with wc -l hashes.txt and hand-craft the loop limits if necessary to split it up in portions. The implementation from #10337 might be salvageable if we really want to find it :)

timholy · 2015-04-06T13:50:32Z

Briefly it would look something like this (not tested):

stagedfunction sub2ind{N}(dims::NTuple{N}, indexes...)
    ex = :(indexes[$N]-1)
    for i = N-1:-1:1
        ex = :(indexes[$i]-1 + dims[$i]*$ex)
    end
    :($ex + 1)
end

As you can see, it's almost identical to a function version that uses loops:

function sub2ind{N}(dims::NTuple{N}, indexes...)
    ind = indexes[N]-1
    for i = N-1:-1:1
        ind = indexes[i]-1 + dims[i]*ind
    end
    ind + 1
end

but you don't actually create a runtime loop with the stagedfunction (check the final expression built by the stagedfunction).
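One way to check the final expression is to run the same expression-building loop in an ordinary function and print the result. The sketch below does that in current Julia syntax. The helper name `sub2ind_expr` is hypothetical, and `N` is passed as a plain argument rather than taken from the type of `dims`.

```julia
# Build the expression the staged version would return for N dimensions.
function sub2ind_expr(N)
    ex = :(indexes[$N] - 1)
    for i = N-1:-1:1
        # Each pass wraps the previous expression; no loop remains in `ex`.
        ex = :(indexes[$i] - 1 + dims[$i] * $ex)
    end
    return :($ex + 1)
end

# Print the fully unrolled expression for three dimensions.
println(sub2ind_expr(3))
```

The printed expression contains only indexing and arithmetic on `dims` and `indexes`, with the literal positions 1 to 3 already spliced in.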

That said, I wouldn't be shocked if LLVM might be able to do the same thing in this case.

goretkin · 2015-04-06T19:50:53Z

doc/manual/metaprogramming.rst

+In short: don't do this.
+
+While these examples are perhaps not so interesting, they have hopefully
+helped illustrating how staged functions work, both in the definition end


"...they have hopefully helped to illustrate how..."

Thanks! I'll make sure to correct that.

tomasaschan · 2015-04-07T07:31:01Z

@timholy Fast and helpful as always :) I think it was the possibility to interpolate ex into the new ex in the loop body that I didn't think of. A few quick tests show me that it gives the same results as the other two versions, so this is definitely all I need to finalize this PR. Thanks a lot!

tomasaschan · 2015-04-07T08:54:09Z

There - I fixed some typos and grammar issues, and completed the example. If there are no outstanding issues with the text I feel "done" with this (for now - documentation is never finished...).

I'm happy to squash and/or rebase this if desirable.

tomasaschan · 2015-04-07T10:20:08Z

FWIW, I did a quick benchmark of the three approaches in the advanced example:

using Benchmark

function sub2ind_loop{N}(dims::NTuple{N}, I::Integer...)
    ind = I[N] - 1
    for i = N-1:-1:1
        ind = I[i]-1 + dims[i]*ind
    end
    ind + 1
end

# recursive version: each method recurses into sub2ind_rec itself
sub2ind_rec(dims::()) = 1
sub2ind_rec(dims::(),i1::Integer, I::Integer...) =
    i1==1 ? sub2ind_rec(dims,I...) : throw(BoundsError())
sub2ind_rec(dims::(Integer,Integer...), i1::Integer) = i1
sub2ind_rec(dims::(Integer,Integer...), i1::Integer, I::Integer...) =
    i1 + dims[1]*(sub2ind_rec(Base.tail(dims),I...)-1)

stagedfunction sub2ind_staged{N}(dims::NTuple{N}, I::Integer...)
    ex = :(I[$N] - 1)
    for i = N-1:-1:1
        ex = :(I[$i] - 1 + dims[$i]*$ex)
    end
    :($ex + 1)
end

loop() = sub2ind_loop((101,235,1249,325,1992,123,59), 67, 129, 875, 125, 11, 89, 46)
rec() = sub2ind_rec((101,235,1249,325,1992,123,59), 67, 129, 875, 125, 11, 89, 46)
staged() = sub2ind_staged((101,235,1249,325,1992,123,59), 67, 129, 875, 125, 11, 89, 46)

#correctness and warmup
@assert loop() == rec() == staged()

#bench
compare(10_000, loop, rec, staged)

After a gzillion ambiguity warnings from Benchmark.jl, the results seem to indicate that the loop and the recursive approach are comparable in performance (running workspace(); include("bench.jl") a couple of times yields a different winner between the two from run to run) and the staged function approach is slightly faster. I have no idea how much of this is because of inlining and other compiler magic, though, so the benchmark might be altogether rubbish...

timholy · 2015-04-07T15:05:25Z

If you define benchmarks like this:

function run_loop(n)
    s = 0
    for j = 1:n
        for I in CartesianRange(CartesianIndex((5,5,5,5)))
            s += sub2ind_loop((5,5,5,5), I[1], I[2], I[3], I[4])
        end
    end
    s
end

then you'll see that sub2ind_loop is much slower than the other two (which are equivalent):

julia> @time run_loop(10^4)
elapsed time: 0.442542374 seconds (368 MB allocated, 4.93% gc time in 17 pauses with 0 full sweep)
1956250000

julia> @time run_rec(10^4)
elapsed time: 0.024556659 seconds (224 bytes allocated)
1956250000

julia> @time run_staged(10^4)
elapsed time: 0.02548805 seconds (224 bytes allocated)
1956250000

That turns out to be because of splatting (which you can tell because of the memory allocation).
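For anyone who wants to re-run this kind of harness on a current Julia 1.x release, here is a hedged port of the staged variant. It assumes `@generated function` with `where {N}` in place of the old `stagedfunction f{N}` form and `CartesianIndices` in place of `CartesianRange`; the names `sub2ind_generated` and `run_generated` are hypothetical. It is meant to reproduce the checksum shown above, not the timings.

```julia
@generated function sub2ind_generated(dims::NTuple{N,Int}, I::Integer...) where {N}
    # In the generator, N is a known integer, so the loop below runs while
    # the method body is being built, not when the method is called.
    ex = :(I[$N] - 1)
    for i = N-1:-1:1
        ex = :(I[$i] - 1 + dims[$i] * $ex)
    end
    return :($ex + 1)
end

function run_generated(n)
    s = 0
    for j = 1:n
        for I in CartesianIndices((5, 5, 5, 5))
            s += sub2ind_generated((5, 5, 5, 5), I[1], I[2], I[3], I[4])
        end
    end
    return s
end

println(run_generated(10^4))
```

Each pass over the 5×5×5×5 index space sums the linear indices 1 through 625, which is 195625, so `run_generated(10^4)` should match the 1956250000 printed in the timings above.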

tomasaschan · 2015-04-07T15:20:11Z

That makes sense - thanks for pointing it out!

mbauman · 2015-04-07T16:08:54Z

It's so difficult to benchmark just the compiler magic you want to occur in typical situations without getting too much or too little magic.

pao · 2015-04-07T16:57:06Z

@tlycken For the future, the syntax to skip AppVeyor is [av skip] with a space, not a dash. This commit did run on AV (and passed).

tomasaschan · 2015-04-07T20:05:44Z

@pao, right, thanks. I noticed that it ran, but figured I'd do more harm than good trying to figure it out by trial and error. Next time I'll get it right! :)

tomasaschan · 2015-04-12T11:56:45Z

Is this waiting on me to do anything more here?

vtjnash · 2015-04-16T19:20:41Z

doc/manual/metaprogramming.rst

+        :($ex + 1)
+    end
+
+**What code will this staged function generate?**


it feels like there needs to be an @code_expand or something similar that will do this without the extra work described below. it may be worth having an issue to revisit this later.
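As a pointer for later readers: on current Julia 1.x releases, `code_lowered` accepts a `generated` keyword (default `true`) and, for a generated function, returns the lowered form of the generated body for the given argument types. The sketch below assumes that behaviour; `twice` is a hypothetical stand-in, not the manual's example.

```julia
@generated function twice(x)
    # Trivial generator: the same body is produced for every argument type.
    return :(x + x)
end

# Ask for the lowered code of the body generated for an Int argument.
lowered = code_lowered(twice, (Int,))
println(lowered[1])
```

This shows lowered code rather than the quoted expression itself, so it is close to, but not exactly, the `@code_expand` asked for here.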

tomasaschan · 2015-04-17T08:30:52Z

@vtjnash Thanks for a very thorough proof-reading! :)

mbauman · 2015-04-20T15:19:27Z

New name! Do you mind going back through and updating the vocabulary, @tlycken?

mbauman · 2015-04-20T17:49:07Z

doc/manual/metaprogramming.rst

+the types of the arguments are known, but the function is not yet compiled.
+
+Depending on the types of the arguments, a staged function returns a quoted
+expression which then forms the function body of the specialized function.


I'm not a computer science expert, but I think you want to be careful about the distinction between functions and methods. I think this would be more correctly stated as "a staged function returns … which forms the method body of the specialized method". But I may be wrong or splitting hairs not worth splitting.

You're absolutely correct - we should be careful, especially since this is quite a difficult topic to wrap one's head around in the first place. I'll fix it when I go over and change the terminology to generated functions.

This is a rebase/squash of the following commits:

8ffd923 Doc: stagedfunctions. Basic functionality and simple example
471befa Update stagedfunction doc according to comments
cd19d47 Start on advanced example
e02d21f Correct syntax typos
293ec59 Correct typo and improve grammar
c42321b Complete advanced example

tomasaschan · 2015-04-20T20:25:44Z

@mbauman Done!

I'd be grateful for a new round of proofreading - I might very well have missed some things in the rewrite.

mbauman · 2015-04-20T20:38:40Z

doc/manual/metaprogramming.rst

+   *types* of the arguments, not their values.
+
+3. Instead of calculating something or performing some action, you return
+   from a *quoted expression* which, when evaluated, does what you want.


"you return ~~from~~ a quoted expression"

tomasaschan · 2015-04-20T20:49:19Z

Thanks @mbauman for the speedy review!

mschauer · 2015-04-21T07:47:46Z

"Specialized method" is not really a clear concept. A method is a function specialized on type and for generating functions the generation part can be itself a method or a function, which returns then "further specialized methods", right?

tomasaschan · 2015-04-21T15:22:53Z

@mschauer I failed to spell your handle right in the commit message, but that last commit is supposed to fix the wording re: "specialized methods". I hope it's better now :)

mbauman · 2015-04-21T15:29:50Z

👍 That is less jargon-y and more readable, I think. Nicely done.

One last bug: the PR title. Do you still consider this a WIP?

tomasaschan · 2015-04-21T19:37:06Z

Nope - not more than any documentation ever ;)

prcastro · 2015-04-22T01:15:47Z

👍

timholy reviewed Mar 29, 2015

garrison mentioned this pull request Mar 30, 2015

document staged functions #9489

Closed

mauro3 reviewed Mar 30, 2015

ViralBShah added the docs This change adds or pertains to documentation label Mar 31, 2015

ViralBShah added this to the 0.4.0 milestone Mar 31, 2015

tomasaschan force-pushed the doc-stagedfunctions branch from 7618581 to cd19d47 Compare April 6, 2015 11:10

goretkin reviewed Apr 6, 2015

tomasaschan force-pushed the doc-stagedfunctions branch from c42321b to 704aa64 Compare April 7, 2015 15:30

vtjnash reviewed Apr 16, 2015

tomasaschan force-pushed the doc-stagedfunctions branch 3 times, most recently from 1b9500b to 6c544ca Compare April 17, 2015 09:01

mbauman mentioned this pull request Apr 18, 2015

Rename stagedfunction → ＠generated function #10884

Merged

mbauman reviewed Apr 20, 2015

Tomas Lycken and others added 4 commits April 20, 2015 22:23

Update according to comments [av skip]

b5e423c

* Make `sub2ind_staged_impl` define `N` correctly * Add a note that staging *might* occur more than once.

Update according to comments by @vtjnash [av skip]

45db851

Change terminology according to JuliaLang#10884 [av skip]

c4719d2

tomasaschan force-pushed the doc-stagedfunctions branch from 6c544ca to c4719d2 Compare April 20, 2015 20:24

mbauman reviewed Apr 20, 2015

Tomas Lycken added 2 commits April 20, 2015 22:53

Fix RST formatting + typos [av skip]

bd47703

Update example with new tuple syntax [av skip]

59a8905

tomasaschan force-pushed the doc-stagedfunctions branch from 331a156 to 59a8905 Compare April 20, 2015 20:54

tomasaschan changed the title ~~WIP: Document stagedfunctions~~ WIP: Document @generated functions Apr 20, 2015

Re-word a passage on methods [av skip]

c96d476

According to comment by @mshauer - I have attempted to adjust this passage to the Julia terminology, hoping that I didn't make it difficult to read in the process :)

tomasaschan changed the title ~~WIP: Document @generated functions~~ Document @generated functions Apr 21, 2015

jakebolewski added a commit that referenced this pull request Apr 22, 2015

Merge pull request #10673 from tlycken/doc-stagedfunctions

aafdc50

Document @generated functions

jakebolewski merged commit aafdc50 into JuliaLang:master Apr 22, 2015
