Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: WIP: Make macro hygiene easier to use and less error-prone #10940

Closed
wants to merge 1 commit into from

Conversation

vtjnash
Copy link
Member

@vtjnash vtjnash commented Apr 22, 2015

This is a rebase and squash (plus updates) of PR #6910

TODO items:

  • handle new in type definitions in macros
  • handle fieldnames in type definitions in macros
  • move export of @hygienic to exports.jl
  • re-import doc changes from original PR
  • solicit feedback

@vtjnash vtjnash added this to the 0.4.0 milestone Apr 22, 2015
esc function and making hygiene automatic instead.

This is a rebase and squash (plus updates) of the following commits:

commit + c3af49c
Author: Dave Moon <david-moon@rcn.com>
Date:   Wed Jun 4 15:41:52 2014 -0400

    Update the entry for let in vars-introduced-by-patterns
    to use hygienic-symbol? instead of symbol?.

    env-for-expansion needs to use difference rather than diff
    to remove the globals, because hygienic-symbols are not
    interned, thus must be compared with equal? rather than eq?.

commit + 850746a
Author: Dave Moon <david-moon@rcn.com>
Date:   Wed Jun 4 15:19:43 2014 -0400

    Change :x to quote x end in lots of places so it is hygienic, also fix a merge error

commit + 5bba56d
Author: Dave Moon <david-moon@rcn.com>
Date:   Wed Jun 4 15:18:34 2014 -0400

    difference is to diff as assoc is to assq

commit + 97c11fe
Author: Dave Moon <david-moon@rcn.com>
Date:   Fri May 23 13:42:17 2014 -0400

    Use quote instead of Expr in help macro so it gets automatic hygiene the same as before my changes

commit + ac7240b
Author: Dave Moon <david-moon@rcn.com>
Date:   Fri May 23 13:41:17 2014 -0400

    Fix quoting problem in my previous commit

commit + 9e7e186
Author: Dave Moon <david-moon@rcn.com>
Date:   Fri May 23 13:40:38 2014 -0400

    Avoid symbol("@cmd") which is not hygienic

commit + dddb999
Author: Dave Moon <david-moon@rcn.com>
Date:   Fri May 23 13:39:36 2014 -0400

    Tweak hygiene doc

commit + 755fca0
Author: Dave Moon <david-moon@rcn.com>
Date:   Fri May 23 11:38:10 2014 -0400

    Fix error message interpolation in gen_call_with_extracted_types()

commit + 4dfd75c
Author: Dave Moon <david-moon@rcn.com>
Date:   Thu May 22 16:01:10 2014 -0400

    Improve documentation of esc, per code review

commit + 7ae26dd
Author: Dave Moon <david-moon@rcn.com>
Date:   Thu May 22 15:56:25 2014 -0400

    Add documen tation of splatting expression interpolation, make quote doc easier to find

commit + 467549c
Author: Dave Moon <david-moon@rcn.com>
Date:   Thu May 22 15:54:59 2014 -0400

    Remove some tabs I accidentally added in the last commit

commit + 5b2f162
Author: Dave Moon <david-moon@rcn.com>
Date:   Thu May 22 15:31:50 2014 -0400

    Fix this to work with the new auto-hygienic macros.

    Specifically:

    gen_call_with_extracted_types() is declared hygienic, and uses quote
    instead of manually constructing Expr's so the result will be hygienic
    as it was before the new hygienic macros.

    Fix it so "@which(a[i]=x)" will work.

    Fix the too-complex error message not to assume @which is the caller.

    Gratuitous rearrangement of code in the for fname loop.

    Also tests have been added to make testall to verify all this.  The
    tests failed with the new auto-hygienic macros before these fixes
    were put in.

commit + 5a7d3b8
Author: Dave Moon <david-moon@rcn.com>
Date:   Thu May 22 15:31:16 2014 -0400

    Add some tracing code (commented out) and fix some comments

commit + 9713b2b
Author: Dave Moon <david-moon@rcn.com>
Date:   Thu May 22 15:29:46 2014 -0400

    Add tests for some macros

commit + baeeb03
Author: Dave Moon <david-moon@rcn.com>
Date:   Wed May 21 13:05:50 2014 -0400

    Update documentation for new macro hygiene mechanism

commit + 8e07097
Author: Dave Moon <david-moon@rcn.com>
Date:   Tue May 20 16:17:15 2014 -0400

    Make macro hygiene easier to use.

    PURPOSE:

    The problem with Julia's current implementation of hygienic macros is
    that it depends on the author of a macro to put calls to esc() in all
    the right places.  Mistakes are hard to detect by testing because no
    failure will occur until there is an accidental name clash.  Macros
    are already hard to think about, no need to add unnecessary complications.

    STRATEGY:

    Deprecate esc() and make it not do anything.  Instead, quote marks all
    symbols that appear directly in the quote, rather than being interpolated,
    as hygienic.  Post-processing of the macro expansion will replace these
    with the appropriate gensym, symbol, or module reference.

    Make a distinction between hygienic and non-hygienic quote, because
    quote is not only used by macros.  Only hygienic quote does the above.
    Quote is hygienic when it is in the body of a macro or inside an
    @hygienic annotation.  The output of hygienic quote will only work as
    part of a macro expansion, it will not work with eval.

    Unary : always behaves the same as quote.

    As a special exception, quoting a single symbol never makes it
    hygienic.  This seems to be the best compromise to maximize usability
    without adding a new construct to reflect the fact that quoting a
    single symbol is sometimes used to construct executable code and other
    times used just to quote data.

    Fix the few macros in the base code that were broken by these changes.

    In the interest of simplicity, there are NO changes to the C code.

    TESTING:

    A clean build and make testall works on 32-bit Linux.

    RISKS:

    Did not test any packages that are not tested by make testall.
    There could be complex macros out there that will stop working
    because of these changes, analogous to @printf or @uv_write.

    The distinction between hygienic and non-hygienic quote could be
    confusing to users, although not as confusing as esc() was.

    esc() is deprecated but not marked as such.

    Calls to esc() are not removed from all over the source code.  That
    could be cleaned up later.

    Another problem is that the macro system has its own separate code
    walker to decide where to insert gensyms for hygiene.  This creates
    complexity and risk.  These changes are not fixing that, because to do
    so would require changing much more of the Julia implementation to
    allow symbols marked as hygienic to be used directly without
    replacement by gensyms.  However, if we did those changes, eval()
    would respect hygiene too and it would no longer be necessary to have
    distinct hygienic and non-hygienic versions of quote.

    PERFORMANCE IMPACT:

    Using assoc instead of assq is slower, but it only affects the macro
    expander.  There is also slightly more consing than there was before.
    I doubt that the difference will be large enough for anyone to notice.

    I did not give scm_to_julia() a special node type for hygienic symbol
    markers.  I doubt that not optimizing that costs much performance.

    PORTABILITY IMPACT:

    None expected.

    DOCUMENTATION IMPACT:

    Remove documentation of esc() and all mention of it.

    Document the distinction between hygienic and non-hygienic quote.

    Document @Hygiene.

    Document that $:x is a convenient way to insert a non-hygienic x
    into a hygienic quote.

    I will see if I can provide a suggested rewrite of the metaprogramming
    chapter to reflect these changes and make it easier to understand.

    All macro examples should be checked to remove esc and see if anything
    needs to be changed to keep them working.

    FILES CHANGED:

    src/julia-syntax.scm - New design for hygienic macros:
    - (hygienic <symbol>) represents an occurrence of <symbol> in
      a hygienic quote.  It will be replaced with the appropriate
      gensym, symbol, or module reference when the macro is expanded.
    - allow hygienic symbols as argument names, type parameter names,
      variables in pair-with-gensyms, and keyword argument names.  Keywords
      cannot be hygienic because the same symbol is both the name used
      by the caller and the variable bound in the callee (same as before).
    - Add an argument to expand-backquote() to tell it whether the quote
      is hygienic or not.  If hygienic, add hygienic markers to all
      symbols that are variable names, but don't add them to Expr heads.
      Don't use copyast of inert in hygienic mode since we need to add
      hygienic markers to symbols.
    - Add an argument to julia-expand-macros-() to tell it whether quotes
      are hygienic or not.  Support the @hygienic macro which uses the
      new with_hygiene Expr head.
    - The hygienic renaming env has to be searched with assoc rather than
      assq because uniqueness of (hygienic <symbol>) objects cannot
      easily be preserved through julia_to_scm() and scm_to_julia().
    - resolve-expansion-vars-() only replaces symbols marked as hygienic
      by wrapping in (hygienic <symbol>).  Plain symbols must have come
      from an interpolation of code supplied by a macro argument.

    base/LineEdit.jl - The bindings introduced by @keymap are supposed
    to be non-hygienic, make it so.

    base/REPLCompletions.jl - There is no longer an escape Expr, remove
    the check for it so the code will still work.

    base/base.jl - Neuter esc() but for convenience don't remove it yet.
    It's deprecated.

    base/expr.jl - Add @hygienic macro which makes any quotes in its
    body provide hygiene.  This means those quoted expressions will
    only work as part of a macro expansion, they will not work with eval.

    base/printf.jl - Add @hygienic to all subroutines of the @printf and
    @sprintf macros so those macros will still be hygienic.  Fix code in
    gen_d() to work around the fact that quote of just a symbol is no
    longer hygienic; the replacement code is clearer to read anyway.
    Fix @sprintf to use an explicit gensym to work around the same issue.

    base/process.jl - @setup_stdio and @cleanup_stdio are supposed to
    introduce non-hygienic bindings.

    base/stream.jl - @uv_write is supposed to introduce a non-hygienic
    binding of uvw.
# Test gen_call_with_extracted_types's alternate path for keyword arguments
@test Which_test.test2(1.0) == Tuple{Real}
@test Which_test.test2(123) == Tuple{Int}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whitespace here

also would've been nice to preserve @david-moons git authorship of a large part of this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To preserve authorship this approach can be used: http://blog.spreedly.com/2014/06/24/merge-pull-request-considered-harmful/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can also run git commit --amend --author <original author> to fix this up, and had intended to do so before pushing. thanks for the reminder.

@JeffBezanson
Copy link
Member

I think this needs to be merged early in a release cycle.

@tkelman
Copy link
Contributor

tkelman commented Apr 22, 2015

Sounds familiar #6910 (comment)

Packages that make heavy use of macros (JuMP as an especially notable example) might need to make major changes to adapt to this. This would be a good candidate to run PkgEvaluator on prior to merging.

@pao
Copy link
Member

pao commented Apr 22, 2015

Should the milestone get pushed out to 0.5, so there's some chance this doesn't slip through the cracks in (fingers crossed) a few months?

@mlubin
Copy link
Member

mlubin commented Apr 23, 2015

Well, if we merge it soon then it will break JuMP, we'll fix it, and then we'll be able to quickly drop the old-style code after the 0.4 release. If we merge it at the start of the 0.5 cycle, it'll be less disruptive for users but we'll have to keep maintaining the old-style code for about a year until the 0.5 release. I'm ambivalent as to which is better, but either way please keep us in the loop and give us a chance to evaluate this before merging.

@JeffBezanson
Copy link
Member

I'm not convinced that this is a large improvement. @hygienic introduces a strange lack of referential transparency, where it suddenly matters where your quote expressions are, and not just what expressions they yield. You also end up wrapping some expressions in :hygienic. That may be better in some ways, but at the end of the day it's very similar to how we have to wrap some things in :escape now. My impression is that this is roughly as fiddly as what we have now.

@vtjnash
Copy link
Member Author

vtjnash commented Apr 23, 2015

I think the argument is that fiddling with esc is non-intuitive. Whereas this actually interprets expressions in macros using rules that are closer to the context they are used in.

My understanding from the original PR is that @hygienic is a bit of a side-effect from not adding support to eval for unwrap hygienic Exprs, but that it too can go away with a bit more work.

@vtjnash
Copy link
Member Author

vtjnash commented Apr 23, 2015

@mlubin do you have any sense for how badly this would break JuMP? one of the original goals of the PR was to be as non-disruptive to existing codebases as possible.

@JeffBezanson
Copy link
Member

a side-effect from not adding support to eval for unwrap hygienic Exprs, but that it too can go away with a bit more work

Then I think we should really not rush this, and take the time to do it right. My gut feel is that we are at capacity with things to worry about right now.

@vtjnash
Copy link
Member Author

vtjnash commented Apr 23, 2015

sounds good. i wanted to rebase this, in part, just so it wouldn't get too far behind

@JeffBezanson
Copy link
Member

Yes, it's very nice to have it rebased.

@toivoh
Copy link
Contributor

toivoh commented Apr 23, 2015

Agreed, let's not rush this but think through how we really want things to work.

Having written macros whose implementations effectively span multiple modules I must say though, that the hygiene aspect that depends on where the quote block is in the code is highly compelling; I've basically had to fall back to the old pre-hygiene tricks in a number of such cases.

@JeffBezanson
Copy link
Member

@david-moon @toivoh I wonder if hygienic quote should resolve globals to the module containing the quote? I suppose to implement that we'd have to mark symbols as (hygienic ,module ,sym)?

@david-moon
Copy link

@david-moon @toivoh I wonder if hygienic quote should resolve globals to the module containing the quote? I suppose to implement that we'd have to mark symbols as (hygienic ,module ,sym) ?

Yes, symbols in the mark expansion that come from the macro definition rather than from the macro call should be looked up in the environment (including module) where the macro was defined. Did I forget to do that part when I prototyped this?

@JeffBezanson
Copy link
Member

We seem to be missing the manual section that was written for this.

@tkelman
Copy link
Contributor

tkelman commented Apr 21, 2016

Should the milestone get pushed out to 0.5, so there's some chance this doesn't slip through the cracks in (fingers crossed) a few months?

1 year later, this happened. Don't think this is still 0.5.

@stevengj
Copy link
Member

Any update on this? Would be good to have hygiene finalized for 1.0.

@StefanKarpinski
Copy link
Member

I think that a system like Clojure's with fully scoped symbols could be the simplest. It wouldn't even need to be particularly breaking since symbol literals like :foo could be "relative symbols" while if you wrote :(a + b) in a module M you would implicitly have M.a and M.b symbols. Of course, this forces us to stop punning on :a versus :(a) but I think that's for the best anyway.

@JeffBezanson
Copy link
Member

I'm not sure about that. It pretty drastically changes what a symbol is. How does it work out for, say, keyword arguments and field names?

@StefanKarpinski
Copy link
Member

That's a good question. I guess we should look at what Clojure does.

@StefanKarpinski
Copy link
Member

If you continue to consider symbols to be implicitly relative but introduce a notion of fully qualified identifiers then keyword arguments and field names could continue to be represented by symbols (implicitly relative) while when you write something like :(a) it would be resolved to :(M.a). Maybe this isn't much different than what this PR does anyway.

@StefanKarpinski
Copy link
Member

@david-moon sent me a message:


Stefan Karpinski wrote:

If you continue to consider symbols to be implicitly relative but introduce a notion of fully qualified identifiers then keyword arguments and field names could continue to be represented by symbols (implicitly relative) while when you write something like :(a) it would be resolved to :(M.a). Maybe this isn't much different than what this PR does anyway.

Yes, I think your choices for properly hygienic macros are either to draw a distinction between symbols (run-time objects) and identifiers (compile-time objects), or to do what I did (assuming I remember correctly, it's been years) and allow symbols to be interned in a dictionary specific to a particular macro expansion as well as in the usual global dictionary. This prevents symbols introduced by a macro from being eq to symbols in the macro call, but it requires normalizing those local symbols to global symbols when they end up as literals, keyword arguments, field names, etc. in the macro expansion. When those local symbols end up as bound variables, everything just works. When they end up as unbound variables, they need to be looked up in the scope where the macro was defined rather than in the scope where it was called.

Whatever you do, it shouldn't have a built-in restriction that a macro can only be defined in module scope, not in a scope nested inside a module. Even if for implementation reasons you need that restriction at first, it shouldn't be built deeply into the language. An identifier introduced by a macro should be in the scope where the macro was defined regardless of the nature or lifetime of that scope.

[I have a feeling this email reply is going to be prevented from going to the correct mailing list. Could you please forward it?]

--Dave Moon

@JeffBezanson
Copy link
Member

For globals, we do have a representation for fully-qualified identifiers: GlobalRef objects. I'm not sure it will help much to resolve :(a) to :(M.a) (a GlobalRef) very early, since you still need the surrounding context to know what the symbol actually refers to. But it does seem likely some version of this is similar to what's here, especially if we include #10940 (comment)

@stevengj
Copy link
Member

I guess this will not make 0.6, but it would be nice to have for 1.0

@vtjnash
Copy link
Member Author

vtjnash commented Oct 2, 2018

don't think this helpful anymore to have open, future work can just as easily start from the original (#6910)

@vtjnash vtjnash closed this Oct 2, 2018
@mauro3
Copy link
Contributor

mauro3 commented Nov 21, 2018

Should the original be re-opened? And its milestone updated to 2.0?

@DilumAluthge DilumAluthge deleted the moon/hygienic-macros branch March 25, 2021 22:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants