From aac2b0cd509e558022b6a111c212ef33b961ad5b Mon Sep 17 00:00:00 2001
From: saem
Date: Mon, 27 Mar 2023 00:04:05 -0700
Subject: [PATCH] limit template non-ast param to usage substitution

Summary:

* non-ast params (those that are neither `typed` nor `untyped`) are only
  substituted for usage positions
* where a usage position is any syntactic position where a symbol is not
  intended to be introduced
* in addition, the manual and semtempl module have some doc comments
  outlining the new direction for templates

Motivating Example:

After this change the following code now works as expected:

```nim
template foo(data: int): untyped =
  proc bar(data: int): int = 2 * data
  bar(data)
doAssert foo(2) == 4
```

The reason this works now is that `data` the template arg is of a non-ast
type, so it's not substituted into `data` the proc param, and `data` the
proc param is instead converted into a symbol. This means that when `data`
the proc body usage is looked up, it's lexically resolved to `data` the
proc param. Finally, when `data` the call argument's usage is looked up,
the template param is found and substituted.

Details:

Template parameters that are neither `typed` nor `untyped` will no longer
substitute for definitional syntax positions, i.e. syntax positions that
are meant to introduce routines, variables, types and other named
definitions. For non-ast typed substitution, lexical scope based lookups
are used. AST typed template parameter substitutions continue to work as
is, since they're arbitrary symbolic substitutions.

This makes more sense given that we're substituting syntactic symbol usage
as opposed to syntactic symbol introductions; it also removes a vector by
which symbol pollution can take place.

As templates have lots of rough edges in implementation, there are still
many bugs, but these are almost entirely pre-existing.
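
Illustration of the two kinds of params:

A minimal sketch of the definition/usage split described above, not taken
from the patch or its tests; `declareDoubled`, `name`, `value` and `x` are
made-up names. It only relies on the unchanged behaviour that an `untyped`
param may substitute in a definition position, while the `int` param is
substituted in a usage position:

```nim
# Illustrative sketch only: all names here are hypothetical.
template declareDoubled(name: untyped, value: int): untyped =
  # `name` is `untyped` (an AST type), so it substitutes in the definition
  # position of the `let`.
  # `value` is `int` (a non-ast type), so it is only substituted in the
  # usage position on the right-hand side.
  let name = value * 2

declareDoubled(x, 21)
doAssert x == 42
```

Had `value` also appeared as the name of a nested proc param or local, it
would, per the rules above, be treated as a fresh definition rather than
as a substitution site.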
Overview of Concepts Introduced:

This is a brief overview; see the module and docs for more info. This is
still developing, and these concepts mostly live in docs for now:

* the concept of template substitution policy being set by the template
  output type has been introduced
* a template body is to be thought of as an out parameter with the same
  type as the template output
* additionally, a template body can be considered as a form of quasi
  quoting, where the type determines how we treat the substitution of
  various kinds of parameters (ast and non-ast types).
---
 compiler/sem/sem.nim                          | 12 ++-
 compiler/sem/semtempl.nim                     | 81 +++++++++++++++----
 doc/manual.rst                                | 40 +++++++--
 ...usage_substitution_nonast_typed_params.nim | 25 ++++++
 .../lang_callable/template/twrongsymkind.nim  | 20 -----
 5 files changed, 129 insertions(+), 49 deletions(-)
 create mode 100644 tests/lang_callable/template/template_usage_substitution_nonast_typed_params.nim
 delete mode 100644 tests/lang_callable/template/twrongsymkind.nim

diff --git a/compiler/sem/sem.nim b/compiler/sem/sem.nim
index c321f58a8055..0e68c42689ec 100644
--- a/compiler/sem/sem.nim
+++ b/compiler/sem/sem.nim
@@ -358,18 +358,16 @@ proc newSymG*(kind: TSymKind, n: PNode, c: PContext): PSym =
     # and sfGenSym in n.sym.flags:
     result = n.sym
     if result.kind notin {kind, skTemp}:
+      # xxx: this happens because a macro, or possibly template, produces a
+      # mismatched symbol; if it's the compiler, that's an outright bug.
+      # instead of logging it here, we need to ensure that the macros API
+      # doesn't allow this to happen in the first place and/or detect this
+      # much earlier.
       localReport(c.config, n.info, SemReport(
         kind: rsemSymbolKindMismatch,
         sym: result,
         expectedSymbolKind: {kind}))
 
-    when false:
-      if sfGenSym in result.flags and result.kind notin {skTemplate, skMacro, skParam}:
-        # declarative context, so produce a fresh gensym:
-        result = copySym(result)
-        result.ast = n.sym.ast
-        put(c.p, n.sym, result)
-
     # when there is a nested proc inside a template, semtmpl
     # will assign a wrong owner during the first pass over the
     # template; we must fix it here: see #909
diff --git a/compiler/sem/semtempl.nim b/compiler/sem/semtempl.nim
index 8a04e2998ea3..23156c1207d7 100644
--- a/compiler/sem/semtempl.nim
+++ b/compiler/sem/semtempl.nim
@@ -9,6 +9,51 @@
 
 # included from sem.nim
 
+## The current implementation of templates might not yet conform to the
+## description that follows.
+##
+## Template Basics
+## ===============
+##
+## Given a basic template as follows:
+## ..code::
+##   template identity(arg: untyped): untyped =
+##     arg
+##
+## The template's output type is `untyped`, meaning the body (`arg`) is treated
+## as an `untyped` AST fragment, that substitution of parameters will evaluate
+## per the rules of `untyped` templates, and finally evaluation and insertion
+## of the template at the callsite will be hygienic. The template parameter
+## `arg` will be captured as `untyped`, meaning no attempt will be made to
+## semantically analyse the parameter prior to substitution.
+##
+## Template Taxonomy
+## =================
+##
+## There are at least four types of templates across two categories:
+## - AST templates:
+##   - `untyped`
+##     - `dirty` a non-hygienic sub-variant
+##   - `typed`
+## - expression templates (all types that are not `untyped` or `typed`)
+##
+## Substitution Positions
+## ----------------------
+## Templates are ultimately AST level constructs regardless of output type,
+## and they still follow the grammar. There are two types of positions in a
+## template body, one is `definition` and the other is `usage`.
+## A `definition` is any position where the grammar construct is intended to
+## introduce a new symbol, i.e.: the name of a routine, including its
+## parameters; names of variables (`const`, `let`, `var`), and so on. All other
+## sites are `usage` sites, where a symbol or "chunk" of AST might be used.
+##
+## This is a draft of substitution rules:
+## - `untyped` template bodies accept `typed` and `untyped` params in
+##   definition or usage positions; and all other params are usage only
+## - `typed` template bodies accept `typed` and `untyped` params in definition
+##   or usage positions; and all other params are usage only
+## - non-ast template bodies only allow substitutions within usage positions
+
 discard """
   hygienic templates:
 
@@ -164,6 +209,7 @@ proc replaceIdentBySym(c: PContext; n: var PNode, s: PNode) =
 
 type
   TemplCtx = object
+    ## Context used during template definition evaluation
     c: PContext
     toBind, toMixin, toInject: IntSet
     owner: PSym
@@ -175,7 +221,6 @@ type
 proc getIdentNode(c: var TemplCtx, n: PNode): PNode =
   ## gets the ident node, will mutate `n` if it's an `nkPostfix` or
   ## `nkPragmaExpr` and there is an error (return an nkError).
-
   case n.kind
   of nkPostfix:
     result = getIdentNode(c, n[1])
@@ -189,7 +234,6 @@ proc getIdentNode(c: var TemplCtx, n: PNode): PNode =
       result = c.c.config.wrapError(n)
   of nkIdent:
     let s = qualifiedLookUp(c.c, n, {})
-
     if s.isNil:
       result = n
     elif s.isError:
@@ -207,11 +251,15 @@ proc getIdentNode(c: var TemplCtx, n: PNode): PNode =
       expectedKinds: {nkPostfix, nkPragmaExpr, nkIdent, nkAccQuoted}))
 
-
-proc isTemplParam(c: TemplCtx, n: PNode): bool {.inline.} =
+func isTemplParam(c: TemplCtx, n: PNode): bool {.inline.} =
   ## True if `n` is a parameter symbol of the current template.
-  result = n.kind == nkSym and n.sym.kind == skParam and
-           n.sym.owner == c.owner and sfTemplateParam in n.sym.flags
+  n.kind == nkSym and n.sym.kind == skParam and n.sym.owner == c.owner and
+    sfTemplateParam in n.sym.flags
+
+func definitionTemplParam(c: TemplCtx, n: PNode): bool {.inline.} =
+  ## True if `n` is an AST typed (`typed`/`untyped`) parameter symbol of the
+  ## current template
+  isTemplParam(c, n) and n.sym.typ.kind in {tyUntyped, tyTyped}
 
 proc semTemplBody(c: var TemplCtx, n: PNode): PNode
 
@@ -295,10 +343,10 @@ proc addLocalDecl(c: var TemplCtx, n: var PNode, k: TSymKind) =
     of nkError:
       n = ident
     else:
-      if not isTemplParam(c, ident):
-        c.toInject.incl(x.ident.id)
-      else:
+      if definitionTemplParam(c, ident):
         replaceIdentBySym(c.c, n, ident)
+      else:
+        c.toInject.incl(x.ident.id)
   else:
     var hasError = false
 
@@ -331,7 +379,9 @@ proc addLocalDecl(c: var TemplCtx, n: var PNode, k: TSymKind) =
     of nkError:
       n = ident
     else:
-      if not isTemplParam(c, ident):
+      if definitionTemplParam(c, ident):
+        replaceIdentBySym(c.c, n, ident)
+      else:
         if n.kind != nkSym:
           let local = newGenSym(k, ident, c)
 
@@ -346,8 +396,6 @@ proc addLocalDecl(c: var TemplCtx, n: var PNode, k: TSymKind) =
 
           if k == skParam and c.inTemplateHeader > 0:
             local.flags.incl sfTemplateParam
-      else:
-        replaceIdentBySym(c.c, n, ident)
 
     if hasError and n.kind != nkError:
       n = c.c.config.wrapError(n)
@@ -435,14 +483,14 @@ proc semRoutineInTemplBody(c: var TemplCtx, n: PNode, k: TSymKind): PNode =
 
     if ident.isError:
       n[namePos] = ident
-    elif not isTemplParam(c, ident):
+    elif definitionTemplParam(c, ident):
+      n[namePos] = ident
+    else:
       var s = newGenSym(k, ident, c)
       s.ast = n
       addPrelimDecl(c.c, s)
       styleCheckDef(c.c.config, n.info, s)
       n[namePos] = newSymNode(s, n[namePos].info)
-    else:
-      n[namePos] = ident
   else:
     n[namePos] = semRoutineInTemplName(c, n[namePos])
 
@@ -457,6 +505,9 @@ proc semRoutineInTemplBody(c: var TemplCtx, n: PNode, k: TSymKind): PNode =
     if n[i].isError:
       hasError = true
 
+  # xxx: special handling
+  # for templates within `untyped` output templates doesn't make sense, it's
+  # just untyped AST. For `typed` or `expression` templates they should be
+  # analysed.
   if k == skTemplate: inc(c.inTemplateHeader)
   n[paramsPos] = semTemplBody(c, n[paramsPos])
   if k == skTemplate: dec(c.inTemplateHeader)
diff --git a/doc/manual.rst b/doc/manual.rst
index 6dba7689d38d..bc788ec431a8 100644
--- a/doc/manual.rst
+++ b/doc/manual.rst
@@ -5364,9 +5364,9 @@ symbols by a `bind` statement inside `genericB`.
 Templates
 =========
 
-A template is a simple form of a macro: It is a simple substitution
-mechanism that operates on Nim's abstract syntax trees. It is processed in
-the semantic pass of the compiler.
+A template is a form of metaprogramming: a template call evaluates to a
+|Nimskull| abstract syntax tree that is substituted in place of the call. The
+evaluation and substitution are done during the semantic pass of the compiler.
 
 The syntax to *invoke* a template is the same as calling a procedure.
 
@@ -5386,10 +5386,32 @@ templates:
 | `a in b` is transformed into `contains(b, a)`.
 | `notin` and `isnot` have the obvious meanings.
 
-The "types" of templates can be the symbols `untyped`,
-`typed` or `typedesc`. These are "meta types", they can only be used in certain
-contexts. Regular types can be used too; this implies that `typed` expressions
-are expected.
+The "types" of templates can be the symbols `untyped`, `typed` or `typedesc`.
+These are "meta types", they can only be used in certain contexts. Regular
+types can be used too; this implies that `typed` expressions are expected.
+
+**Future directions**: the output type of a template is the output type of the
+template body, which itself can be thought of as an out parameter. Templates
+will be classified into two major categories: AST output (`untyped` and `typed`)
+and expression based (other types).
+Along with substitution positions (see below), template evaluation will be
+revised as follows:
+- `untyped` template: allow `typed` and `untyped` params in defining or
+  using positions; and all other params only in using positions
+- `typed` template: allow `typed` and `untyped` params in defining or using
+  positions; and all other params only in using positions
+- non-ast template: only allow substitution in the using positions
+The above direction describes the nuance that will be incorporated into a
+broader redesign of how templates work in |Nimskull|.
+
+Defining vs Using Positions
+---------------------------
+
+Substitution positions are places in the template body where template parameter
+substitution can take place. There are two substitution positions, definition
+and usage, also referred to as definitional/defining/define and using/use,
+respectively. A definitional position is any syntactic position intended to
+define new names (e.g.: routine, variable, parameter, type, field names), while
+usage positions are all other positions where an identifier is to be looked up.
 
 
 Typed vs untyped parameters
@@ -5418,6 +5440,10 @@ performed before the expression is passed to the template. This means that
 
   declareInt(x) # invalid, because x has not been declared and so it has no type
 
+`typed` and `untyped` parameters may appear in defining or using symbol
+positions, while all other parameters are only substituted for using symbol
+positions.
+
 A template where every parameter is `untyped` is called an `immediate`:idx:
 template.
 For historical reasons, templates can be explicitly annotated with an
 `immediate` pragma and then these templates do not take part in
diff --git a/tests/lang_callable/template/template_usage_substitution_nonast_typed_params.nim b/tests/lang_callable/template/template_usage_substitution_nonast_typed_params.nim
new file mode 100644
index 000000000000..302fbb4b5b78
--- /dev/null
+++ b/tests/lang_callable/template/template_usage_substitution_nonast_typed_params.nim
@@ -0,0 +1,25 @@
+discard """
+  description: '''
+Template parameters of non-AST type do not replace identifiers in new symbol
+definition positions. Meaning a template parameter that is not `untyped` or
+`typed` will not substitute for a matching identifier if defining things like
+variables, routines, parameters, types, fields, etc.
+'''
+"""
+
+block originally_this_did_not_work_now_it_does:
+  # this was kept for historical reasons and can be replaced; when this was an
+  # error it originated from https://github.com/nim-lang/nim/issues/3158
+  type
+    MyData = object
+      x: int
+
+  template newDataWindow(data: ref MyData): untyped =
+    proc testProc(data: ref MyData): string =
+      "Hello, " & $data.x
+    testProc(data)
+
+  var d: ref MyData
+  new(d)
+  d.x = 10
+  doAssert newDataWindow(d) == "Hello, 10"
\ No newline at end of file
diff --git a/tests/lang_callable/template/twrongsymkind.nim b/tests/lang_callable/template/twrongsymkind.nim
deleted file mode 100644
index 5fa6189145b4..000000000000
--- a/tests/lang_callable/template/twrongsymkind.nim
+++ /dev/null
@@ -1,20 +0,0 @@
-discard """
-  errormsg: "cannot use symbol of kind 'var' as a 'param'"
-  line: 20
-"""
-
-# bug #3158
-
-type
-  MyData = object
-    x: int
-
-template newDataWindow(data: ref MyData): untyped =
-  proc testProc(data: ref MyData) =
-    echo "Hello, ", data.x
-  testProc(data)
-
-var d: ref MyData
-new(d)
-d.x = 10
-newDataWindow(d)