Merge nim-lang#605
605: template non-ast param limited to lookup substitution r=zerbina a=saem

## Summary

* non-ast params (those that are neither `typed` nor `untyped`) are only
  substituted for usage positions
* where a usage position is any syntactic position where a symbol is
  not intended to be introduced
* in addition, the manual and semtempl module have some doc comments
  outlining the new direction for templates

## Details

Starting with a motivating example, this now works after the change:

```nim
template foo(data: int): untyped =
  proc bar(data: int): int =
    2 * data
  bar(data)
doAssert foo(2) == 4
```

This works because `data`, the template argument, has a non-ast type, so
it is not substituted into `data`, the proc param; the proc param is
instead converted into a symbol. This means that when the usage of `data`
in the proc body is looked up, it lexically resolves to the proc param.
Finally, when the usage of `data` as the call argument is looked up, the
template param is found and substituted.
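
For contrast, here is a minimal sketch of the same shape written so that it should not depend on the new rule at all. The parameter name `inner` and the wrapping `block` are assumptions made for illustration and are not part of this change. With distinct names there is no identifier shared between the template param and the proc param, so the only substitution left is the one in the usage position:

```nim
template foo(data: int): untyped =
  block:
    # `inner` (hypothetical name) cannot collide with the template param `data`
    proc bar(inner: int): int =
      2 * inner
    # usage position: the template param `data` is substituted here
    bar(data)

let doubled = foo(2)
doAssert doubled == 4
```

The `block` only keeps `bar` local to each expansion; it is a design choice of this sketch, not a requirement of the change.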

### Further Details

Template parameters that are neither `typed` nor `untyped` will no
longer substitute for definitional syntax positions, that is, syntax
positions that are meant to introduce routines, variables, types and
other named definitions. For non-ast typed parameters, substitution uses
lexical scope based lookups. AST typed template parameter substitutions
continue to work as is, since they're arbitrary symbolic substitutions.
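
As a hedged illustration of the two kinds of parameters side by side, the sketch below uses hypothetical names (`defineCounter`, `name`, `start`, `hits`) that do not appear in the change itself. The `untyped` parameter is substituted in a definition position, while the `int` parameter only appears in a usage position:

```nim
template defineCounter(name: untyped, start: int): untyped =
  # `name` is an AST (`untyped`) param: it may replace the identifier being
  # defined by the `var` section.
  # `start` is a non-ast param: it is only substituted where it is used.
  var name = start

defineCounter(hits, 5)
hits += 1
doAssert hits == 6
```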

This makes more sense given that we're substituting syntactic symbol
usage as opposed to syntactic symbol introductions; it also removes a
vector by which symbol pollution can take place.

As templates have lots of rough edges in their implementation, there are
still many bugs, but these are almost entirely pre-existing.

### Overview of Concepts Introduced:

This is a brief overview; see the module and docs for more info. This is
still developing, and these concepts mostly live in the docs for now:
* the concept of template substitution policy being set by the template
  output type has been introduced
* a template body is to be thought of as an out parameter with the same
  type as the template output
* additionally, a template body can be considered as a form of quasi
  quoting, where the type determines how we treat the substitution of
  various kinds of parameters (ast and non-ast types).
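
To make the two categories concrete, here is a small sketch under the assumption that an "expression template" is simply one whose output type is a regular type such as `int`. The `identity` template mirrors the one in the semtempl doc comments; `double` is a hypothetical name chosen for illustration:

```nim
# AST template: the output type is `untyped`, the body is an AST fragment.
template identity(arg: untyped): untyped =
  arg

# Expression template: the output type is `int`, the body is an expression
# of that type, and `x` is a non-ast param used only in a usage position.
template double(x: int): int =
  x * 2

doAssert identity(1 + 2) == 3
doAssert double(21) == 42
```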



Co-authored-by: saem <saemghani+github@gmail.com>
bors[bot] and saem authored Mar 28, 2023
2 parents 59a30b5 + aac2b0c commit 483726f
Showing 5 changed files with 129 additions and 49 deletions.
12 changes: 5 additions & 7 deletions compiler/sem/sem.nim
@@ -358,18 +358,16 @@ proc newSymG*(kind: TSymKind, n: PNode, c: PContext): PSym =
# and sfGenSym in n.sym.flags:
result = n.sym
if result.kind notin {kind, skTemp}:
# xxx: this happens because a macro, or possibly template, produces a
# mismatched symbol, if it's the compiler that's an outright bug.
# instead of logging it here, we need to ensure that macros API
# doesn't allow this to happen in the first place and/or detect this
# much earlier.
localReport(c.config, n.info, SemReport(
kind: rsemSymbolKindMismatch,
sym: result,
expectedSymbolKind: {kind}))

when false:
if sfGenSym in result.flags and result.kind notin {skTemplate, skMacro, skParam}:
# declarative context, so produce a fresh gensym:
result = copySym(result)
result.ast = n.sym.ast
put(c.p, n.sym, result)

# when there is a nested proc inside a template, semtmpl
# will assign a wrong owner during the first pass over the
# template; we must fix it here: see #909
81 changes: 66 additions & 15 deletions compiler/sem/semtempl.nim
@@ -9,6 +9,51 @@

# included from sem.nim

## The current implementation of templates might not yet conform to the
## description that follows.
##
## Template Basics
## ===============
##
## Given a basic template as follows:
## ..code::
##   template identity(arg: untyped): untyped =
##     arg
##
## The template's output type is `untyped`, meaning the body (`arg`) is treated
## as an `untyped` AST fragment, that substitution of parameters will evaluate
## per the rules of `untyped` templates, and finally that evaluation and
## insertion of the template at the callsite will be hygienic. The template
## parameter `arg` will be captured as `untyped`, meaning no attempt will be
## made to semantically analyse the parameter prior to substitution.
##
## Template Taxonomy
## =================
##
## There are at least four types of templates across two categories:
## - AST templates:
## - `untyped`
## - `dirty` a non-hygienic sub-variant
## - `typed`
## - expression templates (all types that are not `untyped` or `typed`)
##
## Substitution Positions
## ----------------------
## Templates are ultimately AST level constructs regardless of output type;
## even so, they follow the grammar. There are two types of positions in a
## template body, one is `definition` and the other is `usage`. A `definition`
## is any position where the grammar construct is intended to introduce a new
## symbol, i.e.: the name of a routine, including its parameters; names of
## variables (`const`, `let`, `var`), and so on. All other sites are `usage`
## sites, where a symbol or "chunk" of AST might be used.
##
## This is a draft of substitution rules:
## - `untyped` template bodies accept `typed` and `untyped` params in
##   definition or usage positions; and all other params are usage only
## - `typed` template bodies accept `typed` and `untyped` params in definition
##   or usage positions; and all other params are usage only
## - non-ast template bodies only allow substitutions within usage positions

discard """
hygienic templates:
@@ -164,6 +209,7 @@ proc replaceIdentBySym(c: PContext; n: var PNode, s: PNode) =

type
TemplCtx = object
## Context used during template definition evaluation
c: PContext
toBind, toMixin, toInject: IntSet
owner: PSym
@@ -175,7 +221,6 @@ type
proc getIdentNode(c: var TemplCtx, n: PNode): PNode =
## gets the ident node, will mutate `n` if it's an `nkPostfix` or
## `nkPragmaExpr` and there is an error (return an nkError).

case n.kind
of nkPostfix:
result = getIdentNode(c, n[1])
@@ -189,7 +234,6 @@ proc getIdentNode(c: var TemplCtx, n: PNode): PNode =
result = c.c.config.wrapError(n)
of nkIdent:
let s = qualifiedLookUp(c.c, n, {})

if s.isNil:
result = n
elif s.isError:
@@ -207,11 +251,15 @@ proc getIdentNode(c: var TemplCtx, n: PNode): PNode =
expectedKinds: {nkPostfix, nkPragmaExpr, nkIdent,
nkAccQuoted}))


proc isTemplParam(c: TemplCtx, n: PNode): bool {.inline.} =
func isTemplParam(c: TemplCtx, n: PNode): bool {.inline.} =
## True if `n` is a parameter symbol of the current template.
result = n.kind == nkSym and n.sym.kind == skParam and
n.sym.owner == c.owner and sfTemplateParam in n.sym.flags
n.kind == nkSym and n.sym.kind == skParam and n.sym.owner == c.owner and
sfTemplateParam in n.sym.flags

func definitionTemplParam(c: TemplCtx, n: PNode): bool {.inline.} =
## True if `n` is an AST typed (`typed`/`untyped`) parameter symbol of the
## current template
isTemplParam(c, n) and n.sym.typ.kind in {tyUntyped, tyTyped}

proc semTemplBody(c: var TemplCtx, n: PNode): PNode

@@ -295,10 +343,10 @@ proc addLocalDecl(c: var TemplCtx, n: var PNode, k: TSymKind) =
of nkError:
n = ident
else:
if not isTemplParam(c, ident):
c.toInject.incl(x.ident.id)
else:
if definitionTemplParam(c, ident):
replaceIdentBySym(c.c, n, ident)
else:
c.toInject.incl(x.ident.id)

else:
var hasError = false
@@ -331,7 +379,9 @@ proc addLocalDecl(c: var TemplCtx, n: var PNode, k: TSymKind) =
of nkError:
n = ident
else:
if not isTemplParam(c, ident):
if definitionTemplParam(c, ident):
replaceIdentBySym(c.c, n, ident)
else:
if n.kind != nkSym:
let local = newGenSym(k, ident, c)

@@ -346,8 +396,6 @@ proc addLocalDecl(c: var TemplCtx, n: var PNode, k: TSymKind) =

if k == skParam and c.inTemplateHeader > 0:
local.flags.incl sfTemplateParam
else:
replaceIdentBySym(c.c, n, ident)

if hasError and n.kind != nkError:
n = c.c.config.wrapError(n)
@@ -435,14 +483,14 @@ proc semRoutineInTemplBody(c: var TemplCtx, n: PNode, k: TSymKind): PNode =

if ident.isError:
n[namePos] = ident
elif not isTemplParam(c, ident):
elif definitionTemplParam(c, ident):
n[namePos] = ident
else:
var s = newGenSym(k, ident, c)
s.ast = n
addPrelimDecl(c.c, s)
styleCheckDef(c.c.config, n.info, s)
n[namePos] = newSymNode(s, n[namePos].info)
else:
n[namePos] = ident
else:
n[namePos] = semRoutineInTemplName(c, n[namePos])

@@ -457,6 +505,9 @@ proc semRoutineInTemplBody(c: var TemplCtx, n: PNode, k: TSymKind): PNode =
if n[i].isError:
hasError = true

# xxx: special handling for templates within `untyped` output templates
# doesn't make sense, it's just untyped AST. For `typed` or `expression`
# templates they should be analysed.
if k == skTemplate: inc(c.inTemplateHeader)
n[paramsPos] = semTemplBody(c, n[paramsPos])
if k == skTemplate: dec(c.inTemplateHeader)
40 changes: 33 additions & 7 deletions doc/manual.rst
@@ -5364,9 +5364,9 @@ symbols by a `bind` statement inside `genericB`.
Templates
=========

A template is a simple form of a macro: It is a simple substitution
mechanism that operates on Nim's abstract syntax trees. It is processed in
the semantic pass of the compiler.
A template is a form of metaprogramming: a template call evaluates to a
|Nimskull| abstract syntax tree that is substituted in place of the call. The
evaluation and substitution is done during semantic pass of the compiler.

The syntax to *invoke* a template is the same as calling a procedure.

@@ -5386,10 +5386,32 @@ templates:
| `a in b` is transformed into `contains(b, a)`.
| `notin` and `isnot` have the obvious meanings.
The "types" of templates can be the symbols `untyped`,
`typed` or `typedesc`. These are "meta types", they can only be used in certain
contexts. Regular types can be used too; this implies that `typed` expressions
are expected.
The "types" of templates can be the symbols `untyped`, `typed` or `typedesc`.
These are "meta types", they can only be used in certain contexts. Regular
types can be used too; this implies that `typed` expressions are expected.

**Future directions**: the output type of a template is the output type of the
template body, which itself can be thought of as an out parameter. Templates
will be classified into two major categories: AST output (`untyped` and
`typed`) and expression based (other types). Along with substitution positions
(see below), template evaluation will be revised as follows:

- `untyped` template: allow `typed` and `untyped` params in defining or
  using positions; and all other params only in using positions
- `typed` template: allow `typed` and `untyped` params in defining or using
  positions; and all other params only in using positions
- non-ast template: only allow substitution in the using positions

The above direction describes the nuance that will be incorporated into a
broader redesign of how templates work in |Nimskull|.

Defining vs Using Positions
---------------------------

Substitution positions are places in the template body where template parameter
substitution can take place. There are two substitution positions: definition
and usage, also referred to as definitional/defining/define and using/use,
respectively. A definitional position is any syntactic position intended to
define new names (e.g.: routine, variable, parameter, type, field names), while
usage positions are all other positions where an identifier is to be looked up.


Typed vs untyped parameters
@@ -5418,6 +5440,10 @@ performed before the expression is passed to the template. This means that
declareInt(x) # invalid, because x has not been declared and so it has no type
`typed` and `untyped` parameters may appear in defining or using symbol
positions, while all other parameters are only substituted for using symbol
positions.

A template where every parameter is `untyped` is called an `immediate`:idx:
template. For historical reasons, templates can be explicitly annotated with
an `immediate` pragma and then these templates do not take part in
@@ -0,0 +1,25 @@
discard """
description: '''
Template parameters of non-AST type do not replace identifiers in new symbol
definition positions. Meaning a template parameter that is not `untyped` or
`typed` will not substitute for a matching identifier if defining things like
variables, routines, parameters, types, fields, etc.
'''
"""

block originally_this_did_not_work_now_it_does:
# this was kept for historical reasons and can be replaced, when this was an
# error it originated from https://github.com/nim-lang/nim/issues/3158
type
MyData = object
x: int

template newDataWindow(data: ref MyData): untyped =
proc testProc(data: ref MyData): string =
"Hello, " & $data.x
testProc(data)

var d: ref MyData
new(d)
d.x = 10
doAssert newDataWindow(d) == "Hello, 10"
20 changes: 0 additions & 20 deletions tests/lang_callable/template/twrongsymkind.nim

This file was deleted.
