Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Declarative macro repetition counts #28

Closed
markbt opened this issue Jun 24, 2020 · 6 comments · Fixed by #45
Closed

Declarative macro repetition counts #28

markbt opened this issue Jun 24, 2020 · 6 comments · Fixed by #45
Assignees
Labels
charter-needed A project proposal that has been approved as a new project group, but no charter has been written. major-change Major change proposal T-lang

Comments

@markbt
Copy link
Contributor

markbt commented Jun 24, 2020

WARNING

The Major Change Process was proposed in RFC 2936 and is not yet in
full operation. This template is meant to show how it could work.

Proposal

Summary

Add syntax to declarative macros to allow determination of the number of metavariable repetitions.

Motivation, use-cases, and solution sketches

Macros with repetitions often expand to code that needs to know or could
benefit from knowing how many repetitions there are. Consider the standard
sample macro to create a vector, recreating the standard library vec! macro:

macro_rules! myvec {
    ($($value:expr),* $(,)?) => {
        {
            let mut v = Vec::new();
            $(
                v.push($value);
            )*
            v
        }
    };
}

This would be more efficient if it could use Vec::with_capacity to
preallocate the vector with the correct length. However, there is no standard
facility in declarative macros to achieve this.

There are various ways to work around this limitation. Some common approaches
that users take are listed below, along with some of their drawbacks.

Use recursion

Use a recursive macro to calculate the length.

macro_rules! count_exprs {
    () => {0usize};
    ($head:expr, $($tail:expr,)*) => {1usize + count_exprs!($($tail,)*)};
}

macro_rules! myvec {
    ($(value:expr),* $(,)?) => {
        {
            let size = count_exprs!($($value,)*);
            let mut v = Vec::with_capacity(size);
            $(
                v.push($value);
            )*
            v
        }
    };
}

Whilst this is among the first approaches that a novice macro programmer
might take, it is also the worst performing. It rapidly hits the recursion
limit, and if the recursion limit is raised, it takes more than 25 seconds to
compile a sequence of 2,000 items. Sequences of 10,000 items can crash
the compiler with a stack overflow.

Generate a sum of 1s

This example is courtesy of @dtolnay.
Create a macro expansion that results in an expression like 0 + 1 + ... + 1.
There are various ways to do this, but one example is:

macro_rules! myvec {
    ( $( $value:expr ),* $(,)? ) => {
        {
            let size = 0 { $( + { stringify!(value); 1 } ) };
            let mut v = Vec::with_capacity(size);
            $(
                v.push($value);
            )*
            v
        }
    };
}

This performs better than recursion, however large numbers of items still
cause problems. It takes nearly 4 seconds to compile a sequence of 2,000
items. Sequences of 10,000 items can still crash the compiler with a stack
overflow.

Generate a slice and take its length

This example is taken from
[https://danielkeep.github.io/tlborm/book/blk-counting.html]. Create a macro
expansion that results in a slice of the form [(), (), ... ()] and take its
length.

macro_rules! replace_expr {
    ($_t:tt $sub:expr) => {$sub};
}

macro_rules! myvec {
    ( $( $value:expr ),* $(,)? ) => {
        {
            let size = <[()]>::len(&[$(replace_expr!(($value) ())),*]);
            let mut v = Vec::with_capacity(size);
            $(
                v.push($value);
            )*
            v
        }
    };
}

This is more efficient, taking less than 2 seconds to compile 2,000 items,
and just over 6 seconds to compile 10,000 items.

Discoverability

Just considering the performance comparisons misses the point. While we
can work around these limitations with carefully crafted macros, for a
developer unfamiliar with the subtleties of macro expansions it is hard
to discover which is the most efficient way.

Furthermore, whichever method is used, code readability is harmed by the
convoluted expressions involved.

Proposal

The compiler already knows how many repetitions there are. What is
missing is a way to obtain it.

I propose we add syntax to allow this to be expressed directly:

macro_rules! myvec {
    ( $( $value:expr ),* $(,)? ) => {
        {
            let mut v = Vec::with_capacity($#value);
            $(
                v.push($value);
            )*
            v
        }
    };
}

The new "metavariable count" expansion $#ident expands to a literal
number equal to the number of times ident would be expanded at the depth
that it appears.

A prototype implementation indicates this compiles a 2,000 item sequence
in less than 1s, and a 10,000 item sequence in just over 2s.

Nested repetitions

In the case of nested repetitions, the value depends on the depth of the
metavariable count expansion, where it expands to the number of repetitions
at that level.

Consider a more complex nested example:

macro_rules! nested {
    ( $( { $( { $( $x:expr ),* } ),* } ),* ) => {
        {
            println!("depth 0: {} repetitions", $#x);
            $(
                println!("  depth 1: {} repetitions", $#x);
                $(
                    println!("    depth 2: {} repetitions", $#x);
                    $(
                        println!("      depth 3: x = {}", $x);
                    )*
                )*
            )*
        }
    };
}

And given a call of:

   nested! { { { 1, 2, 3, 4 }, { 5, 6, 7 }, { 8, 9 } },
             { { 10, 11, 12 }, { 13, 14 }, { 15 } } };

This program will print:

depth 0: 2 repetitions
  depth 1: 3 repetitions
    depth 2: 4 repetitions
      depth 3: x = 1
      depth 3: x = 2
      depth 3: x = 3
      depth 3: x = 4
    depth 2: 3 repetitions
      depth 3: x = 5
      depth 3: x = 6
      depth 3: x = 7
    depth 2: 2 repetitions
      depth 3: x = 8
      depth 3: x = 9
  depth 1: 3 repetitions
    depth 2: 3 repetitions
      depth 3: x = 10
      depth 3: x = 11
      depth 3: x = 12
    depth 2: 2 repetitions
      depth 3: x = 13
      depth 3: x = 14
    depth 2: 1 repetitions
      depth 3: x = 15

Alternative designs

The macro could expand to a usize literal (e.g. 3usize) rather than just
a number literal. This matches what the number is internally in the compiler,
and may help with type inferencing, but it would prevent users using
stringify!($#x) to get the number as a string.

In its simplest form, this only expands to the repetition count for a single level of nesting.
In the example above, if we wanted to know the total count of
repetitions (i.e., 15), we would be unable to do so easily. There are a
couple of alternatives we could use for this:

  • $#var could expand to the total count, rather than the count at the
    current level. But this would make it hard to find the count at a particular
    level, which is also useful.

  • We could use the number of '#' characters to indicate the number of depths
    to sum over. In the example above, at the outer-most level, $#x expands
    to 2, $##x expands to 6, and $###x expands to 15.

The syntax being proposed is specific to counting the number of repetitions of
a metavariable, and isn't easily extensible to future ideas without more
special syntax. A more general form might be:

   ${count(ident)}

In this syntax extension, ${ ... } in macro expansions would contain
metafunctions that operate on the macro's definition itself. The syntax could
then be extended by future RFCs that add new metafunctions. Metafunctions
could take additional arguments, so the alternative to count repetitions at
multiple depths ($##x above) could be represented as ${count(x, 2)}.

There's nothing to preclude this being a later step, in which case $#ident
would become sugar for ${count(ident)}.

Links and related work

See https://danielkeep.github.io/tlborm/book/blk-counting.html for some common workarounds.

Declarative macros with repetition are commonly used in Rust for things that
are implemented using variadic functions in other languages.

  • In Java, variadic arguments are passed as an array, so the array's .length
    attribute can be used.

  • In dynamically-typed languages like Perl and Python, variadic arguments are
    passed as lists, and the usual length operations can be used there, too.

The syntax is similar to obtaining the length of strings and arrays in Bash:

string=some-string
array=(1 2 3)
echo ${#string}  # 11
echo ${#array[@]}  # 3

Similarly, the variable $# contains the number of arguments in a function.

The Major Change Process

Once this MCP is filed, a Zulip topic will be opened for discussion. Ultimately, one of the following things can happen:

  • If this is a small change, and the team is in favor, it may be approved to be implemented directly, without the need for an RFC.
  • If this is a larger change, then someone from the team may opt to work with you and form a project group to work on an RFC (and ultimately see the work through to implementation).
  • Alternatively, it may be that the issue gets closed without being accepted. This could happen because:
    • There is no bandwidth available to take on this project right now.
    • The project is not a good fit for the current priorities.
    • The motivation doesn't seem strong enough to justify the change.

You can read [more about the lang-team MCP process on forge].

Comments

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.

@markbt markbt added T-lang major-change Major change proposal labels Jun 24, 2020
@rustbot
Copy link
Collaborator

rustbot commented Jun 24, 2020

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.

@programmerjake
Copy link
Member

One concern is that the $#var syntax conflicts with the quote! macro's syntax for substituting a variable, which is #var.

@Lokathor
Copy link

Can you do ## or something to get an "escaped" hash into the output?

@programmerjake
Copy link
Member

Can you do ## or something to get an "escaped" hash into the output?

not according to the docs for quote!

@pnkfelix
Copy link
Member

pnkfelix commented Jun 29, 2020

Just as a heads up: The essential idea proposed here was previously proposed (ages ago, prior to Rust 1.0) in rust-lang/rfcs#88. There may be value in double-checking the contents of that RFC PR and/or its associated comment thread.

(Update: I will put more specific feedback on this proposal, as informed by rust-lang/rfcs#88, directly on the zulip stream itself.)

@nikomatsakis
Copy link
Contributor

Hey @rust-lang/lang -- we haven't seen anyone volunteering to serve as liaison for this project. If we're not able to find a liaison in the next week or so, I'm going to close the issue. If you think you might like to serve as a liaison, even if you don't have bandwidth right now, please do speak up -- we can always put the proposal on the "deferred list" to pick up when we have time.

@nikomatsakis nikomatsakis added the charter-needed A project proposal that has been approved as a new project group, but no charter has been written. label Jul 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
charter-needed A project proposal that has been approved as a new project group, but no charter has been written. major-change Major change proposal T-lang
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants