Lambdas that modify their closure: &$* #833

Closed
wolfseifert opened this issue Nov 15, 2023 · 22 comments

Comments

@wolfseifert

When porting lambdas that modify their closure from Cpp1

#include <iostream>
using namespace ::std;
int main() {

  auto x = 5;
  auto y = 7;

  auto v = [&] {
    auto z = x;
    x = y;
    y = z;
  };

  cout << "x = " << x << ", y = " << y << endl;
  v();
  cout << "x = " << x << ", y = " << y << endl;
}

to Cpp2 the syntax gets pretty ugly

main: () = {
  using namespace ::std;

  x := 5;
  y := 7;

  v := :() = {
    z := x$;
    x&$* = y$;
    y&$* = z;
  };

  cout << "x = (x)$, y = (y)$" << endl;
  v();
  cout << "x = (x)$, y = (y)$" << endl;
}

This &$* was the only way I found to make it work.
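
For readers coming from Cpp1, here is a rough sketch of what the captures in the Cpp2 version amount to: `x$` copies `x` into the closure when the function expression is created, while `x&$` captures the address of `x` by value and the trailing `*` dereferences it on use. The function name `swap_once` and the capture names are hypothetical, and the code cppfront actually emits differs in detail.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: mirrors the Cpp2 example with explicit init-captures.
std::pair<int, int> swap_once() {
    int x = 5;
    int y = 7;

    // x$  ~ x_copy = x  (value captured when the closure is created)
    // x&$ ~ px = &x     (pointer captured by value); the trailing * dereferences it
    auto v = [x_copy = x, y_copy = y, px = &x, py = &y] {
        int z = x_copy;
        *px = y_copy;
        *py = z;
    };

    v();
    assert(x == 7 && y == 5);
    return {x, y};
}
```

Because `x_copy` and `y_copy` are snapshots, a second call to `v()` would write the same values again rather than swap back, which differs from the `[&]` original.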

When doing the same for nested lambdas

#include <iostream>
using namespace ::std;
int main() {

  auto x = 5;
  auto y = 7;

  auto v = [&] {
    auto w = [&] {
      auto z = x;
      x = y;
      y = z;
    };
    w();
  };

  cout << "x = " << x << ", y = " << y << endl;
  v();
  cout << "x = " << x << ", y = " << y << endl;
}

it gets even worse

main: () = {
  using namespace ::std;

  x := 5;
  y := 7;

  v := :() = {
    x2 := x&$;
    y2 := y&$;
    w := :() = {
      z := x2$*;
      x2$* = y2$*;
      y2$* = z;
    };
    w();
  };

  cout << "x = (x)$, y = (y)$" << endl;
  v();
  cout << "x = (x)$, y = (y)$" << endl;
}

This nested lambda example may seem artificial, but it appeared when porting a real Cpp1 program over to Cpp2.

Is there any better syntax available for this? Am I missing something?

@msadeqhe

That's it. It's the syntax of capture by pointer in Cpp2.

This topic is related to issue #247.

@AbhinavK00

Cpp2 should have a feature for reference capture; maybe captures could even be defined in terms of the parameter passing conventions. I don't know why Herb didn't want to support capture by reference. Maybe it was because of safety concerns, but the current idiom isn't any safer.

@SebastianTroy

SebastianTroy commented Nov 15, 2023 via email

@JohelEGP
Contributor

I know a lot of people are pushing for terser and terser syntax at the moment, but for the most part, C++ is a complex language, not a scripting language, and I'd rather type a few more characters if it added nice signposts to my code, allowing me to parse it visually more easily, and flag important things where they are happening.

We can have the best of both worlds.
The lack of ceremony also means that what is relevant stands out more.
I'm a fan of :(x) x + y$, as the parameterized nature of the expression stands out more.

Also, post(size() == size()$ + 1) is fantastic.
You don't need to give size() a name and then use it.
And the o.type()$ in sfml_types.std::ranges::any_of(:(x) o.type()$.starts_with(x)) (#789 (reply in thread))
is performant; it is captured once and used N times.

@SebastianTroy

SebastianTroy commented Nov 15, 2023 via email

@JohelEGP
Contributor

As to your mention of performance, I don't see how postfix $ is changing performance in your example, it is identical to cpp1 [t = o.type()], and so would perform the same right?

That's right.
But you have to give it a name far from its use,
which isn't necessarily an improvement.
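
To make the "captured once and used N times" point concrete, here is a minimal Cpp1 sketch using the init-capture `[t = o.type()]` mentioned above. The `Object` type, its call counter, and the prefix list are hypothetical stand-ins, not taken from the linked discussion.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for `o`: counts how often type() is evaluated.
struct Object {
    mutable int calls = 0;
    std::string type() const { ++calls; return "sf::Vector2f"; }
};

// Returns how many times type() ran while all prefixes were checked.
int count_type_calls() {
    Object o;
    std::vector<std::string> prefixes{"std::", "ns::", "sf::"};

    // Init-capture: o.type() is evaluated once, when the closure is created,
    // not once per element visited by any_of.
    bool found = std::any_of(prefixes.begin(), prefixes.end(),
        [t = o.type()](const std::string& x) { return t.rfind(x, 0) == 0; });

    assert(found);
    return o.calls;
}
```

`count_type_calls()` returns 1 here even though the predicate runs three times. The Cpp2 spelling `o.type()$` gets the same evaluation count without having to name `t`.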

@JohelEGP
Contributor

  v := :() = {
    x2 := x&$;
    y2 := y&$;
    w := :() = {
      z := x2$*;
      x2$* = y2$*;
      y2$* = z;
    };
    w();
  };

Capture the function expression instead (https://cpp2.godbolt.org/z/zne45Pb6o):

main: () = {
  using namespace ::std;

  x := 5;
  y := 7;

  v := :() = {
    w := :() = {
      z := x&$*;
      x&$* = y&$*;
      y&$* = z;
    }$;
    w();
  };

  cout << "x = (x)$, y = (y)$" << endl;
  v();
  cout << "x = (x)$, y = (y)$" << endl;
}
Program returned: 0
x = 5, y = 7
x = 7, y = 5
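
In Cpp1 terms, capturing the function expression roughly means that the inner closure is created where `x` and `y` are still in scope and is then stored by value inside the outer closure. The sketch below is one way to picture that. The name `nested_swap` and the capture names are hypothetical, and this is not the exact code cppfront generates.

```cpp
#include <cassert>
#include <utility>

std::pair<int, int> nested_swap() {
    int x = 5;
    int y = 7;

    // The inner closure is built in the scope of x and y, so it can capture
    // their addresses directly. The outer closure then holds it by value,
    // which is the role of the trailing $ after the inner function expression.
    auto v = [w = [px = &x, py = &y] {
                  int z = *px;
                  *px = *py;
                  *py = z;
              }] {
        w();
    };

    v();
    assert(x == 7 && y == 5);
    return {x, y};
}
```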

@JohelEGP
Contributor

Also this limits capture to copy only, or at least be defined by the author, rather than allow the compiler to decide / optimise.

I recall reading the same comment on Cpp1 lambdas.
As Cpp2 demonstrates, explicit is better than implicit.
There are examples in Cpp1 where "leaving it to the optimizer" has not proven itself for ranges of users.
I recall TDEH and HALO.
Please, share the others.

@SebastianTroy

SebastianTroy commented Nov 15, 2023 via email

@JohelEGP
Contributor

You've just referenced two cases where the optimizer handles something, as proof that the optimizer isn't always useful?

As proof that relying on optimizations to make a feature viable in contexts where performance matters isn't a recipe for success.
Another example is std::ranges with its runtime performance and debugging experience (https://github.com/tcbrindle/flux improves on this area).

I'm making the same argument for capture that has been made for parameter passing. I the author want to program my intent, not fill my code with symbols to handle the exact type shenanigans required to incant it...

OK, that's much better.
In that case, I can get behind the idea of :(x, in y$ = someReallyLongName) x + y / y ^ y.

@JohelEGP
Contributor

I'm making the same argument for capture that has been made for parameter passing. I the author want to program my intent, not fill my code with symbols to handle the exact type shenanigans required to incant it...

OK, that's much better.
In that case, I can get behind the idea of :(x, in y$ = someReallyLongName) x + y / y ^ y.

We already have the syntax in Cpp2.
We just need to make the terse forms the default.
See https://cpp2.godbolt.org/z/7aPcooxjG:

main: () = {
  i := 1;
  {  f := :(x) -> _ = { (j := i$) return x + j; };  assert(f(2) == 3);  }
//{  f := :(x)          (j := i$)        x + j;     assert(f(2) == 3);  }
//{  f := :(x) (j := i$) x + j;                     assert(f(2) == 3);  }
  {  f := :() = { (inout j := i&$*) { j++; j++; } };  f();  assert(i == 3);  i = 1;  }
//{  f := :()     (inout j := i&$*) { j++; j++; };    f();  assert(i == 3);  i = 1;  }
//{  f := :() (inout j := i&$*) {
//          j++;
//          j++;
//        };                                          f();  assert(i == 3);  i = 1;  }
  {  f := :() -> forward _ = { (inout j := i&$*) { j++; return j++; } };  assert(f()& == i&);  assert(i == 3);  i = 1;  }
//{  f := :() -> forward _     (inout j := i&$*) { j++; return j++; };    assert(f()& == i&);  assert(i == 3);  i = 1;  }
//{  f := :() -> forward _ (inout j := i&$*) {
//          j++;
//          return j++;
//        };                                                              assert(f()& == i&);  assert(i == 3);  i = 1;  }
}
  • :(x) -> _ = { (j := i$) return x + j; } can be defaulted to
    :(x) (j := i$) x + j;, and formatted it is
    :(x) (j := i$) x + j;.
  • :() = { (inout j := i&$*) { j++; j++; } } can be defaulted to
    :() (inout j := i&$*) { j++; j++; }, and formatted it is
    :() (inout j := i&$*) {
      j++;
      j++;
    };.
  • :() -> forward _ = { (inout j := i&$*) { j++; return j++; } } can be defaulted to
    :() -> forward _ (inout j := i&$*) { j++; return j++; }, and formatted it is
    :() -> forward _ (inout j := i&$*) {
      j++;
      return j++;
    };.

@JohelEGP
Contributor

That formulation doesn't really work for copy parameters.
You make a copy of the actual capture each invocation.
That means that you can actually modify it (i.e., it isn't a member of the function expression object).
See https://cpp2.godbolt.org/z/Y7a3d5boE:

  {  f := :(x) -> _ = { (copy j := i$) return x + j++; };  assert(f(2) == 4);  assert(f(2) == 4);  }
  {  f := :(x)                                x + i$++;    _ = f;                                  }
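
A Cpp1 sketch of why both asserts on the first line hold: the capture lives in the closure, but `j` is a fresh copy of it on every call, so the increment never persists. Note that Cpp2's postfix `j++` lowers to Cpp1's prefix `++j`, as the generated code later in this thread shows. The function name is hypothetical.

```cpp
#include <cassert>

// Returns true if two consecutive calls produce the same result.
bool copy_parameter_resets_each_call() {
    int i = 1;

    // [i] is the capture; j is re-initialized from it on every invocation.
    auto f = [i](int x) {
        int j = i;
        return x + ++j;  // Cpp2 `j++` corresponds to Cpp1 `++j`
    };

    int first = f(2);
    int second = f(2);
    assert(first == 4 && second == 4);
    return first == second;
}
```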

@JohelEGP
Contributor

Perhaps it'd be better to just allow function expressions to be parameterized just like blocks.
But those parameters appertain to the function expression object.
Note that "function" already implies a function parameter list.
A function expression is a value, and its parameters are part of that value.
So the parameters of a function (which might be an expression)
are different from the parameters of a function expression.
See https://cpp2.godbolt.org/z/PG4s7GxYY:

main: () = {
  i := 1;

  {  f := (copy j := i) :(x) x + j;  assert(f(2) == 3);  }
  {  f := (     j := i) :(x) x + j;  assert(f(2) == 3);  }

  {  f := (inout j := i) :()              = { j++;        j++; };         f();          assert(i == 3);  i = 1;  }
  {  f := (inout j := i) :() -> forward _ = { j++; return j++; };  assert(f()& == i&);  assert(i == 3);  i = 1;  }
}
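
For comparison, the proposed function-expression parameters read much like Cpp1 init-captures. The sketch below assumes that `copy` maps to a value init-capture and `inout` to a reference init-capture. That mapping is an interpretation of the proposal, not something cppfront implements, and the function name is hypothetical.

```cpp
#include <cassert>

int function_expression_parameters() {
    int i = 1;

    // (copy j := i) :(x) x + j   ~   value init-capture
    auto f = [j = i](int x) { return x + j; };
    assert(f(2) == 3);

    // (inout j := i) :() = { j++; j++; }   ~   reference init-capture
    auto g = [&j = i] { ++j; ++j; };
    g();
    assert(i == 3);
    i = 1;

    // (inout j := i) :() -> forward _ = { j++; return j++; }   ~   returns int&
    auto h = [&j = i]() -> int& { ++j; return ++j; };
    assert(&h() == &i);
    assert(i == 3);

    return i;
}
```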

@AbhinavK00

This one is a big change, so I'd really like to hear what Herb has to say, what his rationale was behind the original decision, and whether he thinks the changes proposed here are worth it.

But since it's the syntax talk, I'll just propose something.
This is something from Hylo. To mark mutation in Hylo, they use a prefixed ampersand like &var.
So, a simple way to signal an inout capture could be something like var&$ or var$& to signify mutation or address. I know this isn't very far off from the current status quo but it can be explained in a different way.

Also, this idea was originally for marking inout call-sites but now the keyword inout is just used there, so I'll just throw it out there.
To mark inout arguments, use &

x := 9;
f(x&, 8);  //first argument is inout
//works very well with UFCS
x&.f(3);
//instead of the current
(inout x).f(7);

Making mutation explicit helps and I just think it's a great thing that Hylo has.

Note: As for the symbol used, either another symbol could be used (~ comes to mind) or cpp2 can have something different for taking address of a variable (maybe just recommend using std::addressof) or have something similar to sizeof (like addressof or addr) that just lowers to (preferably) std::addressof.

Sorry as this got off-topic but just wanted to throw it out there.

@JohelEGP
Contributor

Or just take the hit of having one declaration per capture, with pointers for inout (https://cpp2.godbolt.org/z/qTbe5fEer):

main: () = {
  i := 1;
  _ = :() -> _ = {
    j := i&$;
    k := i$;
    return j*++ + k;
  };
}

But that is sub-optimal, as you make copies of the captures:

auto main() -> int{
  auto i {1};
  static_cast<void>([_0 = (&i), _1 = std::move(i)]() -> auto{
    auto j {_0};
    auto k {_1};
    return ++*cpp2::assert_not_null(std::move(j)) + std::move(k);
  });
}

A block parameter isn't better (https://cpp2.godbolt.org/z/36Esj66s9):

main: () = {
  i := 1;
  (j := i&)
    _ = :() -> _ = {
      k := i$;
      return j$*++ + k;
    };
}

auto main() -> int{
  auto i {1}; 
{
auto const& j = &i;

    static_cast<void>([_0 = std::move(i), _1 = j]() -> auto{
      auto k {_0}; 
      return ++*cpp2::assert_not_null(_1) + std::move(k); 
    });
}
}

I think this argues in favor of #833 (comment)
to just do the right, efficient thing.

@msadeqhe

msadeqhe commented Nov 16, 2023

@JohelEGP I think having one declaration per capture as you have suggested, is the right way.

Cpp2 compiler can optimize and avoid the copy by making them a real alias.

In a short lambda body, the programmer can directly write &$* or $ in any expression:

var1: = 10;

call(: () = {
    print(var1&$* + var1&$*);
});

But in a long lambda body, the programmer can optionally make aliases to captures:

var1: = 10;

call(: () = {
    // We can use the same variable name `var1`, because it's in a different scope.
    var1: = var1&$*;

    print(var1 + var1);
});

Every var1 in the lambda body will be replaced (similar to a macro) with var1&$*, hence they are aliases.

If this optimization is not acceptable as the programmer expects var1 to have its own memory, a new syntax can be used to create aliases. But variable declaration with == is constexpr, therefore it cannot be used in this case. Is that right?
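
In Cpp1 terms, the alias described here behaves like a local reference bound to the dereferenced capture: every use goes to the original object and no copy is made. The sketch below shows only that Cpp1 analogue, under the assumption that this is the intended semantics. The names are hypothetical.

```cpp
#include <cassert>

int alias_inside_body() {
    int var1 = 10;

    auto body = [p = &var1] {
        int& alias = *p;  // refers to the original var1, no copy
        alias += 1;       // visible outside the closure
        return alias + alias;
    };

    int sum = body();
    assert(var1 == 11);
    return sum;
}
```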

@msadeqhe

msadeqhe commented Nov 16, 2023

For the alias syntax (in a way that the new name will be replaced with the old name, similar to a macro):

abc: namespace  alias = some::long::name;
v32: type       alias = std::vector<i32>;
fnc: (x) -> i32 alias = /* any expression */;
var: i32        alias = /* any expression */;

alias is a contextual keyword. alias declarations are similar to namespace, type, function and variable declarations, except that when we use them, they are replaced with their definition, similar to a macro.

For example:

var1: = 10;
// The type is optional in variable alias declarations.
var2: alias = var1 + 10;

print(var2);
// It generates the following Cpp1 code:
// print(var1 + 10);

Or something similar to this approach, to avoid the copy when we declare a variable to capture.

@msadeqhe

msadeqhe commented Nov 16, 2023

abc: namespace  alias = some::long::name;
v32: type       alias = std::vector<i32>;
fnc: (x) -> i32 alias = /* any expression */;
var: i32        alias = /* any expression */;

In this example, function aliases are simply Forced Inline Functions in terms of Cpp1.

And variable aliases are like function aliases without parameters.
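
One way to picture a variable alias as a "function alias without parameters" in today's Cpp1 is a nullary lambda that re-evaluates its expression at each use. This is only an analogy for the proposal (the `alias` keyword does not exist), and the names below are hypothetical.

```cpp
#include <cassert>

int variable_alias_as_function() {
    int var1 = 10;

    // Stand-in for `var2: alias = var1 + 10;`: re-evaluated at each use.
    auto var2 = [&var1] { return var1 + 10; };
    assert(var2() == 20);

    var1 = 1;       // later uses of the alias see the new value
    return var2();
}
```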

@JohelEGP
Contributor

@JohelEGP I think having one declaration per capture as you have suggested, is the right way.

Cpp2 compiler can optimize and avoid the copy by making them a real alias.

For some context, we're talking about these declarations j and k:

  i := 1;
  _ = :() -> _ = {
    j := i&$;
    k := i$;
    return j*++ + k;
  };

I did consider this, and thought of two things.

First, the need to prove that it doesn't make a performance difference.
It certainly does for k, which is initialized from the capture i$ each call.

Second, the compiler could recognize that up-front variables are captures.
But that's a semantic change without a change in syntax.
Moving the declaration of k after some non-declaration statement still copies each call.

That said, your comment made me think that the status-quo might be fine.
So today you can continue to write (https://cpp2.godbolt.org/z/aMKnz8PPE):

main: () = {
  i := 1;
  (k := i)
    _ = :() -> _ = {
      j := i&$;
      return j*++ + k$;
    };
  (k := i, j := i&)
    _ = :() -> _ = {
      return j$*++ + k$;
    };
}
  • Alias the variables before capturing them.
  • Rely on compiler optimizations for variable declarations in the body.

However, I still think (copy k := i, j := i&) :() is strictly superior (#833 (comment)):

  • SWYM.
  • No awkward distance between the block parameters and the function expression
    which might be necessary due to the shape of the code.
  • Only this one handles copies of complex expressions well.
    Unless you're OK with writing k := :() long*.expression()$; outside and k$() inside the function expression.

It looks to me that not having a dedicated place for declaring captures
can lead you to look for workarounds that might not work as you expect.
Especially for copies of complex expressions.
You might want to use those more than once in the body.
The only way to avoid copying twice is as with k$() above.

Copies become more important once you can have this parameters on function expressions (https://cpp1.godbolt.org/z/sPhTb9ahr).
So I think we might really need a solution like #833 (comment).

I certainly like the current status of capturing at the point of use.

@filipsajdak
Contributor

My favorite grawlix idiom

@wolfseifert
Author

Closing in favour of #247.

@JohelEGP
Contributor

JohelEGP commented Dec 7, 2023

That formulation doesn't really work for copy parameters.
You make a copy of the actual capture each invocation.
-- #833 (comment).

This is no longer the case since commit 4bd0c04.
Though I have yet to update to use it, it seems like a promising change in conceptual model.

my conceptual model of captures is actually closer to them being additional local variables that are "static" (persistent across calls), and local variables are non-const by default.
-- 4bd0c04#commitcomment-134056070.
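
The quoted model, captures as non-const local variables that persist across calls, resembles a Cpp1 `mutable` lambda with an init-capture. The sketch below illustrates that reading only. It does not claim to show what cppfront emits after the commit, and the function name is hypothetical.

```cpp
#include <cassert>

int persistent_capture() {
    int i = 1;

    // The capture j is a member of the closure; `mutable` lets calls modify it,
    // so its value carries over from one call to the next.
    auto f = [j = i](int x) mutable { return x + ++j; };

    int first = f(2);   // j becomes 2
    int second = f(2);  // j becomes 3
    assert(first == 4 && second == 5);
    assert(i == 1);     // the original variable is untouched
    return second;
}
```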
