Proposal: return data from BPF programs #19318
docs/src/proposals/return-data.md
Outdated
## Proposed Solution

The callee can set the return data using a new system call `sol_returndata(u8 *buf, u64 length)`.
🎉
@armaniferrante / @jstarry - would be great if you have some time to review this proposal, it should be quick!
Co-authored-by: Michael Vines <mvines@gmail.com>
docs/src/proposals/return-data.md
Outdated
uint64_t *return_data_length,
);
```
On entry, `return_data_length` should point to the size of the buffer at `return_data`. If the callee
Does the size of the return data impact compute unit consumption?
Rather than overloading invoke, why not break this out into a separate syscall that retrieves the return data?
Overloading invoke saves the program making two syscalls (invoke & get_return_data) rather than just invoke.
Maybe the separate syscall makes more sense. Happy to move on that and amend the proposal.
Yeah, that's the trade-off. I'm leaning toward separating the two so that the two APIs don't get entwined. Especially if we change invoke, we might need to create two new invokes. But the advantage of a single syscall isn't negligible, so I could be swayed either way.
And yes, we have a common way of charging for data passed in and out of the program; these bytes should incur a similar cost.
I've updated the proposal.
Note that the return data buffer is now shared between the invoke/callee return data and the current program's return data. This means a) a single buffer is required, rather than two, and b) callee return data is passed up the stack, which is very useful for proxy programs.
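To illustrate the single shared buffer described above, here is a minimal sketch in plain Rust. It is a toy model, not the runtime implementation: `ReturnData`, `callee`, `proxy` and `caller` are hypothetical names, and the program calls are ordinary function calls standing in for `sol_invoke()`.

```rust
/// Hypothetical model of the single return-data buffer shared by the call stack.
#[derive(Default)]
struct ReturnData {
    data: Vec<u8>,
}

impl ReturnData {
    /// Stand-in for the "set return data" syscall: overwrites the buffer.
    fn set(&mut self, bytes: &[u8]) {
        self.data = bytes.to_vec();
    }

    /// Stand-in for the "get return data" syscall: reads the buffer.
    fn get(&self) -> &[u8] {
        &self.data
    }
}

/// Innermost program: writes a little-endian u64 as its return data.
fn callee(rd: &mut ReturnData) {
    rd.set(&42u64.to_le_bytes());
}

/// Proxy program: calls the callee and returns without touching the buffer,
/// so the callee's data stays in place for whoever called the proxy.
fn proxy(rd: &mut ReturnData) {
    callee(rd);
}

/// Outermost program: calls the proxy and decodes whatever is in the buffer.
fn caller(rd: &mut ReturnData) -> Option<u64> {
    proxy(rd);
    let bytes = rd.get();
    if bytes.len() != 8 {
        return None;
    }
    let mut raw = [0u8; 8];
    raw.copy_from_slice(bytes);
    Some(u64::from_le_bytes(raw))
}
```

In this model the proxy needs no buffer of its own and no extra copy, which is the benefit claimed for proxy programs.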
Co-authored-by: Justin Starry <justin.m.starry@gmail.com>
Signed-off-by: Sean Young <sean@mess.org>
I didn't know about this promotion or being completely?
This kind of idea means if someone to missed
The proposal didn't succeed it's means there is a lot of pending of what did done for.
Latest iteration looks great to me, nice and simple
Requires: - solana-labs/solana#19548 - solana-labs/solana#19318 Signed-off-by: Sean Young <sean@mess.org>
When an instruction calls `sol_invoke()`, the return data of the callee is copied into the return data
of the current instruction. This means that any return data is automatically passed up the call stack,
to the caller of the current instruction (or the RPC call).
I'm coming in a bit late here, but here's an idea, take it or leave it. Would it make sense to also include the program id that set the return data?
As you say here, if you have program A which calls program B which calls program C, i.e. A -> B -> C, program C can set a return value, B can use that return value but not set a return value itself, and then A can also read C's return value. As a safety check, A may want to know who set the return value, B or C.
If program A relies on C's return value, for example to calculate how many tokens to transfer, then a malicious program B could update itself to change that value, and steal funds. It's a bit far-fetched, but at least we'd give devs the tools to program defensively if they can check which program set the value.
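The defensive check suggested here could look roughly like the sketch below. It is a standalone model in plain Rust, not Solana SDK code: `ProgramId` is a hypothetical one-byte stand-in for a real program id, and `ReturnData` and `read_trusted` are illustrative names.

```rust
/// Hypothetical stand-in for a program id (a real one would be a 32-byte key).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ProgramId(u8);

/// Return data tagged with the program that set it.
struct ReturnData {
    setter: ProgramId,
    data: Vec<u8>,
}

/// Program A's check: accept the return data only if `expected` set it.
fn read_trusted<'a>(
    rd: Option<&'a ReturnData>,
    expected: ProgramId,
) -> Result<&'a [u8], String> {
    match rd {
        None => Err("no return data was set".to_string()),
        Some(r) if r.setter != expected => Err(format!(
            "return data set by {:?}, expected {:?}",
            r.setter, expected
        )),
        Some(r) => Ok(&r.data),
    }
}
```

With this shape, A rejects a value written by an intermediate program B when it expected the value to come from C.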
I think that's a really good idea
@seanyoung - wdyt about providing the program id that set the return data in `sol_get_return_data()`? This'll allow the caller to add a defensive check like what Jon mentioned if they want.
Something like `sol_get_return_data(u8 mut *buf, u64 length, Pubkey mut *from) -> u64`.
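A rough model of the suggested signature, written as plain Rust rather than a real syscall binding. The thread does not say what the returned `u64` means; this sketch assumes it is the full length of the stored return data, so the caller can detect truncation. `Pubkey` is a local type alias here, and `ReturnData` and `get_return_data` are hypothetical names.

```rust
/// Local stand-in for a 32-byte program id.
type Pubkey = [u8; 32];

/// Return data as the runtime might hold it: the bytes plus who set them.
struct ReturnData {
    program_id: Pubkey,
    data: Vec<u8>,
}

/// Copies up to `buf.len()` bytes of the stored data into `buf`, writes the
/// setter's program id into `from` when there is data, and returns the full
/// stored length (an assumption about the meaning of the return value).
fn get_return_data(stored: &ReturnData, buf: &mut [u8], from: &mut Pubkey) -> u64 {
    let n = stored.data.len().min(buf.len());
    buf[..n].copy_from_slice(&stored.data[..n]);
    if !stored.data.is_empty() {
        *from = stored.program_id;
    }
    stored.data.len() as u64
}
```

Under that assumption, a result larger than the buffer passed in tells the caller that the copy was truncated and that it can retry with a bigger buffer.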
I totally agree this is a good idea. I've changed the syscall as suggested. Thanks @joncinque !
- Compute costs are done for the syscalls, not for passing
- return data is cleared by sol_invoke
I hope it works for us
docs/src/proposals/return-data.md
Outdated
of the current instruction. This means that any return data is automatically passed up the call stack,
to the caller of the current instruction (or the RPC call).

Note that `sol_invoke()` clears the return data, so that any return data from
Can you elaborate on this? With A -> B -> C, does this mean that if B sets return data, it would be cleared when `invoke`ing C? So if B sets return data, calls C, and C does not set return data, then A will not receive B's return value?
This behavior should be well documented next to the `sol_set_return_data` system call so folks have a clear understanding of what to expect.
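The A -> B -> C case asked about here can be modeled with a small sketch, assuming only what the quoted proposal text states: `sol_invoke()` clears the return data before the callee runs. `Runtime`, `program_b`, `program_b_fixed` and `program_c` are hypothetical names, and this is a toy model rather than the actual runtime.

```rust
/// Toy runtime holding the current return data as (setter name, bytes).
#[derive(Default)]
struct Runtime {
    return_data: Option<(&'static str, Vec<u8>)>,
}

impl Runtime {
    /// Stand-in for the set-return-data syscall.
    fn set_return_data(&mut self, program: &'static str, data: &[u8]) {
        self.return_data = Some((program, data.to_vec()));
    }

    /// Stand-in for invoke: clears the return data, then runs the callee.
    fn invoke(&mut self, callee: fn(&mut Runtime)) {
        self.return_data = None;
        callee(self);
    }
}

/// Program C sets no return data.
fn program_c(_rt: &mut Runtime) {}

/// Program B sets its return data BEFORE invoking C, so the data is lost.
fn program_b(rt: &mut Runtime) {
    rt.set_return_data("B", b"from B");
    rt.invoke(program_c);
}

/// Program B sets its return data AFTER invoking C, so the data survives.
fn program_b_fixed(rt: &mut Runtime) {
    rt.invoke(program_c);
    rt.set_return_data("B", b"from B");
}
```

In this model, A sees nothing after invoking `program_b` and sees B's data after invoking `program_b_fixed`, which suggests setting return data as the last step before returning.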
I've expanded on this.
16f5a9e to 65564fb
Merged via #19548
Unfortunately, I only just saw that this was merged, and I have some concerns:
Instead, as it needs to be implemented in the new ABI (see #19191) anyway, I think that reserving a shared memory area there would be a better approach.
@Lichtso I should have added you as a reviewer, sorry about that. I think your criticisms are all valid. Your proposed ABI requires re-building BPF programs anyway, so how about we remove the syscall in ABI v2?
See docs/src/proposals/return-data.md in PR for details.