-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
eliminate stack overflow #1639
Comments
With the new coroutine semantics, what reason is there to prefix calls with |
It would not be strictly necessary. However it makes it obvious that the function call does not return the return type of the function but instead returns the coroutine frame. It would also be possible for async functions to call other async functions without any special syntax at all; the question becomes, is it better to force the suspend point to be explicitly pointed out, or not? |
It should be a new attribute: |
This would now (since #3856) be a new option to |
What would we use as the stack size for vdso functions? If the kernel doesn't already, we could propose that they include a valid |
would it be required to heap allocate stack frames of functions that are tail call optimized/stack size is bounded at comptime? |
Re. post-comptime: I think this is an idea worth exploring on its own. In trying to work out how to better optimise struct layout, I've stumbled upon another use case; see #7888. |
I thought you guys might want to know there is a huge discussion about automatic transformation of recursive algos to heap allocated rather than stack allocated, on Hacker News. I think Zig, being so progressive, and being more of a "safe" language than C could benefit by fixing the stack overflow flaw in C. https://news.ycombinator.com/item?id=28684098 Cheers @andrewrk . Thanks for pioneering Zig |
The syntax for recursive functions via coroutines seems both confusing and extremely annoying to write, making recursion incredibly tedious to use. Using async/await to write recursive functions seems very confusing (and would also hamper performance). The minimal idea I have to make it better would be to have a builtin alternative to async fn fib(allocator: *Allocator, x: i32) !i32 {
if (x <= 1) return x;
return try @alloc_call(allocator, fib, x - 1) + try @alloc_call(allocator, fib, x - 2);
} It would still be messy, have the overhead of heap allocation, and mark the function as async for some reason even though it practically isn't, but at least it would provide a nice way to write it. My alternative suggestion, however, is to provide some form of explicit stack recursion support. If it's okay for the function to say it's out of memory, then it's okay for the function to fail because it hit the stack limit and return I have two main ideas for the syntax for this, both with their tradeoffs. One is a rec fn fib(x: i32) !i32 {
if (x <= 1) return x;
return try rec fib(x - 1) + try rec fib(x - 2)
} From an implementation perspective, I think An alternative option is to move the burden of dynamically allocating a stack frame inside the recursive function. This might remove the need for a fn fib(x: i32) !i32 {
try @build_next_frame();
if (x <= 1) return x;
return try fib(x - 1) + try fib(x - 2);
} The fn example(x: i32) i32 {
// do stuff
if (x = 0) {
@build_next_frame() catch @panic("");
const y: [100000]i32 = undefined;
}
} The obvious disadvantage over the previous suggestion is that a fn fib(x: i32) !i32 {
try @with_dynamic_frame {
if (x <= 1) return x;
return try fib(x - 1) + try fib(x - 2);
};
} This variant would probably need to be a keyword (like The Either way, even without static analysis like that, I think either of these is more readable, easier to write, and more performant than coroutine-based recursion on the heap, while being exactly as safe. It would also I think be reasonable to supplement this with an fn fact(x: i32) i32 {
fn helper(x: i32, acc: i32) i32 {
if (x <= 1) return acc;
return @tailcall(fact, x - 1, acc * x);
}
return helper(x, 1);
} Would compile just fine, because it's a literal tail call and trivially optimizable to a loop while fn fact(x: i32) i32 {
if (x <= 1) return 1;
return @tailcall(fact, x - 1) * x;
} would not be supported in a most basic version because an accumulator variable would need to be added by the compiler. That said, it would be nice to try to support some stuff like the latter, it seems like a rather low priority. Either way, the crux would be that it lets you write a "recursive-like" function which uses a compile time known stack space. The final issue I see is that the current handling of extern functions would make it extremely difficult to work with C interop, while also not actually being safe anyway. One of the big draws of Zig is being able to automatically generate C bindings, but this would make it impossible to call a C function on any system where the stack size is limited to the architecture default or less, I believe? I'm not sure what happens if you run an executable that asks for a specific stack size on a system with ulimits set below what it asks for. Providing accurate stack size annotations to C functions would be extremely difficult. On the flip side, the default stack size is not actually guaranteed to be safe for C functions anyway, because those can have unchecked recursion whose depth depends on the arguments. It's unclear to me how best to handle this because you can't actually check a non-cooperative function's stack usage. At best, you can protect the page at the end of the stack and set a signal handler to make a stack overflow runtime checked undefined behavior instead of just "undefined behavior, good luck." Unlike with the recursive functions, I wasn't able to come up with anything that feels like a good solution, but it felt worth noting that extern functions, since they can be recursive, are still inherently unsafe even if you allocate the amount of stack size they're supposed to run within. |
Something similar to this^. Everything should be gucci if you are able to tell a function how many times it can loop around itself, and if it goes beyond that then throw a runtime error that we can check against. The compiler should guarantee that there is enough stack allocated for any defered operations so that the developer defined recursing index can be reached without a stack overflow. This is to mimic what we could be doing manually, i.e allocate arrays of maxNestSize for each required variable in the function, add an extra array of maxNestSize to store identifiers of different branches of entry with different deferred operations, confirm that the arrays don't overflow, and create a reversed loop to process the defferred operations, but in some cases it feels clunky, so would be nice if the compiler were finally able to automize the fix for the out of date design of deferred operations. |
I was having exactly the same idea. Furthermore, maybe this would allow TBH I'm rather sceptical about the feasibility of static stack analysis. There are definitely cases where it could work, but I doubt it can cover a sufficiently wide spectrum of situations in a reasonable/usable way, so it'd be nice to be able to not count on it. |
segmented stacks/stack splitting seems like a simpler approach to the recursion issue vs converting to a coroutine and then doing heap allocations. |
This is a proposal to solve #157.
First we must introduce a new kind of value, one that is neither
comptime
nor runtime.It is a value that is known at compile-time, but only after all semantic analysis is complete,
and so it cannot participate in
comptime
code. For the purposes of this proposal,this kind of value is called a "post-comptime" value. An actual comptime value can be used
where a post-comptime value is expected. Simple arithmetic such as
+
,-
, and*
can beused on post-comptime values, to produce another post-comptime value. There also will be
some way to do a
max()
operation on post-comptime values.External Functions
extern
function declarations have a default stack size requirement,which is dependent on the target. For example, on x86_64 Linux,
the default is
8388608 bytes
(8 MiB). This matches the default stacksize on Linux for executables as well as default pthread stack sizes.
Zig treats every
extern
function as if it needs at least this muchstack space.
When an external function is known to require a larger or smaller amount
of stack space, it can be annotated:
Here,
foo
is known to require only 912 bytes of stack space. In this example,we provided a comptime value, but the expression can be a post-comptime value.
Function Pointers
One does not simply call a function pointer.
One must use
@ptrCall(worst_case_stack_size, function, ...)
, whereworst_case_stack_size
is a post-comptime value.This inserts a runtime safety check - once compilation completes,
all function stack size requirements are known, and inserted into
the binary. A function pointer call checks if the provided
worst_case_stack_size
value is correct.This ensures that during semantic analysis, the compiler can rely on
this value when calculating worst case stack size.
Most APIs will want to accept
comptime
functions rather than functionpointers, as this avoids
@ptrCall
. For example,std.mem.Allocator
,std.io.InStream
,std.io.OutStream
, andstd.rand.Random
.Recursion
Recursion, whether direct or indirect will be solved with the coroutines rewrite.
See #1260.
Call graph analysis detects loops:
This will be fixed with coroutines:
Calls to this function grow heap memory, not stack memory. The function can
return
error.OutOfMemory
. This is a leaf node in the call graph analysis;it does not store any information on the stack.
Annotating Additional Stack Usage (Inline Assembly)
In some cases, it may be necessary to annotate that more stack is used, even
when it does not come from a function call. One such example is inline
assembly that directly accesses the stack.
For these cases there will be a function
@declareStackUsage(byte_count)
, wherebyte_count
is a post-comptime value.Because Zig is not able to trace how this builtin call moves with LLVM's function
inlining optimization passes, the value is globally added to the root node of
the call graph.
Threads
Currently in
std.os.spawnThread
, Zig allocates a hard coded 8 MiB per thread.With this proposal, it will use a new builtin function
@stackSize(function)
.This builtin returns a post-comptime number of bytes of the worst case stack
size required to call
function
.Main Function
Zig knows the start symbol for the target. This is how it avoids including
std/special/bootstrap.zig
when a program manually exports the start symbol.During semantic analysis, Zig uses
@stackSize(start_function)
internallyto determine stack size requirements, and passes this to the linker.
Exported Functions
Zig internally runs
@stackSize
on all exported functions, in order torun the call graph analysis and emit compile errors for recursion, etc.
The computed stack size requirements are included in generated .h files,
in comments, since C has no way to annotate stack size requirements for
functions.
The text was updated successfully, but these errors were encountered: