
Handle interpreter stack overflow, and test runaway recursion. #103

Merged
sunfishcode merged 2 commits into master from runaway-recursion
Oct 8, 2015

Conversation

sunfishcode
Member

No description provided.

@sunfishcode
Member Author

It appears that while this gets a Stack_overflow exception on Linux, it segfaults on the Travis Mac build.

@sunfishcode
Member Author

I had added the inner try blocks to provide a more precise source location, but they aren't necessary. I removed them now.

This test deliberately checks that implementations do not do opportunistic TCO, because that optimization is semantically observable (the program traps reliably vs. executes reliably in constant space), so it could cause portability problems if some implementations do it and others don't, or do it in different places. I've now added a comment about this.

I don't yet know what to do when OCaml segfaults instead of raising a Stack_overflow exception though. Is this an OCaml bug?

@rossberg
Member

rossberg commented Oct 5, 2015

Hm, some common optimisations amount to (limited forms of) TCO, e.g., inlining. I'm not sure what exactly the spec could say to forbid only the general thing. Stack size is going to be implementation-dependent anyway, so I'm not even sure how valuable such an attempt would be.

Travis builds the native-code version, which doesn't produce Stack_overflow exceptions: http://caml.inria.fr/pub/docs/manual-ocaml/native.html#s%3Acompat-native-bytecode

@lukewagner
Member

Agreed that it would be hard for the spec to forbid TCO in the MVP. However, if we add stack-walking (something we've already discussed) and specify it deterministically, that may effectively rule out implicit TCO (so that if we want TCO, we'd specify an explicit tail-call op that would have a well-defined effect on the observable stack).

@rossberg
Member

rossberg commented Oct 5, 2015

Makes sense.

@sunfishcode: btw, I assume that the Travis failure is gone once you rebase on my Makefile change.

@sunfishcode
Member Author

All the spec needs to say is that implementations have some finite maximum limit and that every call monotonically

@sunfishcode sunfishcode closed this Oct 5, 2015
@sunfishcode sunfishcode reopened this Oct 5, 2015
@sunfishcode
Member Author

Oops. All the spec needs to say is that implementations have some finite maximum limit and that every call monotonically reduces the distance to that limit. The testsuite tests for this with (func (call 0)), expecting a trap. While an implementation that does opportunistic TCO won't terminate and thus won't explicitly "fail" the test, it won't ever pass it either.

@lukewagner
Member

@sunfishcode Well, the point @rossberg-chromium was making is that not every call will monotonically reduce the distance. Of course we don't care about every call, just the ones that form call cycles, but specifying that is harder. I agree it's good to simply have the test for now.

@sunfishcode
Member Author

@lukewagner An implementation can't do infinite inlining though, so it can still be thought of as having some finite theoretical limit to the call stack depth.

@lukewagner
Member

Good point. I guess we could model that in ml-proto by having an int counter inc'd by calls and then a host-supplied (in init) limit above which a trap is raised.
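
A minimal sketch of that counter idea in OCaml (not ml-proto's actual code; Trap, call_limit, call_depth, and with_call_frame are made-up names, and the fixed limit here stands in for a host-supplied value given at init):

```ocaml
exception Trap of string

(* Hypothetical host-supplied limit (would be set at init time) and the
   current call depth. *)
let call_limit = ref 10_000
let call_depth = ref 0

(* Evaluate one call while charging a unit of the depth budget, trapping
   once the limit would be exceeded and restoring the count on the way out. *)
let with_call_frame eval =
  if !call_depth >= !call_limit then raise (Trap "call stack exhausted");
  incr call_depth;
  let result = try eval () with e -> decr call_depth; raise e in
  decr call_depth;
  result
```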

@rossberg
Member

rossberg commented Oct 5, 2015 via email

@sunfishcode
Member Author

It does account for inlining. Putting cycles aside (which is the point), there's only a finite amount of inlining an implementation can do in finite-sized code before there are no callsites left.

The practical case here is that we don't want opportunistic TCO to create a situation where apps run in some implementations and not others. Given how much some people want TCO, such a situation may even be fairly likely. I'm not aiming to just withhold TCO from people here -- I do expect that we'll put it in the spec in the future -- but we should do it together so that we can think through how it interacts with stack walking and other features.

@sunfishcode
Member Author

Also, there'd be no practical upper bound on computational resources. The limit itself would be purely theoretical in real-world implementations.

There is a limitation here, and it's that one can't write programs that depend on TCO until we add guaranteed TCO to the spec. I expect we'll do that in the future, but it's not part of the plan for the MVP.

@lukewagner
Member

I was having some of the same concerns @rossberg-chromium stated: that specifying a single precise (though host-defined) limit would overly constrain impls. In particular, the concern is that user code could in theory first do a probe to find the host limit and then, knowing that host limit, depend on its precise value for all future executions (which of course would break given the way we want to impl overflow checking). The way I rationalized away this concern is that, as long as we:

  1. still allow for stack overflow when the host limit is not yet hit (as is already the case in Nondeterminism.md)
  2. don't allow these arbitrary overflows to be discriminated from host limit overflows

then an impl would always be free to fault whenever it wanted. The formal limit would exist only to state ∃ some finite limit.

@titzer
Contributor

titzer commented Oct 5, 2015

This is a resource exhaustion question akin to memory exhaustion (i.e. heap overflow). If we were to think about how one would spec heap overflow, which isn't easy, then I think we are going to run into endless problems like "how big is an object" and "when does an object become unreachable". The analogous questions would be "how big is a frame" and "when is a frame unreachable". The problem is that frame sizes aren't fixed (e.g. dynamic optimization) and when they become unreachable is also not precise.

I don't think we can be precise about when stack overflow occurs, but only if stack overflow occurs. Even for that we need to ban implicit TCO (which makes some recursive programs no longer trigger overflow), requiring us to introduce an explicit tail call form that doesn't consume stack space.

For stack introspection, we'll need to ban implicit TCO, because introspection would make TCO observable.

I can see a couple of uses of stack introspection, such as creation of source-level stack traces and user-controlled on-stack-replacement in a JIT scenario. Those are important to not rule out from the start.

So maybe it's enough to ban implicit TCO and state that programs that infinitely recurse will eventually exhaust stack space? Our tests for stack overflow can be simple recurse-forever programs and avoid having specific limits for which to test.

Making the checking deterministic--especially across VMs--will likely be a nightmarish implementation burden.


@sunfishcode
Member Author

@titzer I agree, and that's exactly what my PR here does :-).

It currently just reports stack overflow whenever the underlying OCaml process hits stack overflow, which leaves room for improvement, but it's a good first step.

@rossberg-chromium I rebased on your Makefile change and the Travis failure is now fixed. Thanks!

Also, I've now created WebAssembly/design#387 which adds a paragraph about this issue to the design.

@titzer
Contributor

titzer commented Oct 5, 2015

Yeah, I know, so ship it :-)


sunfishcode added a commit to WebAssembly/design that referenced this pull request Oct 5, 2015
@ghost

ghost commented Oct 5, 2015

I disagree with the spec stating that 'Implementations are not permitted to do implicit tail-call optimizations'. It should just be implementation-dependent, and that is enough of a protection against code being written to depend on TCO.

I also disagree with the spec stating that 'every call must take up some resources toward exhausting that size'.

Some compilers might naturally use TCO and might not consume stack. Some archs pass the return address in a register and might have adequate registers for a particular function, so that some functions do not need to use the stack at all; they should not be hobbled.

Defining this area creates the same problem it was proposing to solve: programs might now be written to depend on stack exhaustion! It's not a big flaw in the spec, since implementations can just ignore it and code will need to be written not to depend on stack exhaustion, but why not just note this in the spec?

@titzer
Contributor

titzer commented Oct 5, 2015

I agree with what JF suggested. We don't want to ban a particular optimization, just the transformation of a program that uses infinite resources into one that uses finite resources. That's an observability issue. When we get into stack observations, we'll have to limit TCO in implementations to comply with stack observation requirements. Then it will be useful to have a specific tail call opcode that will require the optimization and also not be stack-observable.


@ghost

ghost commented Oct 5, 2015

@titzer Stack walking is not defined yet, and may well be implementation-dependent too wrt call optimizations. Perhaps it would be better if stack exhaustion were not even observable to the wasm code, so that the code terminates in a manner no different from the user terminating the program; then there is no 'observability issue'.

@titzer
Contributor

titzer commented Oct 5, 2015

When it is defined, it's important that we spec it so that call optimizations (e.g. inlining) are not observable.


@ghost

ghost commented Oct 6, 2015

@titzer I don't think it will be possible to spec stack walking so that call optimizations are not observable anyway, at least not without a performance and accounting burden. Has this even been done?

@titzer
Contributor

titzer commented Oct 6, 2015


JVMs, CLR VMs, and JSVMs all do this.



@ghost

ghost commented Oct 6, 2015

@titzer That is interesting, but I would like to understand how, to understand any tradeoffs. Could you point me to a JSVM that does this? That is, a JSVM that implements functions with no stack usage, keeping the return address in a register, while still allowing recognition of the function's frame when stack walking? Or a JSVM that implements TCO while still allowing the frame to be recognized?

@ghost

ghost commented Oct 6, 2015

@titzer Sorry, I think I misunderstood the discussion. There seems to be agreement that 'call optimizations (e.g. inlining) are not observable' in stack walking, which is what I was trying to communicate support for (that they might not be observable), so this seems fine. But then how is that consistent with the patch being discussed here?

@jfbastien
Member

The current quibble is just about tail call optimization being observable when you walk the stack. With stack inspection you'll see a single instance of the function that was tail called, but not its recursion depth.

You could spec that such cases show you the function at least once (and maybe more if not optimized), but then there end up being interesting corner cases with more than one recursive function that require extra bookkeeping, or that just make the optimization hard to do. It's not impossible, but then stack inspection becomes not-so-trivial.

You could also spec that stack walks are mostly precise, but we don't have full insight into what developers want them for, so the loss of precision may be undesirable (or not!).

@rossberg
Member

rossberg commented Oct 6, 2015

> I don't think we can be precise about when stack overflow occurs, but only if stack overflow occurs.

Sure. I think the main discussion here is if and how that could be done at the formal spec level without actually overspecifying.

> Even for that we need to ban implicit TCO

Same problem: how do you specify that without overspecifying? Sure it must still be fine to eliminate some calls, a.k.a. inlining.

I remain skeptical, despite Luke's argument. IMO, it makes more sense at this stage to have that as an informal requirement at best, and avoid trying to make it precise in the formal spec.

@lukewagner
Member

@rossberg-chromium Agreed that we don't need to put a requirement in the spec for now; just tests like what's in this PR should hold down the fort until eventually we get to stack inspection and then we'll have to answer the hard questions.

@sunfishcode
Member Author

I updated the patch to only handle Stack_overflow on the Eval.invoke path; if there's a stack overflow on the Eval.host_eval path, that's something different. I also reworded the comment in the testcase to reflect the wording changes made in WebAssembly/design#387.
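
As a rough sketch of that shape of handling in OCaml (not the PR's actual code; Trap and guard_stack_overflow are placeholder names), the interpreter's invoke path can catch the OCaml runtime's Stack_overflow and report it as a trap:

```ocaml
exception Trap of string

(* Wrap a recursive evaluation function so that runaway recursion inside the
   interpreter surfaces as a wasm trap instead of killing the process.  Note
   that OCaml's native-code compiler doesn't guarantee Stack_overflow is
   raised on every platform, which is why the native Mac build segfaulted. *)
let guard_stack_overflow eval x =
  try eval x with Stack_overflow -> raise (Trap "call stack exhausted")
```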

@sunfishcode
Member Author

This needn't be the last word on this topic, but it's useful to have this test in place as guidance for implementation developers.

sunfishcode added a commit that referenced this pull request Oct 8, 2015
Handle interpreter stack overflow, and test runaway recursion.
@sunfishcode sunfishcode merged commit a39ea85 into master Oct 8, 2015
@sunfishcode sunfishcode deleted the runaway-recursion branch October 8, 2015 15:34
eqrion pushed a commit to eqrion/wasm-spec that referenced this pull request Jul 18, 2019
dhil pushed a commit to dhil/webassembly-spec that referenced this pull request Mar 2, 2023
In WebAssembly#90 it was decided to move the event section between memory section and global section. This change is reflected in the next paragraph, but not in the introduction.
dhil pushed a commit to dhil/webassembly-spec that referenced this pull request Sep 20, 2023
Update the encodings for ref.as_non_null, br_on_null, (ref ht), and (ref null ht) for consistency with the final encodings chosen in WebAssembly/gc#372.

Fixes WebAssembly#103.