Assert everything #14

Merged
merged 3 commits into from
Aug 25, 2015

lukewagner
Member

This patch switches all the .wasm tests to asserteq instead of simply invoking.

@lukewagner
Member Author

With this, everything is asserted which makes it nice to run all the tests (src/wasm test/*.ml) and have silence mean "success". For this reason, I added a second commit to flip the default of Flags.print_sig to false. There is still 1 value printed by one remaining Invoke which actually fails if I try to asserteq it :) That's for a separate PR, though.

@rossberg
Member

Hm, this seems to break kg's recent test runner change, which checks the output of tests.

We probably need to agree on one mechanism for verifying tests. Should we use internal asserts or should we use external output validation? I'm fine either way, but two competing (and incompatible) approaches are a problem.

@kg, WDYT?

@kg
Contributor

kg commented Aug 19, 2015

I'm fine with internal assertions for general use but they don't cover all scenarios. I discussed this with luke on IRC. For things like testing trap behavior on failure cases, validation, etc we will need to have an external test harness that checks the result of running the interpreter.

We could write all this testing infrastructure in ocaml and wrap it around the prototype, but that introduces some nasty ecosystem issues I think we should avoid. (I'm not particularly attached to python unittest, but we definitely want to use an existing production-quality testing framework.)

Luke pointed out that he wanted the python test runner to be quiet, also, which I agree with. I can make some minor changes to do that today.
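A minimal sketch of the external-harness idea described above, assuming a runner that compares captured stdout against an expected string. The `run_and_capture` helper, the `OutputExpectationTest` class, and the `42 : i32` output line are hypothetical; a child Python process stands in for the interpreter binary.

```python
import subprocess
import sys
import unittest


def run_and_capture(argv):
    """Run a command and return (exit_code, stdout_text)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.returncode, proc.stdout


class OutputExpectationTest(unittest.TestCase):
    # Stand-in for the interpreter: a child Python process printing one line.
    # A real harness would run the interpreter on a test file instead and
    # read the expected text from a file kept next to the test.
    COMMAND = [sys.executable, "-c", "print('42 : i32')"]
    EXPECTED = "42 : i32\n"

    def test_output_matches_expectation(self):
        code, out = run_and_capture(self.COMMAND)
        self.assertEqual(code, 0)
        self.assertEqual(out, self.EXPECTED)
```

Because the check lives outside the process under test, the same structure can also observe crashes and non-zero exits, which is the case kg raises for traps and validation failures.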

@rossberg
Member

On 19 August 2015 at 16:26, Katelyn Gadd notifications@github.com wrote:

> I'm fine with internal assertions for general use but they don't cover all
> scenarios. I discussed this with luke on IRC. For things like testing trap
> behavior on failure cases, validation, etc we will need to have an external
> test harness that checks the result of running the interpreter.

Makes sense. The change probably still needs to adjust the output
expectations, though.

> We could write all this testing infrastructure in ocaml and wrap it around
> the prototype, but that introduces some nasty ecosystem issues I think we
> should avoid. (I'm not particularly attached to python unittest, but we
> definitely want to use an existing production-quality testing framework.)

Yes, I don't think it's necessary to do that in OCaml. I actually prefer to
have text as a solid abstraction boundary. :)

> Luke pointed out that he wanted the python test runner to be quiet, also,
> which I agree with. I can make some minor changes to do that today.

Do you think it's also possible to modify the runner such that it doesn't
invoke ocamlbuild by default? For the make users among us (ocamlbuild
rejects build artefacts in src, so I always have to make distclean first).

@kg
Contributor

kg commented Aug 19, 2015

Absolutely. If you want I can take that out. I wanted to avoid the scenario where somebody makes a bunch of changes, runs tests, sees them pass (because they forgot to build) and checks in. But we can solve that with CI later if it happens a lot.

@lukewagner
Member Author

@kg All of those conditions (traps, validation failures, etc) can be tested within .wasm files by adding new testing primitives to the scripting language. The only use case I see for external tool validation is testing the scripting primitives themselves (making sure they fail properly). For reference, this is what we've done in the SM test suite by adding shell-only builtins; the test harness just uses the process result code to determine pass/fail.
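As a rough illustration of a harness that looks only at the process result code, here is a sketch in which a test passes exactly when its process exits with status 0. The names `passes` and `run_all` are hypothetical, and child Python processes stand in for the shell or interpreter being tested.

```python
import subprocess
import sys


def passes(argv):
    """A test passes if and only if its process exits with status 0."""
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def run_all(commands):
    """Return the commands that failed; an empty list means success."""
    return [argv for argv in commands if not passes(argv)]


# Stand-ins for test scripts: the in-script assertion decides the exit code.
OK = [sys.executable, "-c", "assert 1 + 1 == 2"]
BAD = [sys.executable, "-c", "assert 1 + 1 == 3"]
```

With this shape, `run_all([OK, BAD])` reports only `BAD`, and a fully passing run produces no output at all, matching the "silence means success" goal stated earlier in the thread.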

@lukewagner
Member Author

I meant to give motivation for internal assertions:

  • single file
  • if you update a test, you don't have to update the output file
  • more flexible if we want to extend the command language (e.g., in SM we'll often exhaustively test a cartesian product of possibilities by running a test in a loop, instead of manually duplicating all the cases)
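
The last point, testing a Cartesian product in a loop rather than duplicating cases by hand, might look like the sketch below. The function under test (`i32_add`) and the operand list are hypothetical choices made for illustration only.

```python
import itertools

MASK32 = 0xFFFFFFFF


def i32_add(a, b):
    """Hypothetical function under test: 32-bit wrapping addition."""
    return (a + b) & MASK32


# Boundary values worth combining with each other.
INTERESTING = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]


def check_all_pairs():
    """Assert two simple properties for every ordered pair of operands."""
    checked = 0
    for a, b in itertools.product(INTERESTING, repeat=2):
        assert i32_add(a, b) == i32_add(b, a)  # commutativity
        assert i32_add(a, 0) == a              # zero is the identity
        checked += 1
    return checked
```

Five operands give 25 ordered pairs from one loop, where an output-file approach would need 25 hand-written expectations.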

@lukewagner
Member Author

As mentioned in #17, I'd like to write some negative-validation tests and I think these tests will be more robust and easier to write by adding a new inline assertion rather than matching the error-string output (which will change over time and require a new file for each individual case I want to test). Can we agree to go the inline-test route?

@rossberg
Member

I'm fine either way. @kg?

But can you please adjust the existing test expectations to this change?

@kg
Contributor

kg commented Aug 21, 2015

I'm OK with basic assertions in the wasm but I stand firmly by my opinion that we should not put complex test logic in the wasm interpreter, at least not yet. Tests inevitably need to do things like substring matches on error messages, handle traps, handle out-of-memory conditions, etc.

Once wasm itself is expressive and robust enough to implement all these tests correctly (in a way that doesn't produce silent failure, undebuggable hangs or undebuggable crashes) it will make sense to move everything out of runtests.py into some wasm code and self-host. But not today.

@titzer
Contributor

titzer commented Aug 21, 2015

I think declarative inputs to tests are best. E.g. main(0, 1) = 3,
main(1, 0) = !trap. Otherwise you will end up with feature madness in the test
runner and end up with a complex DSL just to run anything; it's a nightmare
trying to debug the DSL that generates inputs to a test. We should favor
tests that have at most a few dozen input cases.

One example:
https://code.google.com/p/virgil/source/browse/test/execute/alloc_array00.v3

And also negative validation tests and trapping tests can easily be
expressed in a single line like:
https://code.google.com/p/virgil/source/browse/test/seman/top_def05.v3
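
A sketch of the declarative style suggested here, assuming each case is just a tuple of input values plus an expected value or a trap marker. The `Trap` exception, the `TRAP` sentinel, the body of `main`, and `run_cases` are all hypothetical; only the two cases mirror the `main(0, 1) = 3` and `main(1, 0) = !trap` examples.

```python
class Trap(Exception):
    """Raised by the function under test to model a trap."""


TRAP = object()  # sentinel meaning "this call is expected to trap"


def main(a, b):
    """Hypothetical function under test: traps on division by zero."""
    if b == 0:
        raise Trap("integer divide by zero")
    return a // b + 3


# The whole test is data: inputs on the left, expected outcome on the right.
CASES = [
    ((0, 1), 3),
    ((1, 0), TRAP),
]


def run_cases(func, cases):
    """Return a list of (args, expected, actual) for every mismatch."""
    failures = []
    for args, expected in cases:
        try:
            actual = func(*args)
        except Trap:
            actual = TRAP
        if actual != expected:
            failures.append((args, expected, actual))
    return failures
```

The runner owns the comparison, so a test file cannot weaken or skip its own assertion; it can only list inputs and outcomes.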


@lukewagner
Member Author

@kg I don't see what "complex test logic" you're referring to; what I'm talking about is just more basic Script.commands to assert things like negative validation and faulting. Also, none of this logic is in the wasm interpreter (check.ml, eval.ml), it's in the script harness (script.ml).

@titzer I don't think anyone is suggesting a DSL here. The current tiny set Script.commands extended with commands to assert non-validation and faulting should be basically equivalent to the virgil tests you linked to (just different syntax and you can have a sequence of them). Or maybe you're already agreeing?

@titzer
Contributor

titzer commented Aug 22, 2015

On Sat, Aug 22, 2015 at 2:08 AM, Luke Wagner notifications@github.com
wrote:

> @kg I don't see what "complex test logic" you're
> referring to; what I'm talking about is just more basic Script.commands
> to assert things like negative validation and faulting. Also, none of this
> logic is in the wasm interpreter (check.ml, eval.ml), it's in the
> script harness (script.ml).
>
> @titzer I don't think anyone is suggesting a
> DSL here. The current tiny set of Script.commands
> (https://github.com/WebAssembly/spec/blob/master/ml-proto/src/script.mli#L6)
> extended with commands to assert non-validation and faulting should be
> basically equivalent to the virgil tests you linked to (just different
> syntax and you can have a sequence of them). Or maybe you're already
> agreeing?

Having script commands (e.g. assert) is not declarative, and yes, scripts
are a DSL, embedded in s-expressions. Tests should not be able to control
their own assertions for exactly the same reason that benchmarks should not
time themselves. Tests should not have to generate their own inputs with
setup code; instead, they should specify inputs and expected results.



@rossberg
Member

@titzer, I'm not sure I get it. AFAIAC, the assert "command" here is declarative. And you seem to have a similar mini-assert-DSL in your Virgil tests, it is just hiding in comments. That looks pretty equivalent to me.

@titzer
Contributor

titzer commented Aug 23, 2015

In the grammar it allows an "expr_list" as an argument to the assert eq.
That allows arbitrary computation, which is not at all declarative.

I am advocating that only inputs and outputs can be specified--i.e. no
scripting. Inputs should always be values, and outputs are either values,
validation error(s), or a trap.


@lukewagner
Member Author

@titzer I expect exprs were just done for expedience. All the current uses just pass literals and it'd be just as easy to restrict it to a literal_list.
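
To illustrate what restricting arguments to literals could mean, here is a sketch of an argument parser that accepts numeric literals and rejects anything that would require computation. The `parse_literal_args` name and the comma-separated syntax are assumptions for illustration, not the prototype's grammar.

```python
import ast


def parse_literal_args(text):
    """Parse a comma-separated argument list, accepting number literals only."""
    if not text.strip():
        return ()
    try:
        # literal_eval never calls functions or looks up names.
        values = ast.literal_eval("(" + text + ",)")
    except (ValueError, SyntaxError) as err:
        raise ValueError("arguments must be literals: %r" % text) from err
    for value in values:
        if type(value) not in (int, float):
            raise ValueError("not a numeric literal: %r" % (value,))
    return values
```

For example, `parse_literal_args("0, 1")` yields `(0, 1)`, while an input such as `f(1), 2` raises `ValueError` because it contains a call.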

@lukewagner
Member Author

Anyhow, this PR doesn't take away any testing functionality, it just makes all the invokes assert their results; that shouldn't be too controversial. I'll make separate PRs to add faulting and negative validation commands.

@lukewagner lukewagner mentioned this pull request Aug 25, 2015
@rossberg
Member

LGTM. Btw, was there a specific reason you left the "cast" invocation alone?

rossberg added a commit that referenced this pull request Aug 25, 2015
@rossberg rossberg merged commit c2afa23 into master Aug 25, 2015
@lukewagner
Member Author

@rossberg-chromium Yes, because if I make it asserteq (and fix the definition of float in lexer.mll), the assertion fails even though the float_to_string values are the same. Int64.of_float confirms that the least-significant bits are different, so I was guessing this was just imprecision in float_to_string, but I meant to ask in a follow-up.
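
The effect described here, two floats that print identically yet differ in their least-significant bits, can be reproduced in isolation. The sketch below uses 12 significant digits as an arbitrary lossy format for illustration; it makes no claim about how the prototype's float_to_string actually formats values.

```python
import struct


def short_repr(x):
    """Format with 12 significant digits, which is lossy for doubles."""
    return "%.12g" % x


def bits(x):
    """Return the IEEE 754 double-precision bit pattern of x as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


a = 0.1 + 0.2
b = 0.3

same_text = short_repr(a) == short_repr(b)  # the printed forms agree
same_bits = bits(a) == bits(b)              # the underlying values do not
```

An equality assertion compares the values rather than the printed forms, so it can fail even when the printed output looks the same.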

@rossberg
Member

rossberg commented Aug 25, 2015 via email

@rossberg rossberg deleted the assert-everything branch August 26, 2015 13:15