Refactor the testing scripts #411

Closed
supercooldave opened this issue May 12, 2016 · 29 comments

@supercooldave

Different parts of the test directory use different scripts.
Matching warning/error messages on successful and unsuccessful compilation in test scripts is more complex than it needs to be.

The test script needs to be callable from all test directories, rather than copied into each one. Support for checking for errors needs to be simplified.

@kaeluka
Contributor

kaeluka commented May 19, 2016

I'd appreciate input on this. What's working for you, what isn't? Are there use cases that we're not handling well? And so on.

@supercooldave
Author

Need to be able to test for:

  • successful completion — matching the exit status and/or the program output and/or a partial match of the compiler output
  • unsuccessful compilation — a partial match against the compiler error.

One location for all the scripts, rather than one in each testing directory.

The current most “advanced” scripts are in the typesyn directory.
These scripts support most of what I’ve written above, but you need to write additional scripts to match compiler output rather than just writing the output you expect to match.

@EliasC
Contributor

EliasC commented May 19, 2016

  • I would like to be able to add a new test suite by creating a folder and filling it with .enc-files and .out-files (or similar). I could go as far as adding a line to some script somewhere, but ideally it shouldn't be more work than that.

  • I want to see the results of running all the tests without having to scroll back in the terminal. I want something like

    Total tests passed: 98/100
    Total tests failed: 2/100
      foo.enc [in suite basic]
      bar.enc [in suite typesyn]
    

    I'm fine with scrolling to find the reason a test failed, as printing all the reasons at the end would likely clutter the results.
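
(A minimal sketch of such an end-of-run summary, in Python rather than the project's actual scripting language; the `results` list and the runner producing it are hypothetical:)

    def report(results):
        # Print the summary after all suites have run; `results` holds
        # (test_name, suite_name, passed) tuples from a hypothetical runner.
        failed = [(name, suite) for name, suite, ok in results if not ok]
        total = len(results)
        print("Total tests passed: %d/%d" % (total - len(failed), total))
        print("Total tests failed: %d/%d" % (len(failed), total))
        for name, suite in failed:
            print("  %s [in suite %s]" % (name, suite))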

@supercooldave
Author

+1 for @EliasC's comments. The report should be printed at the end.

Also, it would be good to be able to exclude files or directories – in modules I import some files, but I don't want these files to be run as tests.

@albertnetymk
Contributor

I am possibly spoiled by rspec, but I would like to have something like the following:

multiple tests in one file

having the input source and expected output as embedded strings, with test cases isolated from each other, instead of building a namespace directly on top of the file system

positive result

input = """
class Main
  def main() : void
    print "hello world"
"""
c_stdout, c_status = encore_compile input

assert c_status == 0
assert c_stdout == ""

r_stdout, r_status = encore_run input

assert r_status == 0
assert r_stdout == "hello world"

negative result

input = """
class Main
  def main() : int
    print "hello world"
"""
c_stdout, c_status = encore_compile input

assert c_status == 1
assert c_stdout.contains "Type 'void' does not match expected type 'int'"

input = """
-- some deterministically failing tests in runtime
"""
c_stdout, c_status = encore_compile input

assert c_status == 0
assert c_stdout == ""

r_stdout, r_status = encore_run input

assert r_status == 1
assert r_stdout.contains "assertion blabla failed"
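
(The helpers in the pseudocode above could be grounded in plain shell calls. A hedged Python sketch follows; the compiler binary name "encorec" and the assumption that it drops an executable named "test" next to the source are both guesses, and the signatures differ slightly from the pseudocode so that each test compiles only once:)

    import os
    import subprocess
    import tempfile

    def encore_compile(source):
        # Hypothetical helper: write the embedded source to a temp dir
        # and invoke the compiler ("encorec" is an assumed binary name).
        workdir = tempfile.mkdtemp()
        with open(os.path.join(workdir, "test.enc"), "w") as f:
            f.write(source)
        proc = subprocess.run(["encorec", "test.enc"], cwd=workdir,
                              capture_output=True, text=True)
        return proc.stdout + proc.stderr, proc.returncode, workdir

    def encore_run(workdir):
        # Hypothetical helper: run the executable produced above; the
        # executable name "test" is an assumption about compiler output.
        proc = subprocess.run([os.path.join(workdir, "test")],
                              capture_output=True, text=True)
        return proc.stdout, proc.returncode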

@kaeluka
Contributor

kaeluka commented May 21, 2016

I don't see why the input should be a string. Then you can't import files, you lose editor support, and we'd have to implement a "file system" instead of using the file system we already have.

I agree that there should be a good way to demand that:

  • an assertion failed,
  • the error code equals 0/1,
  • the output contains "foo",
  • the compilation error contains "foo",
  • compilation was successful.

Ideally, the test only runs once, even though we have more than one assertion.

Summarising the input so far:

  • The list above
  • There should be a nice summary at the end.
  • Making a new test suite should not require editing any configuration files -- just make a folder with encore files and a file describing the expected output
  • one location for all the scripts

@kaeluka
Contributor

kaeluka commented May 21, 2016

@supercooldave:

Also, it would be good to be able to exclude files or directories – in modules I import some files, but I don't want these files to be run as tests.

what do you mean by that?

@supercooldave
Author

@kaeluka I mean that when I test the modules (in the modules directory), some .enc files are tests, while others are only there to be imported and don't correspond to a test.

@kaeluka
Contributor

kaeluka commented May 21, 2016

@supercooldave: OK. The test scripts will distinguish them: the former have a corresponding .out file, the latter don't. Does that sound about right?

@supercooldave
Author

Unit tests, rspec, QuickCheck, etc. will all be great in the future, but the goal of this fix is to improve what we have in a simple way, eliminating some minor deficiencies.

@supercooldave
Author

@kaeluka Almost: a .out, .chk or .fail file indicates that a test exists.
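
(A hedged Python sketch of that discovery rule; the meaning attached to each extension follows the discussion in this thread rather than any confirmed spec:)

    import glob
    import os

    def discover_tests(suite_dir):
        # An .enc file counts as a test only if a sibling .out, .chk or
        # .fail file exists; files that are only imported have no sibling.
        tests = []
        for enc in sorted(glob.glob(os.path.join(suite_dir, "*.enc"))):
            base = os.path.splitext(enc)[0]
            for ext, mode in ((".out", "exact-output"),
                              (".chk", "partial-output"),
                              (".fail", "expect-compile-error")):
                if os.path.exists(base + ext):
                    tests.append((enc, base + ext, mode))
                    break
        return tests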

@kaeluka
Contributor

kaeluka commented May 21, 2016

@supercooldave: agreed! Thanks for pointing that out!
@albertnetymk: I still think that there should be an encore test framework :) I also think it would be very nice to be able to compile encore classes at runtime. This could probably be done somehow using libllvm. Probably a lot of work, though.

@albertnetymk
Contributor

@kaeluka I don't see why compiling encore classes at runtime requires libllvm. I was thinking that just calling shell commands from the testing language would be enough.

@kaeluka
Contributor

kaeluka commented May 21, 2016

@albertnetymk: I shouldn't have brought this up in this thread. We should discuss this outside of the thread :) Edit: messaged you on slack.

@albertnetymk
Contributor

@kaeluka Having multiple tests in the same file is essential for progressively testing a feature, IMO.

  • testcase1: the compiler doesn't complain about the new language construct
  • testcase2: the compiler complains about wrong usage (parser or typechecker)
  • testcase3: the compiler generates an executable
  • testcase4: the executable runs and gives the expected output

As for testing modules, we could create module files on the fly and remove them when the test is done.

@kaeluka
Contributor

kaeluka commented May 21, 2016

With the current design, you'd make a directory called feature. Within that directory, you implement all your test cases: feature/testcase1.enc|.out, feature/testcase2.enc|.chk, and so on. In the end, you see a summary that tells you which test cases failed. You'll be able to structure your tests exactly the way you want, using one file per test. This has the advantage that it keeps the testing scripts much simpler, and that you can use syntax highlighting (and whatever tooling we build in the future) when writing or reading tests.
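
(For concreteness, such a suite might look as follows; the file names are hypothetical and the extensions follow the .out/.chk/.fail convention discussed above:)

    feature/
      testcase1.enc    -- test program
      testcase1.out    -- expected output, exact match
      testcase2.enc
      testcase2.chk    -- partial match against the output
      testcase3.enc
      testcase3.fail   -- partial match against the compiler error
      helper.enc       -- no sibling expectation file, so not a test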

@albertnetymk
Contributor

Agree to disagree. For the record, I don't believe having a directory per feature is practical.

@kaeluka
Contributor

kaeluka commented May 21, 2016

Having the option is practical. If you don't want a directory, you're still able to write one monolithic test.

@supercooldave
Author

In practice, we haven't had a directory-per-feature. We've had most features in one directory, which is even more impractical.

The changes @kaeluka will make will lift the testing framework to the next level, not to the final level.

@kaeluka
Contributor

kaeluka commented May 21, 2016

The changes @kaeluka will make will lift the testing framework to the next level, not to the final level.

+12

@supercooldave
Author

In case this wasn't said above, the .fail and .chk files should not be scripts, but the text to partially match the compiler output against in the case of failure and success, respectively. The current approach is to run a script, but this isn't easy for script-dopes like me.
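
(A hedged Python sketch of checking under that convention: expectation files are plain text to match, not scripts. The mode names are made up for illustration and correspond to the discovery sketch earlier in the thread:)

    def check(mode, expected_text, compile_out, run_out):
        # The expectation file is plain text to match, not a script:
        # .out is the exact program output, .chk a substring of it,
        # .fail a substring of the compiler's error output.
        if mode == "exact-output":
            return run_out == expected_text
        if mode == "partial-output":
            return expected_text in run_out
        if mode == "expect-compile-error":
            return expected_text in compile_out
        raise ValueError("unknown mode: %s" % mode)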

@supercooldave
Author

@kaeluka I've changed my mind about the way files are excluded from tests.
The current suggestion is “if there's no .out file (etc.), then the corresponding .enc file is not a test”.
This does not help address the problem of disabling tests, which is needed if a feature is broken for a while.
Proposed solution: a file EXCLUDING.txt (or something like that) that lists the names of .enc files (and directories) that are not to be considered test cases (or test case directories). All other .enc files and subdirectories should be considered as containing tests.
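
(A hedged Python sketch of how such an exclusion file could be interpreted; the file name EXCLUDING.txt and its semantics are taken from the proposal above, everything else is an assumption:)

    import os

    def load_exclusions(suite_dir):
        # Read EXCLUDING.txt if present: one file or directory per line.
        path = os.path.join(suite_dir, "EXCLUDING.txt")
        if not os.path.exists(path):
            return set()
        with open(path) as f:
            return {line.strip() for line in f if line.strip()}

    def is_excluded(rel_path, exclusions):
        # A listed directory excludes everything beneath it.
        parts = rel_path.split(os.sep)
        return any(os.sep.join(parts[:i + 1]) in exclusions
                   for i in range(len(parts)))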

@kaeluka
Contributor

kaeluka commented May 23, 2016

I don't like this solution... It's very hard to discover that magical file name if you're not aware of it. Plus, the testing framework's code would become more bug-prone...
What about just renaming the file to foo.out.disabled? That would still work, even with support for EXCLUDING.txt. What do others think?

@supercooldave
Author

I don't see how that is any better.
Some directories (modules) will be cluttered with .disabled files for files that won't even have a .out file.
With your scheme, how do I disable a subdirectory?

The magic filename problem can be solved by just putting an empty EXCLUDING.txt in each test directory, or by reading the manual, or by relying on a spot of nous.

@kaeluka
Contributor

kaeluka commented May 23, 2016

This is better because it doesn't officially sanction disabling tests. If you want to disable that many tests, the code should not be merged, or the test directories should be deleted. Having lots of dead code is just going to create a mess. The tests can be re-added (or the code merged) once the bugs have been fixed.

But tests are supposed to prevent bugs from going in -- our response to tests not passing should not be to disable test suites.

@supercooldave
Author

supercooldave commented May 23, 2016

I think you are missing the point. The module system requires files not to be included as tests, but the files need to be there. Some of these files will be in subdirectories.

I'm not just making these cases up.

@supercooldave
Author

One final thing. It would be useful to be able to type "make" in any of the subdirectories and have only those tests run.

@albertnetymk
Contributor

If you want to disable so many tests, the code should not be merged, or the test directories deleted.

It's not uncommon to disable a whole directory of tests, since one language feature requires multiple tests, and those need to be grouped within one folder if we use the file system directly. (One example we encountered was disabling the async keyword.)

It's true that we could delete the whole folder if we wanted to disable the tests, but I suppose the same reasoning applies to disabling a single test, and that would be cumbersome.

@supercooldave
Author

I agree with @albertnetymk. I don't think deleting failing test cases is a good thing. We'd too easily lose track of them.
