
Testing framework #3

Open
ladc opened this issue Feb 4, 2016 · 27 comments

@ladc
Collaborator

ladc commented Feb 4, 2016

Agree on a testing library and a test runner with minimal dependencies, preferably Lua/shell only.
Test cases should still be easy to run without any (big) dependencies.

@agentzh
Collaborator

agentzh commented Feb 4, 2016

As mentioned in #5, I hope that with the data-driven approach we can provide an official test runner (or test harness) while also allowing more sophisticated third-party test runners to run exactly the same test suite without modifications. The test cases themselves should be test-framework agnostic and stay very clean and self-contained. This also has the advantage that, if we wish, no Lua test framework library can affect the behavior of the test cases in any way.

@ladc
Collaborator Author

ladc commented Feb 22, 2016

@agentzh I've made a proof-of-concept test runner in the spirit of TestML, following the guidelines given by @MikePall in his comment on issue #5. I've adapted some tests as an example (in the directories test/libs and test/libs_ext).

Like ldoc, it treats --- as special; here it marks the start of a new test. The first line defines the test name. Subsequent comments preceded by -- contain the description, and tags are added in the description +this +way. The test file's path is also tokenized and added to the tags.

The tags allow you to select or exclude certain tests, which makes 'tiers' possible (slow/fast/lua52, etc.).

It also treats the code before the first --- marker as a prelude, which is included before each subsequent test.
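
For illustration, a test file in this format might look like the following (hypothetical test names, tags and helper function, not taken from the actual repository):

-- Prelude: everything before the first "---" runs before each test.
local function check(cond) assert(cond, "check failed") end

--- table.insert appends +table +fast
-- Inserting without a position index should append
-- to the end of the table.
local t = {}
table.insert(t, 10)
table.insert(t, 20)
check(t[2] == 20)

--- table.remove pops the last element +table +fast
local t = {1, 2, 3}
check(table.remove(t) == 3)
check(#t == 2)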

It's here: https://github.com/ladc/LuaJIT-test-cleanup
Try running

luajit test.lua test/libs +table -lua52

or

luajit test.lua --help

Let me know what you think.

It still has a lot of TODOs of course, like choosing the filenames that failing code snippets are extracted to (currently it uses os.tmpname()), etc.

@fsfod
Collaborator

fsfod commented Feb 22, 2016

It seems kind of fragile to try to parse out the bodies of the tests and splice in what you hope are the helper functions, instead of just rewriting all the individual tests in a file into functions that you can selectively call.

@ladc
Collaborator Author

ladc commented Feb 22, 2016

@fsfod Thank you for your feedback. I tried to find a solution that's minimally invasive to the tests, while avoiding the need to include a YAML parser or the like, and while allowing the tests to be run as-is.

Of course a declarative framework like this would need a minimal specification for the tests' format, which would include that the chunk before the first --- contains the helper functions. And some tests would need to be adapted to comply with this.

@Wiladams
Collaborator

I wonder why Lua itself could not be the test format. Why rely on something else, when Lua is perfectly capable of doing the job?


@ladc
Collaborator Author

ladc commented Feb 23, 2016

@Wiladams That would require more invasive changes to the tests, and they would no longer be runnable as-is. Or it would require putting every (sub)test into a different file, which is not ideal either.

@CapsAdmin

Coming up with names for all the subtests seems a bit difficult if they were to be split into files, but I guess you could just name them table/insert1.lua, table/insert2.lua, etc. instead.

I think parsing a test for subtests is fine. I would match subtests by balanced do end and leave the top part of the test as shared code. For instance, https://github.com/LuaJIT/LuaJIT-test-cleanup/blob/master/test/ffi/ffi_new.lua has 10 subtests, where lines 1 to 11 would be the shared header for each subtest. Then moving your comment-based config inside the do end (or to the left of the do), while adding a symbol of some sort to tell the people reading the code that this is something that's being parsed, would make it seem a little less "fragile".

I don't know how important it is to not have dependencies, but if you're going to use a stable version of LuaJIT to run the tests, you could just use the ffi to iterate directories instead of relying on lfs (see the sketch below). But that's something that should probably be looked at later.
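
As a rough illustration of that ffi approach, directory iteration can be done with POSIX calls alone. This is a minimal sketch assuming the Linux/glibc struct dirent layout; field layout differs on other platforms:

local ffi = require("ffi")

-- POSIX directory iteration via the ffi instead of lfs.
-- NOTE: this dirent layout is the Linux/glibc one; other OSes differ.
ffi.cdef[[
typedef struct DIR DIR;
struct dirent {
  uint64_t       d_ino;
  int64_t        d_off;
  unsigned short d_reclen;
  unsigned char  d_type;
  char           d_name[256];
};
DIR *opendir(const char *name);
struct dirent *readdir(DIR *dirp);
int closedir(DIR *dirp);
]]

local function listdir(path)
  local dir = ffi.C.opendir(path)
  assert(dir ~= nil, "cannot open " .. path)
  local names = {}
  while true do
    local entry = ffi.C.readdir(dir)
    if entry == nil then break end
    names[#names + 1] = ffi.string(entry.d_name)
  end
  ffi.C.closedir(dir)
  return names
end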

@ladc
Collaborator Author

ladc commented Feb 26, 2016

OK, the --- divider is indeed a bit fragile, since the subtests can still leak (local) variables if they don't have do ... end delimiters. How would you match this in a robust way without adding too much machinery?

It's indeed better to drop the dependency on lfs in favour of an ffi solution (see ljsyscall and TINN).

@ladc
Collaborator Author

ladc commented Feb 26, 2016

I wrote

It's indeed better to drop the dependency on lfs in favour of an ffi solution (see ljsyscall and TINN).

OTOH if you're not testing a stable LuaJIT (or if you're in the process of porting it) you may appreciate a test runner that works in vanilla Lua...

@Wiladams
Collaborator

Or minilua


@ladc
Collaborator Author

ladc commented Feb 27, 2016

@Wiladams I also considered minilua. But keep in mind that minilua lacks some core libraries and functions. If I understand correctly from genminilua.lua, the following are not included:

  collectgarbage dofile gcinfo getfenv getmetatable load print rawequal rawset
  select tostring xpcall
  foreach foreachi getn maxn setn
  popen tmpfile seek setvbuf __tostring
  clock date difftime execute getenv rename setlocale time tmpname
  dump gfind len reverse

@CapsAdmin

@ladc instead of trying to come up with a parser that matches do end, you could do something like:

--#header foo {
local ffi = require("ffi")
local bit = require("bit")

dofile("../common/ffi_util.inc")

ffi.cdef([[
typedef struct { int a,b,c; } foo1_t;
typedef int foo2_t[?];
void *malloc(size_t size);
void free(void *ptr);
]])
--#}

do 
  --#subtest test {
  --#include foo
  assert(ffi.sizeof("foo1_t") == 12)
  local cd = ffi.new("foo1_t")
  assert(ffi.sizeof(cd) == 12)
  local foo1_t = ffi.typeof("foo1_t")
  assert(ffi.sizeof(foo1_t) == 12)
  cd = foo1_t()
  assert(ffi.sizeof(cd) == 12)
  --#}  
end 

do --#subtest test2 {
  --#include foo
  assert(ffi.sizeof("foo2_t", 3) == 12)
  local cd = ffi.new("foo2_t", 3)
  assert(ffi.sizeof(cd) == 12)
  local foo2_t = ffi.typeof("foo2_t")
  fails(ffi.sizeof, foo2_t)
  assert(ffi.sizeof(foo2_t, 3) == 12)
  cd = foo2_t(3)
  assert(ffi.sizeof(cd) == 12)
  --#}  
end 

do --#subtest test3 {
  --#include foo
  local tpi = ffi.typeof("int")
  local tpb = ffi.typeof("uint8_t")
  local t = {}
  for i=1,200 do t[i] = tpi end
  t[100] = tpb
  local x = 0
  for i=1,200 do x = x + tonumber(ffi.new(t[i], 257)) end
  assert(x == 199*257 + 1)
  --#}  
end

I think the prefix symbol makes it clear that this is something more than a comment.

Instead of making a pseudo-language you could use Lua itself somehow, at which point it becomes a generic macro thing (--#header = [[ ... --#]] ... --#test1 = [[ ... --#include(header) ... --#]]), but I'm not sure there would be much benefit or that it would make things easier.

@Wiladams
Collaborator

Here's an example of what I'm thinking. Just deal with the test file as if it were a database of individual test cases.

This is all pure Lua, so a Lua parser can deal with it. And if you want to split things out, you can easily do that by just querying the table and only running what you want.

The added benefit is that you get more metadata associated with the tests. So, if you want to run tests that are related to a particular area, you can easily do that.

I just want to separate the difference between the data and the mechanism to run the tests. The data itself can easily be represented in Lua, minimizing the need to create a different 'language' simply to represent the data.

local testCases = {
  {
    id = "constov1",
    desc = "test case to catch issue 2345",
    author = "williamaadams",
    issue = "2345",
    [[
      local t = { "local x\n" }
      for i=2,65537 do t[i] = "x="..i..".5\n" end
      assert(loadstring(table.concat(t)) ~= nil)
      t[65538] = "x=65538.5"
      assert(loadstring(table.concat(t)) == nil)
    ]]
  },
  {
    id = "constov2",
    desc = "test case to catch issue 2346",
    issue = "2346",
    [[
      local t = { "local x\n" }
      for i=2,65537 do t[i] = "x='"..i.."'\n" end
      assert(loadstring(table.concat(t)) ~= nil)
      t[65538] = "x='65538'"
      assert(loadstring(table.concat(t)) == nil)
    ]]
  }
}

if not os.getenv("SLOWTEST") then return end

-- Run all the test cases: the code string in each case's array part
-- is compiled and executed.
for _, tcase in ipairs(testCases) do
  assert(loadstring(tcase[1]))()
end


@CapsAdmin

That's probably how I would do it personally too, but if it's important (for some reason?) to keep the tests as-is in single Lua files, then I would do what I said.

@ladc
Collaborator Author

ladc commented Feb 27, 2016

@Wiladams That's a lot of boilerplate. If you need additional metadata, it can easily be parsed from the comments or tags; you could use a simple @key: value pair syntax in the description.
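
Extracting such attributes needs nothing more than a Lua pattern along these lines (a minimal sketch; the attribute values are made up):

-- Hypothetical description line with embedded attributes.
local line = "-- checks constant overflow @author: ladc @issue: 2345"

local attrs = {}
for key, value in line:gmatch("@(%w+):%s*(%S+)") do
  attrs[key] = value
end
-- attrs.author == "ladc", attrs.issue == "2345"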

I really don't see the need to add a lot of overhead to the tests. I think it's best to keep the tests clean and simple and to not end up with thousands of files (seriously, this can be very painful on some systems).

Please look at the example below:
https://github.com/ladc/LuaJIT-test-cleanup/blob/master/test/libs_ext/table.lua

The tests remain very readable and immediately runnable, while you still have a lot of flexibility, like adding metadata using the tags and possibly arbitrary attributes.

I'm also not sure this 'fragility' is much of a concern. The files will be "cut" at --- comments, and the individual chunks as such will still need to be valid.

The responsibility for not unintentionally leaking variables between the tests lies with the writer of the tests, as is already the case. This will require some curation, but that will be the case with any framework, and luacheck can certainly help there.

@ladc
Collaborator Author

ladc commented Feb 27, 2016

@Wiladams I've implemented your suggestion to allow for more metadata, using the @key: value syntax. Right now it's not yet possible to filter on these attributes as it is for tags, but this can easily be added if needed. In any case the functionality is there and the interface should be stable enough.

Please read the requirements by @MikePall again; I think I've addressed many of them in the proposed implementation. Missing features include documentation of the current features, parallelization, shuffling of tests, timing the runs to look for potential performance regressions, and running a test multiple times. And C(-API) tests are not handled yet.

@Wiladams
Collaborator

I didn't quite get the comment on "a lot of boilerplate". None of those attributes are required. If you leave them out, then what you have is essentially the same thing as what you're proposing, but instead of using a special notation in the comments, I just use the language itself to indicate what's what. That saves me from needing a special parser to separate out the meaning from the comments. I can just use plain old Lua.

Also, as far as the MikePall 'requirements' go, I take those as Mike's strong suggestions. He's not here to drive the effort, and he's said as much; he's leaving it in our hands. I thought I was following those strong suggestions: minimal dependencies, tests that can stand on their own, executable from a plain, normal Lua...

But, really, we should just go ahead and implement some things. We risk debating/prototyping ourselves into inaction. Not a single checkin since the original one...

@ladc
Collaborator Author

ladc commented Feb 28, 2016

Rewriting the tests as Lua strings in a table would add unnecessary, repeated code, which to me is boilerplate. But a more important downside to me is that you lose a lot with this approach: syntax highlighting and other editor tools like luacheck or lua inspect don't work on the code inside the strings; the test files will compile even with syntax errors in the tests; and you can't run the tests as-is. To be honest, I really don't look forward to refactoring the tests into this format.

The parser is not that special: it basically just matches three leading dashes and parses the comments for tags and attributes. It doesn't get much simpler than that. It's implemented in #6. The resulting table is basically the one you would define explicitly in your proposed format.
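
In outline, the splitting step amounts to no more than this (a simplified sketch of the idea, not the actual code from #6):

-- Split a test file into { name, tags, code } records at "---" markers.
-- Code before the first marker is kept as a shared prelude.
local function parse(source)
  local prelude, tests, current = {}, {}, nil
  for line in (source .. "\n"):gmatch("(.-)\n") do
    local header = line:match("^%-%-%-%s*(.+)")
    if header then
      current = { name = header:gsub("%s*%+%S+", ""), tags = {}, code = {} }
      for tag in header:gmatch("%+(%S+)") do current.tags[tag] = true end
      tests[#tests + 1] = current
    elseif current then
      current.code[#current.code + 1] = line
    else
      prelude[#prelude + 1] = line
    end
  end
  return prelude, tests
end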

Anyway, the required test format can still be modified by replacing the parse() function in the module tester.lua if need be. Could you please have a look at the other features of the test runner I proposed? For example:

  • one can filter tests by specifying tags as in +tag1 -tag2; this could be used for the different 'tiers', or possibly to replace conditions like if os.getenv("LUA52") then ... end (see the sketch after this list)
  • to run the tests in a separate state, specify the command used, e.g. --runcmd="luajit -joff"
  • failed tests are extracted into the failed_tests dir (with an error report appended as a comment).
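
The selection logic behind those +tag1 -tag2 filters boils down to something like this (a minimal sketch, not the actual implementation from #6):

-- A test runs only if it carries every "+" tag and none of the "-" tags.
local function selected(testTags, include, exclude)
  for _, tag in ipairs(include) do
    if not testTags[tag] then return false end
  end
  for _, tag in ipairs(exclude) do
    if testTags[tag] then return false end
  end
  return true
end

-- e.g. selected({table = true, fast = true}, {"table"}, {"lua52"}) --> true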

The command line tool's output to stdout is currently quite crude and the features are not complete. But if you think it could be useful (test format aside), please consider merging my pull request #6.

@Wiladams
Collaborator

I think you should just do a pull request and we can move from debate to reality. I'm not married to my idea as much as I was just demonstrating a concept.


@ladc
Collaborator Author

ladc commented Feb 28, 2016

Well, the code has already been written and the pull request has been sitting here for a week... See #6.

@fsfod
Collaborator

fsfod commented Feb 28, 2016

I don't mind metadata that can be ignored, but I still prefer to wrap all the individual tests in a file in functions and use telescope to run them, optionally using your metadata to control which tests run and to declare special things to check, like whether loops were JIT-compiled, instead of what I currently do. Maybe that's because I started with a different test setup from MikePall's, but extracting failing tests doesn't seem that useful to me, based on my experience implementing intrinsic support for LuaJIT. I would just set a telescope test filter to only run the failed tests I wanted, based on the test name from the test declaration it("test name", function() end). I also ended up sticking what would be your tags in the test name as well.

@Wiladams
Collaborator

@ladc, just to be complete:
{
  id = "constov2",
  desc = "test case to catch issue 2346",
  issue = "2346",
  function()
    local t = { "local x\n" }
    for i=2,65537 do
      t[i] = "x='"..i.."'\n"
    end
    assert(loadstring(table.concat(t)) ~= nil)
    t[65538] = "x='65538'"
    assert(loadstring(table.concat(t)) == nil)
  end
}

This can also work. You don't need to put the test code into a literal string; it can be bracketed by anything that can show up discretely in a table. At this point, eliminating the metadata and turning the curly braces into '---', we have exactly the same thing, except mine is parseable by Lua directly, not requiring any sort of test parser.
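
A runner for this shape stays equally small (a hypothetical sketch, assuming each test file ends with return testCases; the file name is made up):

-- The test file is plain Lua that returns its table of cases,
-- so loading it needs no custom parser.
local testCases = dofile("test/constov.lua") -- hypothetical file name

for _, tcase in ipairs(testCases) do
  print("running", tcase.id)
  tcase[1]() -- the bare function in the array part is the test body
end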

@agentzh
Collaborator

agentzh commented Mar 3, 2016

Using Lua for the test spec is very bad, since it makes alternative test scaffolds written in other languages (like Perl and Python) much, much harder. We need the capability to run the same test suite in various wildly different ways. BTW, TestML supports custom section delimiters other than ---:

http://testml.org/specification/language//index.html

I believe it's very wrong and limited to assume the test scaffold is always Lua.

@Wiladams
Collaborator

Wiladams commented Mar 4, 2016

I'm not sure you're reading my comments correctly.
There is a difference between how you represent the test data, and how you run the test cases. What I've been trying to point out in this thread is that there's not much difference in my eyes between using:
--- annotation here

and:

{ annotation = "here"

In either case, you can still have a test runner in PHP or whatever language you choose. It just so happens that the annotations I'm selecting are parseable using Lua as well.

The real question to me is: "can we choose a test case format that is easily parseable by Lua as well as other languages?" Is it necessary to make it harder for Lua to be the test case parser?

At any rate, I'm not pushing for anything here because I don't think it's worth the argument. The thing that should be selected is the thing we have tools for. It will no doubt change over time anyway.

@ladc
Collaborator Author

ladc commented Mar 4, 2016

Whichever representation we choose right now, it'll be trivial to parse and reformat the tests into the format we ultimately decide to go for.

@Wiladams
Collaborator

Wiladams commented Mar 4, 2016

I agree. It's more important to get some momentum at the moment.


@lukego

lukego commented Oct 11, 2016

Help! I really need a simple way to drive the tests for the CI in #10. The ideal would be:

  • Simple way to enumerate the names of all available benchmarks.
  • Simple way to run a benchmark by name (using some sane default parameters).
  • Uniform way to calculate a score for each benchmark e.g. elapsed time until process terminates (with non-zero exit status on failure).

Could any kind soul provide this?

The current solution is basically to enumerate the tests with ls bench/*.lua and to run them with luajit bench/$test.lua. This is not ideal, though: not every Lua source file is a stand-alone test case (some contain none, some contain many), not every test has sane defaults (e.g. running for at least 0.1s to amortize startup costs), and not every test case can necessarily be scored by execution time (they don't always do a fixed amount of work).
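
For concreteness, the kind of driver I have in mind is no more than this (a hypothetical sketch in plain Lua; the benchmark names are made up, and os.time gives only one-second resolution, so a real driver would want a finer clock):

-- Run each benchmark as a child process; score = elapsed wall-clock seconds,
-- FAILED on a non-zero exit status.
local benchmarks = { "md5", "scimark_fft", "scimark_lu" } -- hypothetical names

for _, name in ipairs(benchmarks) do
  local t0 = os.time()
  local status = os.execute("luajit bench/" .. name .. ".lua")
  local elapsed = os.difftime(os.time(), t0)
  print(name, status == 0 and elapsed or "FAILED")
end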

The drop-in solution would be to refactor bench/*.lua such that every file runs one test case, requires no parameters, and executes a fixed amount of work. This would require splitting up bigger benchmarks (e.g. scimark) and moving library code into a subdirectory (also scimark). However, that is just one possibility.

I am happy to take care of the CI side if somebody can support on the test suite side :).
