Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds 'include' command to include another bats file #42

Closed
wants to merge 11 commits into from

Conversation

nkakouros
Copy link
Contributor

BATS already has the load option to load and run bash code from another file. This PR adds an include option to embed external bats scripts at the position where the include statement is. If this takes the go ahead, I will implement tests and documentation.

@mbland mbland requested review from mbland and a team December 4, 2017 22:52
@mbland
Copy link
Contributor

mbland commented Dec 4, 2017

Two things: First, it appears your base branch is out of date. Can you please rebase on the latest master?

Second, this is an interesting idea, but can you explain a bit more regarding why you would need this behavior, as opposed to running the tests in the included file directly?

@nkakouros
Copy link
Contributor Author

I was accidentally working on the stable branch. I rebased my commit on the master branch.

My use case is this. I am using ansible to provision some machines online. I do this by loading a lot of custom roles. If you are not familiar with ansible roles, just think of them as modules of intended functionality, eg one role installs the apache server, another mysql, etc. For these roles, I have written small bats tests that ping/wget/ssh/etc to the online servers. These tests contain only test functions that do a very specific thing, eg wget the homepage from the apache server. Then, I have a root bats script that has a setup that setups a vpn connection, sets the active server for the test, etc. The smaller tests don't care about connection details. For instance, the apache tests just needs a variable that holds an ip address and it simply wgets from this ip. So, I want to dynamically include the smaller tests from my main bats file to end up with a complete test scenario. Also, depending on the server that I want to test, I dynamically include the tests that are relevant to that server, eg if it is an FTP server I load the tests that come with the FTP role instead of the apache tests. Hope the above is clear enough!

Copy link
Contributor

@mbland mbland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation, but I'm still having a difficult time envisioning exactly how this plays out. Can you include some sample code?

Also, could you not achieve the same outcome by having separate scripts that run a subset of Bats tests for each role/server/scenario? In other words, can the complexity be managed effectively at a higher level, rather than pushing it into Bats itself?

elif [[ "$line" =~ $BATS_INCLUDE_PATTERN ]]; then
file_to_include="${BASH_REMATCH[1]}"
file_to_include=$(eval "echo \"${BASH_REMATCH[1]}\"")
if [ -f "$file_to_include" ]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use [[ and ]] throughout.

file_to_include="${BASH_REMATCH[1]}"
file_to_include=$(eval "echo \"${BASH_REMATCH[1]}\"")
if [ -f "$file_to_include" ]; then
cat $file_to_include
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use this instead of cat to avoid spawning a process:

printf '%s\n' "$(< "$file_to_include")"

@@ -331,7 +331,20 @@ BATS_OUT="${BATS_TMPNAME}.out"

bats_preprocess_source() {
BATS_TEST_SOURCE="${BATS_TMPNAME}.src"
. bats-preprocess <<< "$(< "$BATS_TEST_FILENAME")"$'\n' > "$BATS_TEST_SOURCE"
BATS_TEST_SOURCE_WIP="${BATS_TMPNAME}.wip"
{ tr -d '\r' < "$BATS_TEST_FILENAME"; echo; } | bats-preprocess > "$BATS_TEST_SOURCE_WIP"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this block of changes necessary? Also, there are a lot of processes getting spawned here; please rewrite them using Bash-only constructs. (See #8 for the rationale on eliminating every subshell possible. TL;DR: This can have a significant impact on performance, especially on Windows.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given a bats script, this code will go through the file and if there is a @test or include statement it will call bats-preprocess. After the first round, it will scan the generated file again for @test or include statements. This is done until no such statements are encountered in the file. This is needed as an included bats script can have @test or include commands of its own, allowing for multiple levels of inclusion.

There are indeed many processes and files involved.

  • As we cannot read and write to the same file at the same time, we need at minimum two files; one to read and one to write to. I am using the $BATS_TEST_SOURCE and the $BATS_TEST_SOURCE_WIP files as the source and destination files, alternating their role in each loop round.
  • I could combine the two greps into one. To avoid the redirection, I could also make grep silent with -q but that would mess up the error exit codes. Redirecting to /dev/null/ should be fast anyway. Is it possible to grep the file using only bash constructs?
  • The mv is the fastest option I could think of from the different alternatives. It doesn't open a file and it basically renames.
  • Another note, line 342 should be changed according to your commit cf9a3b8.

Compared to bats on the master branch, this will add one call to grep and one to mv when processing a file without an include statement.

I cannot think of any other optimization, other than using a different approach altogether. Do you think there could be another approach to implementing this? Maybe loading the initial file contents in a variable and manipulating it instead of using files?

@nkakouros
Copy link
Contributor Author

I rewrote the heavy part using only bash internals and doing away with all the extra writing to file. For that, I had to modify the regex for matching @test blahblah {.. lines. Here are some simple benchmarks. I run the tests that come with bats 6 times in a row.

1st round:

master branch original commit new approach
real 0m18.675s real 0m21.685s real 0m18.946s
user 0m14.150s user 0m16.783s user 0m14.523s
sys 0m3.683s sys 0m4.485s sys 0m3.634s

2nd round

master branch original commit new approach
real 0m18.611s real 0m22.533s real 0m19.111s
user 0m14.212s user 0m17.156s user 0m14.586s
sys 0m3.435s sys 0m4.621s sys 0m3.581s

Compared to the master branch, the main overhead I guess comes from the two additional regex matches. Thus, further optimization should happen towards that direction.

Later, I will upload some example code where the include statement comes in handy. Even in my use case, include is not essential, but it really adds flexibility to on the one hand dynamically create tests and on the other to organize and reuse test code.

@nkakouros
Copy link
Contributor Author

nkakouros commented Dec 6, 2017

I also tweaked the two regex's I introduced to treat the greedy star and I removed capture groups as I didn't need them. I repeated the time calculations and if I am not mistaken this PR adds 2% overhead compared to the master branch.

The issue with the new approach is that if the test files are large then it takes more time compared to the previous versions where I used grep. But am I wrong to assume that test files are not that large usually? Maybe a couple megabytes at most?

@nkakouros
Copy link
Contributor Author

nkakouros commented Dec 6, 2017

Here is an example of my use case.

I have dozens of ansible roles. Each role has its own bats test. Eg a role that installs and configures OpenVPN has the following accompanying test:

@test "Connect to VPN" {
  local -i retry_interval=5
  local -i retries=3
  local -i max_time=$retry_interval*$retries

  run openvpn \
    --cd $config['dir'] \
    --config $config['file'] \
    --connect-retry $retry_interval \
    --connect-retry-max $retries \
    --daemon

  dev=config['dev']

  SECONDS=0
  while ! grep $dev <<< "$(ip link show up)"; do
    if  [ $SECONDS -gt $max_time ]; then
      break
    fi
    sleep $(($retry_interval+1))
  done

  [ "$status" -eq 0 ]
}

As you can see, this test expects a config array. Whenever I need to run this test, I supply this array. Now, I have test.sh file on the root of my project that says:

#!/usr/bin/env bash

function check_requirements {
  # code to check for existence of some needed tools
}

function read_connection_param {
  # reads ansible files and returns the ip of the server I want to test
}

function read_server_roles {
  # reads ansible files and finds which roles have been applied to a server
}

function load_role_tests {
  # returns the bats tests that accompany each role
}

server=$SERVER  # passed as an environment variable
config=$CONFIG # contains lots of stuff, not only the information that the OpenVPN test above makes use of
check_requirements
ip=$(read_connection_params $server)
roles=$(read_server_roles $server)

declare -a tests
for role in "${roles[@]}"; do
  role_test="$(load_role_tests $role)"
  tests["${#test[@]}"]=$role_test
  # just a showcase here, there can be multiple tests per role, bash details, etc
done  
compound_test=''
for test in "${tests[@]}"; do
  compound_test="$compound_test"$'\n'"$(< "$test")"
done
echo "$compound_test" > /tmp/test

bats ./test.bats

and a test.bats that says:

setup() {
...
}

@test "General tests, that are not specific to a server"...

include /tmp/test

teardown() {
  rm -rf /tmp/$server
}

Of course, I could have written manually the resulting test scenarios for each server, but there are lots of possible combinations of ansible roles and it would be really impractical to edit/create test files any time a server configuration changes. Moreover, I am planning to make even ansible role selection random so that it is impossible for me to know what a server will run. (If you wonder what is the real-life application of the above, we are running a pentesting platform and we want different components to be loaded each time a new user launches the platform). Finally, I could have made the above a simple shell script where the include /tmp/$server line could be bats /tmp/test but that would be less performant as there would be multiple calls to bats and it would produce several test reports instead of one total report per server.

Hope the above makes sense. The code above is not the production code I use, just some abstraction to showcase the use case. Surely, the above could be written differently and not require the include keyword. But if the overhead can be made as minimal as possible, it would add some nice flexibility.

Final note, if the original idea takes the green light, I would like to add the ability to include whole directories and discuss maybe renaming include to @include.

@mbland
Copy link
Contributor

mbland commented Dec 7, 2017

Thanks for the detailed illustration and for doing all the extra work I asked for. This is definitely worth considering further. (And thanks for the patience—as I mentioned in another issue, I've been stealing a few minutes each morning while traveling for work this week.)

That said, we're basically freezing new features until we get 0.5.0 out per #16. (Hopefully soon!)

In that time, I want to study this a bit more deeply, and give @bats-core/bats-core members and other interested folks a chance to evaluate this and weigh in.

Also, an alternative to consider: If you run bats "${tests[@]}" from test.sh (instead of building /tmp/test and ./test.bats), bats should pass all those test files to bats-exec-suite, which then executes bats-exec-test as many times as necessary, but produces a single test report. Would this, in combination with a common load test_helper (where test_helper.sh defines setup() and teardown()), not achieve the same effect?

@nkakouros
Copy link
Contributor Author

nkakouros commented Dec 7, 2017

Hm.. Yes, indeed! So, for my use case, bats already does what I need. But maybe it is still useful to have. Waiting for your and the others' feedback.

@mrrusof
Copy link

mrrusof commented Mar 20, 2018

I would love to use an include instruction. My use case is the following.

I am testing a command with several different test fixtures. For that purpose I write a bats file per fixture and corresponding test cases. I often find that I have to write the same test in different files because there is no easy way to just inline that test case, for example @test "Output is JSON object consisting of ..." { ... }.

Right now I cope with the problem in the following way. For each shared test, I write the body of the test as a Bash function in a separate file (e.g. function output_is_json { ... }, I load the file in the corresponding bats files, and write a test stub in each file (e.g. @test "Output is JSON ..." { output_is_json }).

When there is already a way that makes my work easier, I'm glad to hear about it. Otherwise, an include instruction seems very useful for structuring bats files.

Kudos to @tterranigma for his effort and vision.

@nkakouros
Copy link
Contributor Author

Would there be interested in merging this?

@nkakouros nkakouros closed this Jan 21, 2020
@martin-schulze-vireso martin-schulze-vireso mentioned this pull request Nov 6, 2020
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants