Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite the bash scripts in python #1215

Closed
clanmills opened this issue May 21, 2020 · 38 comments
Closed

Rewrite the bash scripts in python #1215

clanmills opened this issue May 21, 2020 · 38 comments
Assignees
Labels
testing Anything related to the tests and their infrastructure
Milestone

Comments

@clanmills
Copy link
Collaborator

clanmills commented May 21, 2020

Proposal

Rewrite the bash scripts in python. The purpose of this proposal is to remove bash from the test suite.

Design

The bash scripts are structured. They have a common core in functions.source which we can easily convert to a Python module.

So, copyFiles a b becomes ET.copyFiles('a','b')

runTest exiv2 -pa $filename becomes ET.runTest("exiv2 -pa %s" % filename)

ET.runTest() executes a command and returns an array of strings (line-by-line output).

We collect the lines as the test proceeds:

r=[]
r=r+ET.runTest("...........")
# And at the end we compare r to test/data/geotag-test.out
ET.reportTest("geotag-test",r) 

To avoid piping data via grep, cut and other linux utilities, we implement them in python. So:

ET.cut(del,field,array-of-strings) returns an array of shorter string
ET.grep("pattern",array-of-strings) returns a shorter array of strings

When we test the output with ET.report(), when the test fails we write the file tmp/test.out and the output can be inspected with your favourite diff utility.

Benefits

1 Cross Platform.
2 Simpler design than tests/system_tests.py.
3 Can be introduced as time permits.
4 No documentation changes!
5 We know the test is identical because we do not touch data/test.out.
6 Eliminate line-ending issues.
7 Eliminate diff, dos2unix, tr, pipes and other unix hackery.
8 Binary output support.

Effort

The new scripts can be developed one at a time without breaking anything.
When a test is converted, we replace the code in foo-test.sh with python3 foo-test.py
The existing scripts can be easily converted to python. We have 25 scripts with 1424 lines of code (average 56 lines/script). This is a one-week job.

Example

Here's geotag-test.sh

#!/usr/bin/env bash
# Test driver for geotag

source ./functions.source

(   cd "$testdir"

    printf "geotag" >&3

    jpg=FurnaceCreekInn.jpg
    gpx=FurnaceCreekInn.gpx
    copyTestFiles $jpg $gpx

    echo --- show GPSInfo tags ---
    runTest                      exiv2 -pa --grep GPSInfo $jpg
    tags=$(runTest               exiv2 -Pk --grep GPSInfo $jpg  | tr -d '\r') # MSVC puts out cr-lf lines
    echo --- deleting the GPSInfo tags
    for tag in $tags; do runTest exiv2 -M"del $tag" $jpg; done
    runTest                      exiv2 -pa --grep GPS     $jpg
    echo --- run geotag ---
    runTest                      geotag -ascii -tz -8:00 $jpg $gpx | cut -d' ' -f 2-
    echo --- show GPSInfo tags ---
    runTest                      exiv2 -pa --grep GPSInfo $jpg

) 3>&1 > $results 2>&1

printf "\n"

# ----------------------------------------------------------------------
# Evaluate results
cat $results | tr -d $'\r' > $results-stripped
mv                           $results-stripped $results
reportTest                                     $results $good

# That's all Folks!
##

In python, it'll look something like this:

#!/usr/bin/env python3

# Test driver for geotag
import exiv2Test as ET;

r=[]
jpg="FurnaceCreekInn.jpg"
gpx="FurnaceCreekInn.gpx"

t=  "geotag-test"
print(t)

r=r+     ET.      copyTestFiles(jpg,gpx)

r=r+     ET.      echo("--- show GPSInfo tags ---")
r=r+     ET.      runTest("exiv2 -pa --grep GPSInfo %s" % jpg)

tags=ET. runTest("exiv2 -Pk --grep GPSInfo %s" % jpg)

r=r+     ET.      echo("--- deleting the GPSInfo tags")
for tag in tags:
    r=r+ ET.      runTest("exiv2 -M'del %s' %s" % (tag,jpg)
r=r+     ET.      runTest("exiv2 -pa --grep GPS  %s" %  jpg)
r=r+     ET.      echo("--- run geotag ---")
r=r+     ET.      cut(' ',2,ET.runTest("geotag -ascii -tz -8:00 %s %s" % jpg gpx))
r=r+     ET.      echo('--- show GPSInfo tags ---')
r=r+     ET.      runTest("exiv2 -pa --grep GPSInfo %s" % jpg)

ET.reportTest(r,t)

# That's all Folks!
##
@clanmills clanmills added the request feature request or any other kind of wish label May 21, 2020
@clanmills clanmills added this to the v0.27.4 milestone May 21, 2020
@clanmills clanmills self-assigned this May 21, 2020
@D4N
Copy link
Member

D4N commented May 21, 2020 via email

@clanmills
Copy link
Collaborator Author

@D4N I don't think you've understood my intention. I'm wondering: "Is there a quick and simple way to remove our dependancy on bash".

I admire system_tests.py. Our future test strategy should be based on systems_tests.py

So, this is a proposal for v.0.27.4 and not 0.28.

@clanmills
Copy link
Collaborator Author

clanmills commented May 22, 2020

This is straight-forward and works on macOS, Windows (msvc/mingw/cygwin), solaris and ubuntu.

#!/usr/bin/env python3

import os
import shlex
import shutil
import subprocess

def error(s):
    print('**',s,'**')
def warn(s):
    print('--',s)

def chop(blob):
    lines=[]
    line=''
    for c in blob.decode('utf-8'):
        if c == '\n':
            lines=lines+[line]
            line=''
        elif c != '\r':
            line=line+str(c)
    if len(line) != 0:
        lines=lines+line
    return lines

def runTest(r,cmd):
    lines=[]
    try:
        # print('*runTest*',cmd)
        p        = subprocess.Popen( shlex.split(cmd), stdout=subprocess.PIPE,shell=False)
        out,err  = p.communicate()
        lines=chop(out)
        if p.returncode != 0:
            warn('%s returncode = %d' % (cmd,p.returncode) )
    except:
        error('%s died' % cmd )
            
    return r+lines
    
def echo(r,s):
    return r+[s]

def copyTestFiles(r,a,b):
    os.makedirs('tmp', exist_ok=True)
    shutil.copy('data/%s' % a,'tmp/%s' % a)
    shutil.copy('data/%s' % b,'tmp/%s' % b)
    return r

def cut(r,delim,field,lines):
    R=[]
    for line in lines:
        i = 0
        while i < len(line):
            if line[i]==delim:
                R=R+[line[i+1:]]
                i=len(line)
            else:
                i=i+1
    return r+R;

def reportTest(r,t):
    good = True
    R=chop(open('data/%s.out' % t ,'rb').read())
    if len(R) != len(r):
        error('output length mismatch Referance %d Test %d' % (len(R),len(r)))
        good=False
    else:
        i = 0
        while good and i < len(r):
            if R[i] != r[i]:
                error ('output mismatch at line %d' % i)
                error ('Reference: %s' % R[i])
                error ('Test:      %s' % r[i])
            else:
                i=i+1
    if not good:
        f=open('tmp/%s.out' % t , 'w')
        for line in r:
            f.write(line+'\n')
        f.close()

    print('passed %s' % t) if good else error('failed %s' %t )
    
# Update the environment
key="PATH"
if key in os.environ:
    os.environ[key] = os.path.abspath(os.path.join(os.getcwd(),'../build/bin')) + os.pathsep + os.environ[key]
else:
    os.environ[key] = os.path.abspath(os.path.join(os.getcwd(),'../build/bin'))

for key in [ "LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH" ]:
    if key in os.environ:
        os.environ[key] = os.path.abspath(os.path.join(os.getcwd(),'../build/lib')) + os.pathsep + os.environ[key]
    else:
        os.environ[key] = os.path.abspath(os.path.join(os.getcwd(),'../build/lib'))

r=[]
t=  'geotag-test'

warn('%s'       % t)
warn('pwd=%s'   % os.getcwd())
warn('exiv2=%s' % shutil.which("exiv2"))

jpg='FurnaceCreekInn.jpg'
gpx='FurnaceCreekInn.gpx'

r=          copyTestFiles(r,jpg,gpx)
r=          echo   (r,'--- show GPSInfo tags ---')
r=          runTest(r,'exiv2 -pa --grep GPSInfo tmp/%s' % jpg)

r=          echo   (r,'--- deleting the GPSInfo tags')
tags=       runTest([],'exiv2 -Pk --grep GPSInfo tmp/%s' % jpg)
for tag in tags:
     r=     runTest(r,'exiv2 -M"del %s" tmp/%s' % (tag,jpg))
r=          runTest(r,'exiv2 -pa --grep GPS  tmp/%s' %  jpg)
r=          echo   (r,'--- run geotag ---')
lines=      runTest([],'geotag -ascii -tz -8:00 tmp/%s tmp/%s' % (jpg,gpx))
r=          cut    (r,' ',2,lines)
r=          echo   (r,'--- show GPSInfo tags ---')
r=          runTest(r,'exiv2 -pa --grep GPSInfo tmp/%s' % jpg)
 
reportTest(r,t)

# That's all Folks
##

Looks very similar to bash:

    printf "geotag" >&3

    jpg=FurnaceCreekInn.jpg
    gpx=FurnaceCreekInn.gpx
    copyTestFiles $jpg $gpx

    echo --- show GPSInfo tags ---
    runTest                      exiv2 -pa --grep GPSInfo $jpg
    tags=$(runTest               exiv2 -Pk --grep GPSInfo $jpg  | tr -d '\r')
    echo --- deleting the GPSInfo tags
    for tag in $tags; do runTest exiv2 -M"del $tag" $jpg; done
    runTest                      exiv2 -pa --grep GPS     $jpg
    echo --- run geotag ---
    runTest                      geotag -ascii -tz -8:00 $jpg $gpx | cut -d' ' -f 2-
    echo --- show GPSInfo tags ---
    runTest                      exiv2 -pa --grep GPSInfo $jpg

Curiously, there's an unexpected wobble:

593 rmills@rmillsmm-local:~/gnu/github/exiv2/0.27-maintenance/test $ ./geotag-test.py 
-- geotag-test
-- pwd= /Users/rmills/gnu/github/exiv2/0.27-maintenance/test
-- exiv2=/Users/rmills/gnu/github/exiv2/0.27-maintenance/build/bin/exiv2
-- exiv2 -pa --grep GPS  tmp/FurnaceCreekInn.jpg returncode = 1 <-- Wobble !
passed geotag-test
594 rmills@rmillsmm-local:~/gnu/github/exiv2/0.27-maintenance/test $ 

When the command exiv2 -pa --grep GPS tmp/FurnaceCreekInn.jpg runs on the file from which all GPS tags have been deleted, it has a returncode = 1. I'll investigate.

@clanmills
Copy link
Collaborator Author

clanmills commented May 22, 2020

The code returning 1 is in src/actions.cpp. The comments says return -3, the code does return 1. I believe the negative return was changed because we wanted to keep return codes positive 0..255 Why set none-zero when nothing bad has happened. The man page says nothing about this.

I don't think we should do anything about this as it's not a error.

    // With -g or -K, return -3 if no matching tags were found
    int rc = 0;
    if ((!Params::instance().greps_.empty() || !Params::instance().keys_.empty()) && !ret) rc = 1;

    return rc;
} // Print::printMetadata

@clanmills
Copy link
Collaborator Author

clanmills commented May 22, 2020

I've had ideas about how to achieve this in system_tests.py.

The current construction of system_tests.py is built around the idea of executing a single command and comparing it to the output string stored in the python test file. Good architecture. The test is totally self contained.

We could extend the machinery to compare the total output to the reference file in test/data/foo.out

In this way, we'd need 25 new test scripts (in tests/bash_tests), which would be something very similar to what I've written above.

I still believe the project can be done in about 1 week. And if it's done in the system_tests, the effort will live into the future.

However, as I have said, this proposal relates to Exiv2 v0.27.4 which might never happen.

@D4N. Thoughts?

@clanmills
Copy link
Collaborator Author

clanmills commented May 23, 2020

The bash script test/geotag-test.sh remains in the test suite as documented for v0.27.3. It can be executed directly in the terminal, or via make. The script becomes a one-liner! The big win is that users of cmd.exe in Windows can run those scripts without requiring msys2/bash. And the cmake command cmake --build . --config Release --target bash_tests will work without using the special PATH documented in README.md.

Using system_tests.py Using exiv2Tests.py
python3 ../tests/runner.py ../tests/bash_tests/geotag-tests.py python3 ./geotag-tests.py

This is a clear winner and fits perfectly into Dan's tests/system_tests.py. I think Dan's objections are not well grounded. This proposal moves our test apparatus forward without loss of existing infrastructure. If the project Exiv2 v0.27.4 is undertaken, I will implement this in test/exiv2Tests.py

I don't intend to contribute to Exiv2 v0.28. test/exiv2Tests.py can be ported into tests/system_tests.py and it's up to the engineers on the v0.28 project to decide what should be done.

@clanmills
Copy link
Collaborator Author

Here's another reason to go down this "pythonic bash_tests" road. Currently, there's no obvious way to test a cross-compiled build on wine. In README.md, I documented how I tested it using a shared drive. And I've speculated that we could install MinGW/msys2 in wine.

However, there's another way. We can install easily python3 in wine and run the 100% python test suite. I've downloaded a zip of python38 from python.org. It's already built and ready to go.

$ cd ~/temp/python3
$ wine cmd.exe
z:\home\rmills\temp\python3> python3

Incredible, almost. It crashes in about 1 second.

Z:\home\rmills\temp\python-3>wine: Call from 0x7b43cfbc to unimplemented function api-ms-win-core-path-l1-1-0.dll.PathCchCanonicalizeEx, aborting
wine: Unimplemented function api-ms-win-core-path-l1-1-0.dll.PathCchCanonicalizeEx called at address 0x7bc50023:0x7b43cfbc (thread 0034), starting debugger...
Unhandled exception: unimplemented function api-ms-win-core-path-l1-1-0.dll.PathCchCanonicalizeEx called in 32-bit code (0x7b43cfbc).

It called an API which isn't in wine. However this is very promising. I hope to get this working for Exiv2 v0.27.4.

@clanmills
Copy link
Collaborator Author

clanmills commented May 28, 2020

I've reported the wine/python3 issue to the python team:
https://bugs.python.org/issue40803

I've reported the wine/python3 issue to the wine team:
https://bugs.winehq.org/show_bug.cgi?id=49271

@clanmills
Copy link
Collaborator Author

The wine people have said "Fixed in Wine 4.0". I'll test my prototype (above) of geotag-test.py. Should work.

@clanmills
Copy link
Collaborator Author

Here's another, and possibly even simpler way to achieve the goal of getting those bash scripts to run on Windows. Use the python cmd2 module:

https://www.youtube.com/watch?v=pebeWrTqIIw

In the presentation, Todd said "you can run shell commands and pipe the data exactly the same on Windows, Linux - it doesn't matter". This is music to my ears. We simply modify the bash scripts to run in the cmd2 environment and we're done. Don't touch anything in system_tests.py @LeoHsiao1 Interested?

@LeoHsiao1
Copy link
Contributor

I tried using cmd2. It executes successfully on Linux:

[root@CentOS ~]# python3 test_cmd2.py 
(Cmd) help | grep py | wc -l
1

But it cannot execute bash commands directly on Windows:

(Cmd) help | grep py | wc -l
'grep' is not an internal or external command, nor is it a runnable program
Pipe process exited with code 255 before command could run

@clanmills
Copy link
Collaborator Author

Thanks for looking at this, @LeoHsiao1

Ah, yes. If grep's not there, we'll have to fake it. That's not so tough. You can define commands in cmd2. So, if you define do_grep, it'll get called. I don't have a definitive list of commands used by the test suite, however they are fairly simple utilities like cut, grep, sed, diff and checksum.

Another possibility is to get the code for those and compile them. However a lot of the code for those old unix utilities is really horrible. Ugly, ugly, code that compiles and inks lots of other horrible ugly code. However we could create simple similar commands e2_cut, e2_grep and so on. That's no so tough. However, the do_grep, do_cut strategy is probably easier as it's all written in python.

@LeoHsiao1
Copy link
Contributor

Oh, I thought CMD2 already provided these commands.
I'll keep trying to use CMD2

@clanmills
Copy link
Collaborator Author

clanmills commented Jul 19, 2020

I've had several new ideas about this. I'm writing down my thoughts. This isn't a design set in concrete. It's thoughts.

  1. What commands are run by the bash_tests?
    $ cd <exiv2>/build ; env EXIV2_ECHO=1 make tests
    Modify test/functions.source:
##
# echo commands and arguments
echoTest()
{
    local count=1
    for i in $@ ; do
        echo $((count++)):  $i
    done
    echo $@  >> $here/tmp/echo.txt      # add this line of code
    echo -----------------------
}

There are 621 programs run and all of them are built (exiv2 etc)

2 Analysis of test/*.sh and test/functions.sh using grep "|" ../test/*.sh

It's cp, mkdir, grep, sed, tr, cut, xmllint, md5sum and diff

We don't need xmllint as it is only use to "prettify" XMP being extracted from the images. However, I'm sure we can easily create something in Python to do this. I've never thought about what md5sum does. I test two files for equality. I think md5sum uses something in zlib and I know python can do that also.

3 Chapter 7 in the book (I/O in Exiv2)

When I started writing the book, I started to write a new program samples/io.cpp. This was to demonstrate the BasicIo classes. So you could say:

$ io cp http://clanmills.com/Stonehenge.jpg .

So, io was the program, and it would implement verbs such as cp, grep, sed etc. It wouldn't be full featured implementation of every option in cp or grep. It would be bare bones. It would be interesting and curious and not necessarily very useful.

When I started working on tvisitor.cpp, it has an Io class which is similar (although not as comprehensive) as BasicIo. I decide that samples/io.cpp was unnecessary and removed it from the book (there's a relic in SVN around r4950).

If I restore and develop samples/io.cpp, we can use that from cmd2. So grep in the test suite is replaced by runTest io grep. We can use do_grep in cmd2 to mask calling runTest.

@clanmills
Copy link
Collaborator Author

clanmills commented Jul 19, 2020

This looks promising. I've taken the demo code "first_app.py" https://cmd2.readthedocs.io/en/latest/examples/first_app.html#first-application and added the command md5sum.

1300 rmills@rmillsmm-local:~/temp $ md5sum first_app.py 
c2a67740f7d52a32b54c0dd76d4b6c7b  first_app.py
1301 rmills@rmillsmm-local:~/temp $ ./first_app.py 
(Cmd) md5sum first_app.py
first_app.py c2a67740f7d52a32b54c0dd76d4b6c7b
(Cmd) quit
1302 rmills@rmillsmm-local:~/temp $

A big win here is that we get away from subtle differences in the platform implementation of diff, checksum, grep and the other utilities and only have our pythonic flavour. So we escape from bash and linux utilities. We can run this on every platform (Windows, macOS, Linux, UNIX). We have to implement the command-set required by our bash scripts.

1306 rmills@rmillsmm-local:~/temp $ echo 'md5sum first_app.py'      > my.script
1310 rmills@rmillsmm-local:~/temp $ echo 'run_script my.script;quit' | ./first_app.py
first_app.py c2a67740f7d52a32b54c0dd76d4b6c7b
1311 rmills@rmillsmm-local:~/temp $ cat my.script | ./first_app.py 
first_app.py c2a67740f7d52a32b54c0dd76d4b6c7b
1312 rmills@rmillsmm-local:~/temp $ 

Here's my first_app.py:

#!/usr/bin/env python
"""A simple cmd2 application."""
import cmd2
import argparse
import hashlib


class FirstApp(cmd2.Cmd):
    """A simple cmd2 application."""
    def __init__(self):
        super().__init__()
        shortcuts = cmd2.DEFAULT_SHORTCUTS
        shortcuts.update({'&': 'speak'})
        super().__init__(shortcuts=shortcuts)

        # Make maxrepeats settable at runtime
        self.maxrepeats = 3
        self.add_settable(cmd2.Settable('maxrepeats', int, 'max repetitions for speak command'))
    
    speak_parser = argparse.ArgumentParser()
    speak_parser.add_argument('-p', '--piglatin', action='store_true', help='atinLay')
    speak_parser.add_argument('-s', '--shout', action='store_true', help='N00B EMULATION MODE')
    speak_parser.add_argument('-r', '--repeat', type=int, help='output [n] times')
    speak_parser.add_argument('words', nargs='+', help='words to say')

    @cmd2.with_argparser(speak_parser)
    def do_speak(self, args):
        """Repeats what you tell me to."""
        words = []
        for word in args.words:
            if args.piglatin:
                word = '%s%say' % (word[1:], word[0])
            if args.shout:
                word = word.upper()
            words.append(word)
        repetitions = args.repeat or 1
        for _ in range(min(repetitions, self.maxrepeats)):
            # .poutput handles newlines, and accommodates output redirection too
            self.poutput(' '.join(words))


    def do_md5sum(self, path):
        """print md5sum of path"""
        # https://stackoverflow.com/questions/3431825/generating-an-md5-checksum-of-a-file
        def md5sum(fname):
            hash_md5 = hashlib.md5()
            with open(fname, "rb") as f:
                for chunk in iter(lambda: f.read(4096), b""):
                    hash_md5.update(chunk)
            return hash_md5.hexdigest()
        print(path , md5sum(path))

if __name__ == '__main__':
    import sys
    c = FirstApp()
    sys.exit(c.cmdloop())

@clanmills
Copy link
Collaborator Author

clanmills commented Jul 19, 2020

There are a several limitations that seem significant:

1 Piping stdout

You can only pipe output to shell commands. You can't pipe to cmd2 fake commands. Rats. It's not the end of the world. That implies we have to used named files in the test/tmp directory. I would like a cmd2 fake command such as prettyXML to format XML from stdin:

$ !exiv2 -pX foo.jpg | prettyXML  

We may have to express that as:

$ exiv2 -pX foo.jpg > data/tmp/temp.xml
$ printXML data/tmp/temp.xml

Perhaps we can persuade Todd to add a decorator to cmd2 to make that elegant and invisible! even if it's implemented with temporary files.

  1. Multiple commands on one line

Without hooking the readline parser, I'm not sure you can use ; to separate multiple commands on one line! I don't think this is important. For sure geotag-test.sh does not do this.

  1. Defining functions, variables, loops, logical operators and tests

We might need to jump in and out of python to achieve this.

@clanmills
Copy link
Collaborator Author

Here's another possibility. Use xonsh. This is a bash look-alike written in python3 and runs on Windows, macOS, Linux. cygwin64, MinGW/msys2. Can't get it to work on Solaris, FreeBSD or NetBSD.

@LeoHsiao1
Copy link
Contributor

I started looking at the test cases for Exiv2 after work today.
Our goal is to change all bash scripts in the exiv2-source/test directory to Python scripts? Even converting functions.source to a Python module?
Cmd2 does seem to simplify the transformation process. But I will do it slowly and carefully.

@clanmills
Copy link
Collaborator Author

Thanks @LeoHsiao1

I don't think we should use cmd2. That would be a mistake.

You can see my original proposal and prototype. I converted geotag.sh We don't have to bother with functions.source, it's only purpose was to provide a little library of bash functions. I added that about 2012 because the test suite was just a collection of random bash scripts with no structure at all at that time!

xonsh looks promising - however we need it to work everywhere. No exceptions. I couldn't get it install on UNIX (Solaris, FreeBSD and NetBSD) - probably because the python installed at the moment is < 3.5.

By the way, the test suite runs well on Windows when you install MinGW/bash and set up the path correctly.

Before you dive in and do work, please run the bash test suite in /test and Dan's excellent /tests/system_tests.py. Think carefully before doing a lot of work - your time is valuable and I want it used effectively.

@tleonhardt
Copy link

In regards to cmd2, we've been strongly considering adding an operator to allow piping to other cmd2 commands. So far we haven't had any users really asking us for it so we haven't prioritized it.

For things like defining functions and all that, we support a slightly extended Python syntax for scripting where it is all normal Python plus the ability to call cmd2 commands from Python and get their output. This is available via the built-in run_pyscript command. Unfortunately our documentation for this is sorely lacking, but it does link to a decent example.

For tests, we have always used pytest and the cmd2 unit tests themselves are good documentation for how to do this. But cmd2 also has a built-in regression testing framework we dubbed transcript testing which is a basic "when you run this command expect this output" sort of thing which allows regexes in the output.

@clanmills
Copy link
Collaborator Author

Thank You for your very helpful input, @tleonhardt. We should also look at pytest. As you know, there's always more than one way to do everything. Dan did a great job on the test suite a couple of years and rewrote about 150 scripts in python. Battle fatigue and other priorities took over and we have this legacy of 25 scripts which require attention. I worked in April and May on Exiv2 v0.27.3 and started thinking about this. As the bash scripts work well, there was no reason to delay the release. I'm really pleased that @LeoHsiao1 is interested in this matter.

I really like Python. It's very elegant, consistent and simplistic - quite the opposite of perl. I gave a 5 minute "lighting" presentation at PyCon2010 in Atlanta, Georgia about Python and Exiv2. 5 minutes of fun. Enjoy! https://clanmills.com/2010/PyCon/RobinPyCon2010.mov

@LeoHsiao1
Copy link
Contributor

From your demo code, it looks like you've been using Python for years.
Do you mean to use pytest to manage Exiv2 test cases? I like pytest, too.

@clanmills
Copy link
Collaborator Author

@LeoHsiao1 I don't use python very often. When I use it, it always seems clean and sensible. You use:

for item in items:
   ....

Makes a lot more sense than:

for ( const structure iterator item = items.begin() ; item != (*item).end() ; item++ )
    ... 

I wrote a compiler/interpreter in 1986 for a CAD system. It was called "PL" = Programming Language. The compiler accepted .pls (source) and generated .plc (code) which was executed by plr (runtime) and the debugger was pld. It was based on the PL/0 code Chapter 5 of Wirth's Book "Data Structure + Algorithms = Programs". PL was really like Python.

I don't know anything about pytest. Don't know anything about CTest either. The world of software is now huge. More and more has to be learned every year. Management continue with the myth that every engineer can handle everything. Thank Goodness Medical Doctors don't behave like that.

The bash scripts work well and I've carefully documented in README.md how to install MinGW and setup the PATH to run them. However, it would be great to replace the bash scripts with python scripts. You can see in the discussion that I agreed with Dan that we should extend his beautiful tests/systems_tests.py code to replace scripts such as test/geotag-test.sh with tests/geotag-test.py.

@LeoHsiao1
Copy link
Contributor

This is my current understanding:

  • There are 32 bash scripts in the Exiv2-Source/test directory that need to be converted into Python scripts.
  • There are over a hundred stable Python test scripts available in the Exiv2-source/tests directory. system_tests.py does an excellent job of simplifying the writing of test cases.
    leo@Leo:/mnt/d/1/exiv2-0.27.2-Source$ find test -name "*.sh"  | wc -l
    32
    leo@Leo:/mnt/d/1/exiv2-0.27.2-Source$ find tests -name "*test*.py"  | wc -l
    160

You mentioned two ways on May 23:

Using system_tests.py Using exiv2Tests.py
python3 ../tests/runner.py ../tests/bash_tests/geotag-tests.py python3 ./geotag-tests.py

Do you mean to convert bash scripts to Python scripts that depend on system_tests.py or exiv2tests.py?

@clanmills
Copy link
Collaborator Author

I don't think we have exiv2tests.py - I think that's something invented by GitHub auto-complete!

I think there are 25 scripts:

527 rmills@rmillsmbp:~/gnu/github/exiv2/0.27-maintenance $ find test -name "*.sh" | wc
      25      25     473
528 rmills@rmillsmbp:~/gnu/github/exiv2/0.27-maintenance $ 

There are other shell scripts which have nothing to do with test:

./ci/install.sh
./ci/install_dependencies.sh
./ci/run.sh
./contrib/coverity.sh
./contrib/vms/setup.sh
./contrib/vms/setup_user.sh

tests/system_tests.py is very nice code. However to "pipe" the data through a filter (such as grep, sed, or cut) involves redefining the stdin/stdout handler and I think that's a little clumsy. I think you should investigate what I've put into the proposal.

What I like about system_tests.py is that the test program is complete. It has both the test and the reference output in a single package.

However, I know Dan found it to be a lot of work. For example, the old script test/bugfixes.sh became about 100 python scripts! test/bugfixes.sh was a jumble of tests that had accumulated over about 15 years. You may find that a script such as icc-test.sh becomes lot of different scripts. We shouldn't do that. We will make mistakes. If we're running the test icc-test, I think it's fine to generate the output in tests/tmp/icctest.out and compare it to the reference output in test/data/icc-test.out. It's more than fine. It's desirable. We only want to test if the behaviour changes.

Please do the comparison in python and not the utility diff which has different options on different platforms and handles line-endings differently on different platforms.

I strongly encourage you to run the existing scripts (both the "new" python script in tests/ and the "old" bash scripts in test/). Look and think before writing lots of new code. I've made a proposal and implemented a prototype for geotag-test. When you understood what already exists, you can choose to ignore my proposal and go down another road. However, I'd like to review your replacement for one test (eg geotag-test or icc-test) before you spend a lot of time on this.

I'm not very familiar with the internals of system_tests.py, however I added a feature to that for v0.27.3 and thought the code was well written and quite easy to understand. If you're stuck, Dan will probably help you.

@clanmills
Copy link
Collaborator Author

BTW, I case you think I was making up the story about writing a compiler and interpreter, here's proof. Some pages from the user manual and a typical drawing made by the CAD system. Very proud of that project. They laid me off in 1991 when the business was in trouble and I went to AGFA in Belgium to write a PostScript interpreter. And then to Adobe and 15 years in Silicon Valley. It's been an adventure. And now I'm retired and writing my book!

img_1691
img_1690
img_1689

@clanmills
Copy link
Collaborator Author

Thank You for doing this, @LeoHsiao1 I will review this today (Sunday). Where is the code? Have you submitted a PR. Or can you attach a zip to this issue.

I'm working on IPTC for the book today. I haven't worked on ISOBMFF for a couple of weeks.

@LeoHsiao1
Copy link
Contributor

The content I submit is in this directory:: https://github.com/LeoHsiao1/exiv2/tree/master/test/test_py

@clanmills
Copy link
Collaborator Author

Thanks. I'll look at that later today.

@clanmills
Copy link
Collaborator Author

clanmills commented Jul 26, 2020

Thank You @LeoHsiao1 Your code is beautiful and implements my design perfectly. I'm delighted. Buy yourself a beer!

I think it's working as follows:

528 rmills@rmillsmbp:~/gnu/github/exiv2/0.27-maintenance/test/test_py $ pytest .
/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/_pytest/compat.py:333: PytestDeprecationWarning: The TerminalReporter.writer attribute is deprecated, use TerminalReporter._tw instead at your own risk.
See https://docs.pytest.org/en/latest/deprecations.html#terminalreporter-writer for more information.
  return getattr(object, name, default)
===================================================================== test session starts ======================================================================
platform darwin -- Python 3.8.3, pytest-5.4.3, py-1.9.0, pluggy-0.13.1
rootdir: /Users/rmills/gnu/github/exiv2/0.27-maintenance/test/test_py
plugins: xonsh-0.9.18
collected 2 items                                                                                                                                              

test_sample.py ..                                                                                                                                        [100%]

====================================================================== 2 passed in 0.49s =======================================================================
529 rmills@rmillsmbp:~/gnu/github/exiv2/0.27-maintenance/test/test_py $ 

Some comments:

  1. Where is the temporary output? Please don't delete that for a couple of reasons:
    a) I like the comfort of inspecting it to be sure beyond doubt that the test ran as intended!
    b) When the test fails, it's usually because exiv2(.exe) is output something something slightly different from the reference output in test/data/geotag-test.out. So, the most common way to fix that is to $ cp test/tmp/geotag-test.out test/data/geotag-test.out, run the test again and then submit the change to Mrs Git.

  2. Can you be a little more verbose when you the tests.

Have a look at what Dan did in tests/system_tests.py. There's a command-line argument python3 runner.py --verbose to get more extensive output. In test/Makefile, I sniff the environment string VERBOSE and set --verbose when system_tests is run.

  1. Move everything into tests and add your utils code into systems_tests.py I think you can use popen3 to run pytest from runner.py. I agreed with Dan that we would adopts system_tests.py going forward and one happy day we'll delete the scripts test/*.sh. The reference output has to stay in test/data for the moment.

  2. Spelling of Refererence

  3. Code layout and variable names

Your code is neat and careful. I like code to be in "columns" whenever possible because my eye can read it more quickly and with less thought.

abc    = this and that;
longer = that and this;

You've used a variable called _output and another called output in the same script. I keep variable_ for class members. I can't remember exact what you're doing here, however I'm sure you can make that look nicer and less obscure.

  1. iotest

I added a lot of HTTP stuff to that in v0.27.3 I spin up the python web-server to do some HTTP testing. The new code has to be retained. If you want to move it to another file, that's fine. By the way, it was Dan's idea to spin up the python web-server and that has worked well on every platform.

  1. Please submit a PR

When you've got this into a PR, I'll undoutably have more to say. I think you should ask Dan to review it. Dan is really smart and thinks of things that nobody else considers. You'll like him a lot. Great Engineer.

  1. Thank You

I'm really delighted by your work. Thank You Very Much.

@LeoHsiao1
Copy link
Contributor

  • Oh, I found my spelling was wrong.

  • Using pytest -v executes test cases in verbose mode.

  • As for the prompt message, my plan is: If the test case passes, only 'xx PASSED' is displayed. If the test case does not pass, the detailed stack is displayed. Just as follows:

    image

  • Should I commit the current small amount of code to PR, or should I wait until all the test cases are written? To which branch?

@clanmills
Copy link
Collaborator Author

Please submit the PR to 0.27-maintenance.

Let's do this in steps. Integrate your utils code into tests/systems_tests.py and put something in tests/runner.py to execute pytest for the user. You'll see that in "non-verbose" mode, Dan emits a string of dots as the tests proceed. In "verbose mode" there is a line of output for every test. I find verbose very useful when I add new tests because I can see that they are run. And you can run a single test with the syntax: python3 runner.py --verbose bugfixes/redmine/test_issue_841.py

If you modify runner.py, you'll have have clear sight of the --verbose switch and do not need to modify test/Makefile.

You will probably have to put your scripts into test/python_scripts that is hidden from runner.py and only contains scripts understood by pytest.

Once the PR is in place, you can deal with the tests one at a time.

At sometime in the future, you or I will open a new issue to say "Remove the old bash tests in test/*.sh. As currently documented the command:

$ cd build
$ make bash_tests    # runs the bash scripts
$ make python_tests # runs tests/runner.py

When you have finished the PR, and it has been reviewed and integrated in 0.27-maintenance, we'll have a period when we are running the real bash scripts AND your nice new scripts.

By the time of the next release, I hope the bash scripts will be history.

I don't know when or what's going to happen next with the project. Since January, there has been almost nothing submitted to 'master'. I shipped 0.27.3 in June. I have offered to ship either 0.27.4 or 0.28. I've had friendly correspondance with the guys, however they are inactive. When I decided at the end of March to do 0.27.3, I hoped they would use the C-19 lock-down to do some work on 'master'. They are three really good guys. They have a of other pressure in their life from the office, their partners and so on. Both Dan and Luis have done a lot of great work on 0.27 and 0.28. Kevin did great security work last year on 0.27.2 (and 0.27.3).

@clanmills clanmills added testing Anything related to the tests and their infrastructure and removed request feature request or any other kind of wish labels Sep 4, 2020
@clanmills
Copy link
Collaborator Author

Huge thanks to @LeoHsiao1 for successfully undertaking this project and submitting PR #1257.

There are several minor topics remaining such as removing bash from $ make version_test/unit-test #1274 and removing the environment string EXIV2_EXT from the code base #1273.

Currently the command $ make bash_tests runs both the "old" bash tests and $ make python-tests runs the python tests/system_tests.py AND the new python tests/bash_tests/test-cases.py. Let's retain this "dual test" capability for a few weeks and I will open a new issue to finally "kill off" all dependency on bash and the removal of the scripts exiv2dir/test/*.sh and exiv2dir/test/functions.so.

This issue will remain open until until all use of bash in the test suite is eliminated.

@clanmills
Copy link
Collaborator Author

Thanks to @LeoHsiao1 for undertaking the work to bring this to a successful conclusion. The great work by Dan in system_tests.py has been extended to provide python functions which emulate linux utilities such as grep, diff and cut. Thank You, Mr Lion.

@clanmills clanmills reopened this Mar 16, 2021
@clanmills clanmills modified the milestones: v0.27.4, v0.28 Mar 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
testing Anything related to the tests and their infrastructure
Projects
None yet
Development

No branches or pull requests

4 participants