WER is tracked using the following workflow:
- a dedicated user on the learning machine periodically runs training jobs (cron job, or manual runs)
- this mainly produces js/hyper.js, which contains the concatenated results of all previous runs
- util/website.py contains code that will connect to an SSH server, using SFTP
- this will publish 'index.html' and its dependencies
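
The publishing step above can be pictured with a minimal sketch. It is not the code of util/website.py: the function names `website_config` and `publish`, the default port of 22, and the way the files are laid out remotely are all assumptions made for illustration. Only the `ds_website_*` variable names and the use of pysftp come from this document.

```python
import os
import posixpath


def website_config(env=os.environ):
    """Collect the ds_website_* settings (hypothetical helper).

    Returns None when a required variable is missing, so that a caller
    can skip the upload instead of failing.
    """
    required = (
        "ds_website_username",
        "ds_website_privkey",
        "ds_website_server_fqdn",
        "ds_website_server_root",
    )
    if any(not env.get(name) for name in required):
        return None
    return {
        "host": env["ds_website_server_fqdn"],
        "username": env["ds_website_username"],
        "private_key": env["ds_website_privkey"],
        # Assumption: fall back to the standard SSH port when unset.
        "port": int(env.get("ds_website_server_port") or 22),
        "root": env["ds_website_server_root"],
    }


def publish(relative_paths, config):
    """Upload files such as 'index.html' and 'js/hyper.js' over SFTP.

    Paths are relative to the current directory and are recreated
    under the remote web root.
    """
    import pysftp  # imported here so the config helper works without it

    with pysftp.Connection(
        config["host"],
        username=config["username"],
        private_key=config["private_key"],
        port=config["port"],
    ) as sftp:
        with sftp.cd(config["root"]):
            for rel in relative_paths:
                remote = rel.replace(os.sep, "/")
                remote_dir = posixpath.dirname(remote)
                if remote_dir:
                    sftp.makedirs(remote_dir)  # create e.g. 'js/' if needed
                sftp.put(rel, remote)
```

A caller would typically run `config = website_config()` and only call `publish(["index.html", "js/hyper.js"], config)` when `config` is not `None`.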
- Create a standard user
- Either rely on the system's TensorFlow or populate a virtualenv
- Using the system's TensorFlow or a virtualenv might require setting the PYTHONPATH environment variable (done for a system-wide TensorFlow installation in the example below).
- Install pip dependencies:
  - jupyter
  - BeautifulSoup4
  - GitPython
  - pysftp
  - pyxdg
  - requests
- Construct cron job:
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin/:/bin
# Run WER every 15 mins
*/15 * * * * test ! -f $HOME/.cache/deepspeech_wer/lock && (rm $HOME/.deepspeech_wer.locked.log; mkdir -p $HOME/wer && cd $HOME/wer && source /usr/local/tensorflow-env/bin/activate && /usr/bin/curl -H "Cache-Control: no-cache" -L https://raw.githubusercontent.com/mozilla/DeepSpeech/master/util/automation.py | ds_website_username="UUU" ds_website_privkey="FFF" ds_website_server_fqdn="SSS" ds_website_server_root="www/" ds_gpu_usage_root="/data/automation/gpu/" ds_dataroot="/data/" ds_wer_automation="./bin/run-wer-automation.sh" python -u ; cd) 2>$HOME/.deepspeech_wer.err.log | /usr/bin/ts "\%Y-\%m-\%d \%H:\%M:\%S" > $HOME/.deepspeech_wer.out.log || TZ='Europe/Berlin' date --rfc-2822 >> $HOME/.deepspeech_wer.locked.log
- The cron task takes care of:
  - checking whether there are any new merges
  - cloning the git repo and checking out those merges
  - scheduling sequential execution against those merges
  - the notebook is configured to merge results and upload them automatically when the proper environment variables are set, which updates the website on each iteration of the above process
  - saving the hyper.json files produced
  - wiping the cloned git repo
- A 'lock' file is created in ~/.cache/deepspeech_wer/ to ensure that multiple executions are not triggered at the same time. An unexpected exception might leave a stale lock file; since the cron job only runs when the lock file is absent, later runs are skipped until it is removed.
- A 'last_sha1' file in the same directory keeps track of what was processed last
- Logs of previous runs are saved to ~/.local/share/deepspeech_wer/
- For debugging purposes, ~/.deepspeech_wer.err.log and ~/.deepspeech_wer.out.log collect stderr and stdout
- Expose the following environment variables (please refer to util/website.py for more details on each; the cron job above does this):
  - ds_website_username
  - ds_website_privkey
  - ds_website_server_fqdn
  - ds_website_server_port
  - ds_website_server_root
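
The 'lock' and 'last_sha1' bookkeeping described above can be sketched as follows. This is an illustration under stated assumptions, not the code of util/automation.py: the names `run_lock`, `AlreadyRunning`, `read_last_sha1` and `write_last_sha1` are hypothetical, and the file contents (a bare SHA-1 on one line) are assumed. Note one deliberate difference: the `finally` clause here removes the lock even when an exception occurs, whereas the real workflow may leave a stale lock behind.

```python
import os
from contextlib import contextmanager

# Directory taken from this document; the helpers below are hypothetical.
CACHE_DIR = os.path.expanduser("~/.cache/deepspeech_wer")


class AlreadyRunning(Exception):
    """Raised when the lock file already exists."""


@contextmanager
def run_lock(cache_dir=CACHE_DIR):
    """Hold the 'lock' file for the duration of a run."""
    os.makedirs(cache_dir, exist_ok=True)
    lock_path = os.path.join(cache_dir, "lock")
    try:
        # O_EXCL makes creation fail if the file already exists,
        # so two concurrent runs cannot both acquire the lock.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise AlreadyRunning(lock_path) from None
    os.close(fd)
    try:
        yield lock_path
    finally:
        os.remove(lock_path)  # released even if the run raised


def read_last_sha1(cache_dir=CACHE_DIR):
    """Return the last processed commit, or None on a first run."""
    try:
        with open(os.path.join(cache_dir, "last_sha1")) as f:
            return f.read().strip() or None
    except FileNotFoundError:
        return None


def write_last_sha1(sha1, cache_dir=CACHE_DIR):
    """Record the commit that has just been processed."""
    os.makedirs(cache_dir, exist_ok=True)
    with open(os.path.join(cache_dir, "last_sha1"), "w") as f:
        f.write(sha1 + "\n")
```

A run would then wrap its work in `with run_lock():`, compare the newest merge against `read_last_sha1()`, and call `write_last_sha1(...)` once that merge has been processed.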
- Ensure the webroot exists
- Generate an SSH key, and upload the public key to the web-facing server
- Connect manually at least once from the training machine to the web-facing server to accept the server host key and populate the known_hosts file (pay attention to the FQDN)
- Make sure that the server is configured with a proper DirectoryIndex (Apache directive, or the equivalent for other servers), whether system-wide or locally (with a .htaccess, for example).
- Bootstrap with an empty index.htm (and populate .htaccess if needed)
- That should be all. After any big change to the HTML codebase, make sure to clean up leftover files in the webroot.
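
The bootstrap step above (an empty index.htm plus an optional .htaccess) could look like the sketch below when run on the web server. The function name `bootstrap_webroot` and its parameters are hypothetical. It assumes an Apache server that allows `DirectoryIndex` to be overridden from .htaccess; other servers need their own equivalent. The index name should match whatever file is actually published.

```python
import os


def bootstrap_webroot(webroot, index_name="index.htm", write_htaccess=True):
    """Create the webroot with an empty index page (hypothetical helper)."""
    os.makedirs(webroot, exist_ok=True)

    index_path = os.path.join(webroot, index_name)
    if not os.path.exists(index_path):
        # Empty placeholder; it is replaced by the first successful upload.
        open(index_path, "w").close()

    if write_htaccess:
        # Local override telling Apache which file to serve for the directory.
        with open(os.path.join(webroot, ".htaccess"), "w") as f:
            f.write("DirectoryIndex {}\n".format(index_name))

    return index_path
```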