This project contains several standalone Python scripts, with equivalent Jupyter Notebooks, to extract, analyse and map the details of Carpentry workshops and instructors from The Carpentries' record-keeping system AMY, using AMY's API.
The recommended Python version is Python 3. The code was tested with Python 3.6 on macOS Sierra (10.12), macOS High Sierra (10.13) and macOS Mojave (10.14). Other Python versions may or may not work.
Dependencies for the scripts are listed in requirements.txt in the project root and can be installed via pip install -r requirements.txt.
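For example, from the project root:

$ pip install -r requirements.txt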
The script amy_data_extract.py extracts the details of Carpentry workshops and instructors from AMY. It can be configured to extract data for a single country or for all countries (the default if none is specified). The extracted data is saved into two separate CSV files in the data/raw folder off the project root - one for instructors and one for workshops. The files are named after the date they were generated and the country the data relates to, e.g. carpentry-workshops_GB_2017-06-26.csv, carpentry-instructors_AU_2017-06-26.csv or carpentry-instructors_ALL_2019-07-08.csv.
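For example, a run for the UK only might look like this (assuming your AMY credentials are already configured, as described below; the date in the output file names will be the date you run the script):

$ python amy_data_extract.py -c GB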
The script needs to authenticate to AMY, so you need an existing AMY account with its own username and password (accounts that log in via AMY's GitHub authentication will not work).
You can configure your login details in the amy_login.yml file in the project root, or pass them via command line arguments (in which case you will be prompted for the password, which will not be echoed). You should never pass a bare password as a command line argument nor store credentials in git.
If using a file to configure credentials, rename the existing amy_login.yml.pre config file (located in the project root) to amy_login.yml and configure your AMY username and password there accordingly. Make sure you do not share this file with others or put it in version control, as it contains sensitive information.
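For example, on a Unix-like system the rename can be done with:

$ mv amy_login.yml.pre amy_login.yml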
Alternatively, you can pass your username to the script via the -u USERNAME command line option and add the -p option on its own, after which you will be prompted to enter your password at the command line.
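For example (jane.doe is a placeholder username, not a real account):

$ python amy_data_extract.py -u jane.doe -p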
You can pass various other command line options to the script as well - see the section below for details.
You can run the extractor script from the project root using the following command line options.
$ python amy_data_extract.py --help
usage: amy_data_extract.py [-h] [-c COUNTRY_CODE] [-u USERNAME]
                           [-p [PASSWORD]]
                           [-out_workshops OUTPUT_WORKSHOPS_FILE]
                           [-out_instructors OUTPUT_INSTRUCTORS_FILE]

optional arguments:
  -h, --help            show this help message and exit
  -c COUNTRY_CODE, --country_code COUNTRY_CODE
                        ISO-3166-1 two-letter country code or leave blank
                        for all countries
  -u USERNAME, --username USERNAME
                        Username to login to AMY
  -p [PASSWORD], --password [PASSWORD]
                        Password to log in to AMY - you will be prompted for
                        it (please do not enter your password on the command
                        line even though it is possible)
  -out_workshops OUTPUT_WORKSHOPS_FILE, --output_workshops_file OUTPUT_WORKSHOPS_FILE
                        File path where workshops data extracted from AMY will
                        be saved in CSV format. If omitted, data will be saved
                        to data/raw/ directory and will be named as
                        'carpentry_workshops_<COUNTRY_CODE>_<DATE>'.csv.
  -out_instructors OUTPUT_INSTRUCTORS_FILE, --output_instructors_file OUTPUT_INSTRUCTORS_FILE
                        File path where instructors data extracted from AMY
                        will be saved in CSV format. If omitted, data will be
                        saved to data/raw/ directory and will be named as
                        'carpentry_instructors_<COUNTRY_CODE>_<DATE>'.csv.
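For example, a hypothetical run that extracts data for Australia and overrides the default workshops output path might look like this:

$ python amy_data_extract.py -c AU -out_workshops data/raw/au_workshops.csv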
The project contains two additional Python scripts - analyse_workshops.py and analyse_instructors.py - to analyse the data resulting from the extraction phase. Each analyser script creates an Excel spreadsheet with various summary tables and graphs and saves it in the data/analyses folder off the project root. There are several command line options available for the analyser scripts, depending on whether they deal with workshops or instructors. See below for details.
$ python analyse_workshops.py --help
usage: analyse_workshops.py [-h] [-in INPUT_FILE] [-out OUTPUT_FILE]

optional arguments:
  -h, --help            show this help message and exit
  -in INPUT_FILE, --input_file INPUT_FILE
                        The path to the input data CSV file to analyse/map. If
                        omitted, the latest file with workshops/instructors
                        data from data/raw/ directory off project root will be
                        used, if such exists.
  -out OUTPUT_FILE, --output_file OUTPUT_FILE
                        File path where data analyses will be saved in xlsx
                        Excel format. If omitted, the Excel file will be saved
                        to data/analyses/ directory and will be named as
                        'analysed_<INPUT_FILE_NAME>'.
$ python analyse_instructors.py --help
usage: analyse_instructors.py [-h] [-in INPUT_FILE] [-out OUTPUT_FILE]

optional arguments:
  -h, --help            show this help message and exit
  -in INPUT_FILE, --input_file INPUT_FILE
                        The path to the input data CSV file to analyse/map. If
                        omitted, the latest file with workshops/instructors
                        data from data/raw/ directory off project root will be
                        used, if such exists.
  -out OUTPUT_FILE, --output_file OUTPUT_FILE
                        File path where data analyses will be saved in xlsx
                        Excel format. If omitted, the Excel file will be saved
                        to data/analyses/ directory and will be named as
                        'analysed_<INPUT_FILE_NAME>'.
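For example, to analyse specific extracts rather than the latest files in data/raw/ (the input file names below are illustrative):

$ python analyse_workshops.py -in data/raw/carpentry-workshops_GB_2017-06-26.csv
$ python analyse_instructors.py -in data/raw/carpentry-instructors_AU_2017-06-26.csv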
You can run this job regularly using the files in the cron directory. The mycrontab file provides input for setting up a regular cron job (on a Linux-based system) to run the RunAnalysis.sh script, which enacts the workflow described above.
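For example, a crontab entry along the following lines would run the analysis every Monday at 06:00 (the schedule and path here are illustrative, not prescribed by the project - adapt them to your setup, or start from the provided mycrontab file):

0 6 * * 1 /path/to/project/cron/RunAnalysis.sh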