This repo contains useful scripts I have developed over the years to support students' submissions for homework and projects. They include tools to manage Git and GitHub repositories, Google Workspaces resources (Drive, spreadsheets), and shell/filesystem tasks.
- Teaching scripts
- Setup
- GitHub & GitHub Classrooms
  - `gh_classroom_collect.py`: collect repos from a GH Organization
  - `gh_authors_collect.py`: extract commits per author
  - `gh_create_wiki.py`: push a Wiki template to a list of repos
  - `gh_member_bulk_team.py`: add/delete a GH username to GH teams
  - `gh_pr_merge.py`: bulk merge of PRs
  - `gh_pr_feedback_create.py`: create Feedback PRs
  - `gh_pr_check_merged_forced.py`: check for merged PRs and forced pushes
  - `gh_pr_feedback_comment.py`: push comments to repos' PRs
- Git Tools
- Google Tools
- Some useful commands
To install all requirements:
$ sudo pip install -r requirements.txt
The main libraries and references used are:
- GitHub REST documentation: https://docs.github.com/en/rest
- PyGithub: https://github.com/PyGithub/PyGithub
- gitpython: http://www.legendu.net/misc/blog/hands-on-GitPython/
- gh API CLI tool: https://github.com/cli/cli (manual)
These `gh_xxx.py` scripts mostly use PyGithub. The scripts require a GitHub access token that grants access to the corresponding repos/organization.
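All of these scripts follow the same basic pattern: authenticate with the token and then walk over the repos of the organization via PyGithub. A minimal sketch of that pattern (the token file path is just an example; the organization name is taken from the examples below):

```python
from github import Github  # PyGithub

# Read the personal access token from a file (path is only an example).
with open("gh-token.txt") as f:
    token = f.read().strip()

g = Github(token)

# List all repos in the Classroom organization.
org = g.get_organization("RMIT-COSC1127-1125-AI24")
for repo in org.get_repos():
    print(repo.full_name)
```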
- `gh_classroom_collect.py`: collects all the repos in a given GitHub Classroom/Organization for a given assignment (using the prefix name of the project).
- `gh_authors_collect.py`: extracts the number of commits per author in a set of GitHub repositories; this can be used to analyse student contributions.
- `gh_member_bulk_team.py`: adds/deletes a GH username to/from a list of teams in an organization (e.g., to add tutors to groups so they can see student repos).
- `gh_pr_feedback_check_merged.py`: checks whether the GH Classroom Feedback PR has been (wrongly) merged in each repo.
- `gh_pr_feedback_comment.py`: pushes feedback marking to each repo's Feedback PR.
Another tool worth considering is the gh API CLI tool; see its manual.
The script `gh_classroom_collect.py` produces a CSV file with all the repos in a given GitHub Classroom for a particular assignment, using its corresponding prefix.
For example, to get all the repos submitted for the AI24 project with prefix `p3-prolog` into a CSV file `repos.csv`:
$ python ./gh_classroom_collect.py -t ~/.ssh/keys/gh-token-ssardina.txt RMIT-COSC1127-1125-AI24 p3-prolog repos.csv |& tee -a repos.log
Note: the CSV file `repos.csv` will be used for many later tasks, including cloning the repos locally with the script `git_clone_submissions.py`, extracting collaborators, etc.
The script `gh_authors_collect.py` produces a CSV file with the number of commits each author has made per repo. For example:
$ python ./gh_authors_collect.py -t ~/.ssh/keys/github-token-ssardina.txt --tag submission -- repos.csv authors.csv
The `--tag` option restricts the count to the history up to the given tag. If no tag is given, the whole repo is parsed up to the head of `main`.
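The underlying counting can be sketched with PyGithub roughly as follows (the repo name is hypothetical and error handling is omitted):

```python
from collections import Counter
from github import Github

g = Github(open("gh-token.txt").read().strip())
repo = g.get_repo("RMIT-COSC1127-1125-AI24/p3-prolog-team01")  # hypothetical repo name

# Resolve the tag to a commit SHA if present, otherwise use the default branch head.
sha = next((t.commit.sha for t in repo.get_tags() if t.name == "submission"),
           repo.default_branch)

# Count commits per author reachable from that commit.
counts = Counter()
for commit in repo.get_commits(sha=sha):
    author = commit.author.login if commit.author else commit.commit.author.name
    counts[author] += 1

for author, n in sorted(counts.items()):
    print(f"{author},{n}")
```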
The script `gh_create_wiki.py` pushes a template Wiki to each repo:
$ python ./gh_create_wiki.py repos.csv ./wiki-template
See wiki-template for an example of a Wiki template.
The script `gh_member_bulk_team.py` adds a GH username to GH teams in an organization. For example, to add Axel to all the teams except `teachers` and `headtutor`:
$ python gh_member_bulk_team.py RMIT-COSC2780-2973-IDM24 axelahmer --nteams "teachers" "headtutor"
Running the script on: 2024-05-18-00-35-27
Sat, 18 May 2024 00:35:27 INFO Getting organization RMIT-COSC2780-2973-IDM24...
Sat, 18 May 2024 00:35:27 INFO Getting GH user for axelahmer...
Teams available: ['AI NPCs', 'ASP Dads', 'Galacticos', 'gASP', 'Harry Ron and Hermoine', 'IDM Project', 'Intellect Realm', 'Inter-Dimensional Masochists (IDM)', 'Logic Nexus', 'Lorem Ipsum', 'Mister World Wide', 'Prolog nightmares again', 'sajeevan', 'Super awesome team name', 'teachers', 'TRY']
Adding user **axelahmer** to team AI NPCs
Adding user **axelahmer** to team ASP Dads
Adding user **axelahmer** to team Galacticos
Adding user **axelahmer** to team gASP
Adding user **axelahmer** to team Harry Ron and Hermoine
Adding user **axelahmer** to team IDM Project
Adding user **axelahmer** to team Intellect Realm
Adding user **axelahmer** to team Inter-Dimensional Masochists (IDM)
Adding user **axelahmer** to team Logic Nexus
Adding user **axelahmer** to team Lorem Ipsum
Adding user **axelahmer** to team Mister World Wide
Adding user **axelahmer** to team Prolog nightmares again
Adding user **axelahmer** to team sajeevan
Adding user **axelahmer** to team Super awesome team name
Adding user **axelahmer** to team TRY
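Under the hood, this amounts to a few PyGithub calls, roughly as follows (the exclusion logic is simplified; names are taken from the example above):

```python
from github import Github

g = Github(open("gh-token.txt").read().strip())
org = g.get_organization("RMIT-COSC2780-2973-IDM24")
user = g.get_user("axelahmer")

excluded = {"teachers", "headtutor"}
for team in org.get_teams():
    if team.name in excluded:
        continue
    print(f"Adding user {user.login} to team {team.name}")
    team.add_membership(user)  # also accepts role="member" or "maintainer"
```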
The script `gh_pr_merge.py` can be used to merge PRs in a set of repos. This is useful when a PR has been opened in each student repo to roll out updates to the project spec via the GitHub Classroom sync feature (new in 2024).
For example, to merge the PR with title `Sync`, starting from the 40th repo in `repos.csv`:
$ python ./gh_pr_merge.py repos.csv -t ~/.ssh/keys/gh-token-ssardina.txt --title Sync --start 40 |& tee -a merge_pr.log
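For a single repo, the merging step with PyGithub is roughly the following (the repo name is hypothetical; the actual script iterates over all repos in `repos.csv`):

```python
from github import Github

g = Github(open("gh-token.txt").read().strip())
repo = g.get_repo("RMIT-COSC1127-1125-AI24/p3-prolog-team01")  # hypothetical repo name

# Merge every open PR whose title contains "Sync".
for pr in repo.get_pulls(state="open"):
    if "Sync" in pr.title:
        result = pr.merge(commit_message="Merge GH Classroom sync PR")
        print(pr.number, result.merged)
```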
Checks which repos are missing the expected Feedback PR #1 from GitHub Classroom and creates it as needed. This may be necessary because GH Classroom sometimes fails to create the PR in some repos.
For example:
$ python ./gh_pr_create.py -t ~/.ssh/keys/gh-token-ssardina.txt repos.csv a7b3d7aee55d00d55ee29b8a505d17fc8283e9f8 |& tee pr_create.log
Notice that the script needs the list of repos to consider (`repos.csv`) and the base SHA used to create the Feedback branch and the corresponding PR. Because the repos were created by GH Classroom, their first commit should have the same SHA as the original staff template.
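Conceptually, re-creating a missing Feedback PR boils down to two PyGithub calls, sketched below; the assumption (not stated in the script itself) is GH Classroom's usual setup, where the student's default branch is compared against a `feedback` branch rooted at the template's initial commit:

```python
from github import Github

g = Github(open("gh-token.txt").read().strip())
repo = g.get_repo("RMIT-COSC1127-1125-AI24/p3-prolog-team01")   # hypothetical repo name
base_sha = "a7b3d7aee55d00d55ee29b8a505d17fc8283e9f8"           # SHA of the template's first commit

# Re-create the feedback branch at the template's initial commit...
repo.create_git_ref(ref="refs/heads/feedback", sha=base_sha)

# ...and open a Feedback PR comparing the student's work against it.
repo.create_pull(title="Feedback", body="GitHub Classroom Feedback PR",
                 base="feedback", head=repo.default_branch)
```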
The script `gh_pr_check_merged_forced.py` checks for PRs that have been (wrongly) merged and for forced pushes:
$ python ../git-hw-submissions.git/gh_pr_check_merged_forced.py -t ~/.ssh/keys/gh-token-ssardina.txt repos.csv
It will leave two CSV files, `pr_merged.csv` and `pr_forced.csv`, with the ids of the corresponding repos.
The tool `gh_pr_feedback_comment.py` pushes feedback comments to PRs in GH repositories. This may be useful to provide feedback and results to students after automarking. It requires:
- A CSV file with the list of all relevant repos to process (e.g., student's projects).
- A CSV file with the marking results (points, marks, comments, etc).
- A folder with the automarking reports as text files.
- A Python file defining two functions that process a row of the marking results (a sketch is given below):
  - `check_submission`: checks whether the row contains a legal/successful submission. It returns whether the row/submission should be skipped and a string message to be posted to the PR, if any (e.g., the reason why the submission was skipped and not marked).
  - `report_feedback`: produces the actual feedback text to be posted in the PR.
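A minimal sketch of such a configuration module, assuming each row is passed in as a dictionary (the column names and exact signatures here are illustrative, not the script's actual interface):

```python
# config_report_p2.py -- hypothetical marking configuration (a sketch only)

def check_submission(row):
    """Decide whether this row is a valid submission.

    Returns (skip, message): skip is True if the submission must not be
    marked; message (possibly empty) is posted to the Feedback PR.
    """
    if not row.get("commit"):
        return True, "No tagged submission was found, so the project was not marked."
    if row.get("certification", "").lower() != "yes":
        return True, "Submission skipped: the certification was not completed."
    return False, ""


def report_feedback(row):
    """Build the feedback text to post in the Feedback PR."""
    return (
        f"# Results for {row['repo']}\n\n"
        f"* Points: {row['points']}\n"
        f"* Marks: {row['marks']}\n\n"
        f"{row.get('comments', '')}"
    )
```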
Now push all feedback to the corresponding pull requests, from the first row (1) to row 5:
$ python ./gh_pr_feedback_comment.py repos.csv marking.csv reports config_report_p2.py -t ~/.ssh/keys/gh-token-ssardina.txt -s 1 -e 5 |& tee -a pr_feedback_0-10.log
Use `--repos ssardina juan` to restrict processing to those repos, and `--dry-run` to send the feedback to the console instead of to the repos:
$ python ./gh_pr_feedback_comment.py repos.csv marking.csv reports -t ~/.ssh/keys/gh-token-ssardina.txt --repos ssardina juan --dry-run
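At its core, posting the feedback for one repo reduces to a single PyGithub call on the Feedback PR (normally PR #1), roughly:

```python
from github import Github

g = Github(open("gh-token.txt").read().strip())
repo = g.get_repo("RMIT-COSC1127-1125-AI24/p3-prolog-ssardina")  # hypothetical repo name

pr = repo.get_pull(1)  # the GH Classroom Feedback PR
pr.create_issue_comment("# Results for Project 3\n\nMarks: ...")  # text built by report_feedback()
```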
The following Git tools use the GitPython module to access Git from Python.
- `git_clone_submissions.py`: clone and update a set of repositories (provided in a CSV file) for a given submission tag.
- `git_create_wiki.py`: push a template Wiki to a list of GitHub repos.
- `git_batch_commit.sh`: a shell script template to make changes to a collection of repos.
This tool (`git_clone_submissions.py`) clones a set of student/team repositories listed in a CSV file at a given tagged commit. The CSV file should contain the team name (under column `TEAM`) and a Git SSH link (under column `GIT-URL`).
If a repository already exists, it will be updated automatically (see the sketch below):
- if the tag now points to a different commit, the new commit will be pulled;
- if the repo no longer has the tag (the student has withdrawn the submission), the local copy will be removed from disk.
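With GitPython, that clone/update/remove logic looks roughly as follows (a simplified sketch; the real script also records timestamps and handles errors):

```python
import shutil
import git  # GitPython

def clone_or_update(url, path, tag):
    """Clone the repo at the given tag, updating or removing an existing local copy."""
    try:
        repo = git.Repo(path)                      # reuse an existing clone
        repo.remotes.origin.fetch(tags=True, force=True)
    except git.NoSuchPathError:
        repo = git.Repo.clone_from(url, path)      # fresh clone

    names = [t.name for t in repo.tags] + [h.name for h in repo.heads]
    if tag not in names:
        # The tag is gone (submission withdrawn): drop the local copy.
        shutil.rmtree(path)
        return None

    repo.git.checkout(tag)                         # move to the (possibly updated) tagged commit
    return repo.head.commit.hexsha
```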
At the end, the script produces a CSV file with the information of each repo successfully cloned, including the commit id (`commit`), the time of the commit (`submitted_at`), and the time of the tagging (`tagged_at`).
For example, to clone Project 0 at the commit tagged `final`, using the database of repos `repos.csv`:
$ python ./git_clone_submissions.py --file-timestamps timestamps.csv repos.csv final submissions/ |& tee clone.log
All repos will be cloned within the folder `submissions/`, and the file `timestamps.csv` will record the timestamps and commits of each repo cloned successfully. The file contains the date of the commit linked to the tag. If the tag is an annotated tag (not just a lightweight tag), it will also include the date of tagging (otherwise both dates are assumed to be the same); see annotated vs lightweight tags.
To just clone the last commit on the default branch, use `main` as the tag.
The timezone used is defined by the constant `TIMEZONE` in the script (defaults to the `Australia/Melbourne` time zone).
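For reference, extracting the two dates with GitPython looks roughly like this (a sketch only; `zoneinfo` is used here for the timezone, whereas the script may use another library):

```python
from datetime import datetime
from zoneinfo import ZoneInfo
import git  # GitPython

TIMEZONE = ZoneInfo("Australia/Melbourne")

repo = git.Repo("submissions/team01")                 # hypothetical local clone
tag = next(t for t in repo.tags if t.name == "final")

submitted_at = datetime.fromtimestamp(tag.commit.committed_date, tz=TIMEZONE)
# Annotated tags carry their own tag object with a tagging date;
# lightweight tags do not, so fall back to the commit date.
tagged_at = (datetime.fromtimestamp(tag.tag.tagged_date, tz=TIMEZONE)
             if tag.tag is not None else submitted_at)

print(submitted_at, tagged_at)
```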
This script (`git_batch_commit.sh`) commits and pushes changes to a collection of repos that already exist in a folder. This is useful for making edits to students' repos after they have been created.
To access Google Workspaces resources via the Google API, one needs to enable API access and obtain proper authentication credentials. Different APIs are provided for the different resources (Drive, Gmail, Sheets, etc.).
For authentication to Google Workspaces, one needs the application configuration file from the APIs Console. Check the Google Sheet API setup process to obtain a `client_secret.json` (same as `credentials.json`) file for your project. PyDrive2 also explains how to get the file here.
All access to the Google API requires authentication; the usual workflow is as follows:
- Go to the Google API Access Panel.
- Create a project.
- Create an OAuth consent screen.
- Create credentials of type "OAuth Client ID".
- Download the JSON file of such credentials and name it `client_secrets.json`.
- Place the file in the same directory as the scripts.
Tip: read Integrating Google Drive API with Python: A Step-by-Step Guide.
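The scripts rely on packages (gsheets, PyDrive2) that drive this OAuth dance for you, but for reference this is roughly what happens under the hood with the official Google client libraries (the scope below is only an example):

```python
# pip install google-auth-oauthlib
from google_auth_oauthlib.flow import InstalledAppFlow

SCOPES = ["https://www.googleapis.com/auth/spreadsheets.readonly"]  # example scope only

# Read client_secrets.json, show the consent screen, and return credentials.
flow = InstalledAppFlow.from_client_secrets_file("client_secrets.json", SCOPES)
creds = flow.run_local_server(port=0)   # opens a local browser for consent

print("Token obtained; expires at:", creds.expiry)
```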
- Google Sheet API access:
- Google Drive API access:
- PyDrive2 package for more abstract access to the Google Drive API.
- Google Forms API:
This script (`gg_get_worksheet.py`) downloads a Google Sheet from the cloud. It relies on the gsheets package to access the Google Sheet API.
The sheet to download is often a marking sheet. For example, to get the worksheet `MARKING` from the Google Spreadsheet with id `1kX-fa3_DMNDQROUr1Y-cG89UksTUUqlYdrNcV1yN6NA`:
$ python ./gg_get_worksheet.py 1kX-fa3_DMNDQROUr1Y-cG89UksTUUqlYdrNcV1yN6NA MARKING -c ~/.ssh/keys/credentials.json -o marking.csv
The `credentials.json` file is obtained via the Google Sheet API setup process. The first time it is used, permission will be requested; a certificate file `storage.json` will then be saved, allowing direct access from then on (until the certificate expires).
The authentication will be done via the console. Use `--webserver` to open an actual browser instead.
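Under the hood, the script essentially does the following with gsheets (a sketch only; worksheet selection may differ slightly across package versions):

```python
from gsheets import Sheets

# credentials.json: OAuth client secrets; storage.json: token cached after the first run.
sheets = Sheets.from_files("~/.ssh/keys/credentials.json", "storage.json")

spreadsheet = sheets["1kX-fa3_DMNDQROUr1Y-cG89UksTUUqlYdrNcV1yN6NA"]
worksheet = spreadsheet.find("MARKING")          # select the worksheet by title
worksheet.to_csv("marking.csv", encoding="utf-8", dialect="excel")
```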
This script (`gg_sheet_submissions.py`) processes Google Sheets produced by Google Forms and downloads the files linked in each submission. Files will be placed in folders identifying each submission, for example named with the student number or email associated with the submission.
For example, to download the files recorded in column `G` of the worksheet `W4-resolution` of a Google Sheet with id `1ogCRbx...`, saving each one in a folder `submission/XXXX`, where `XXXX` is the student number recorded in column `C`:
$ python ./gg_sheet_submissions.py 1ogCRbxB3seVhoqhD7iBmVZ8EdpvGEB94oLowDIs5s2g W4-resolution tea --output test/submissions --file-name test.cnf --column-id C --column-file G
This tool (`gg_drive_download.py`) allows downloading file submissions in a Google Drive folder, usually submitted by students via a Google Form. It uses the PyDrive2 package for more abstract access to the Google Drive API.
$ python ./gg_drive_download.py 1mttf61NwuFNY25idwWw5AKzV3tQQbwlMp980i-9vnkK3PIV4o7ZOtykWvjM-VLqmHuYJ0jX4 -c ~/.ssh/keys/credentials.json --output test/submissions --file-name submissions.cnf
This script is a new and simpler version of the one in this repo.
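A bare-bones version of the download step with PyDrive2 looks roughly like this (the folder id and local paths are placeholders):

```python
from pydrive2.auth import GoogleAuth
from pydrive2.drive import GoogleDrive

gauth = GoogleAuth()               # picks up client_secrets.json / settings.yaml
gauth.LocalWebserverAuth()         # or CommandLineAuth() for console-only authentication
drive = GoogleDrive(gauth)

FOLDER_ID = "<drive-folder-id>"    # placeholder for the Drive folder with the submissions

# List every non-trashed file inside the folder and download it locally.
query = f"'{FOLDER_ID}' in parents and trashed=false"
for f in drive.ListFile({"q": query}).GetList():
    print("Downloading", f["title"])
    f.GetContentFile(f"test/submissions/{f['title']}")
```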
Once all git repos have been cloned into `git-submissions/`, one can build zip files from the submissions into the directory `zip-submissions/` as follows:
for d in git-submissions-p2/*; do echo "============> Processing ${d}" ; zip -q -j "./zip-submissions-p2/`basename "$d.zip"`" "${d}"/p2-multiagent/* ;done
or for the final CTF project:
for d in git-submissions-p4/*; do echo "============> Processing ${d}" ; zip -q -j "./zip-submissions-p4/`basename "$d.zip"`" "${d}"/pacman-contest/* ;done
To count the number of commits between dates:
git log --after="2018-03-26T00:00:00+11:00" --before="2018-03-28T00:00:00+11:00" | grep "Date:" | wc -l
To copy just the new zip files:
rsync -avt --ignore-existing zip-submissions-p4/*.zip AI18-assessments/project-4/zip-submissions/