-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dockerize analytics folder #1538
Conversation
@@ -0,0 +1,113 @@ | |||
# Use the official python3 image based on Debian 11 "Bullseye". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly copied from api/Dockerfile
@@ -1,8 +1,8 @@ | |||
[tool.poetry] | |||
authors = ["widal001 <billy.daly@agile6.com>"] | |||
description = "Python package for analyzing data related to the Simpler Grants Project" | |||
name = "analytics" | |||
packages = [{include = "analytics", from = "src"}] | |||
name = "simpler-grants-gov-analytics" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The poetry run pip install
line in the Dockerfile references this name
container_name: grants-analytics | ||
volumes: | ||
- .:/analytics | ||
- ~/.ssh:/home/${RUN_USER:-analytics}/.ssh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ssh key is for gh
# By default, all python/poetry commands will run inside of the docker container | ||
# if you wish to run this natively, add PY_RUN_APPROACH=local to your environment vars | ||
# You can set this by either running `export PY_RUN_APPROACH=local` in your shell or add | ||
# it to your ~/.zshrc file (and run `source ~/.zshrc`) | ||
ifeq "$(PY_RUN_APPROACH)" "local" | ||
POETRY := poetry run | ||
GITHUB := gh | ||
else | ||
POETRY := docker-compose run $(DOCKER_EXEC_ARGS) --rm $(APP_NAME) poetry run | ||
GITHUB := docker-compose run $(DOCKER_EXEC_ARGS) --rm $(APP_NAME) gh | ||
endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important to read the documentation here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for calling this out! Works like a charm:
@coilysiren can we update the development.md docs to reference this as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the same text verbatim to development.md 77139c1
# Set PYTHONPATH so that the tests can find the source code. | ||
ENV PYTHONPATH="/analytics/src/:$PYTHONPATH" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to add this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-blocking request, but can we look into this further? If we're installing the analytics/
package in the docker container correctly, I don't think we should need to set the PYTHONPATH
directly, see this stack overflow post for more info
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback, but I'm not sure how valuable that would be right now, given that large set of work on my plate. I will make a ticket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's the ticket: #1565
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm not super worried about it, mainly just want to flag it in case it resurfaces in some other error.
# Install gh CLI | ||
# docs: https://github.com/cli/cli/blob/trunk/docs/install_linux.md | ||
RUN mkdir -p -m 755 /etc/apt/keyrings && wget -qO- https://cli.github.com/packages/githubcli-archive-keyring.gpg | tee /etc/apt/keyrings/githubcli-archive-keyring.gpg > /dev/null \ | ||
&& chmod go+r /etc/apt/keyrings/githubcli-archive-keyring.gpg \ | ||
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | tee /etc/apt/sources.list.d/github-cli.list > /dev/null \ | ||
&& apt update \ | ||
&& apt install gh -y \ | ||
&& gh --version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is new versus the api dockerfile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome thanks for catching and handling the need to install the GitHub CLI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not at all familiar with this code, just a few minor notes
&& apt-get install --no-install-recommends --yes \ | ||
build-essential \ | ||
libpq-dev \ | ||
postgresql \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of these apt-get values might be removable like this one if the analytics code never hits postgres
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is going to use postgres
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- build-essential: a good all around package to keep
- libpq-dev: is for postgres
- PostgreSQL: same as above
So, nothing to remove
analytics/Dockerfile
Outdated
# Set the host to 0.0.0.0 to make the server available external | ||
# to the Docker container that it's running in. | ||
ENV HOST=0.0.0.0 | ||
|
||
# Run the application. | ||
CMD ["poetry", "run", "python", "-m", "src"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this running / what is the entrypoint file to this code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The entrypoints are all of these commands here:
simpler-grants-gov/analytics/Makefile
Lines 51 to 90 in 5d898f9
sprint-data-export: | |
@echo "=> Exporting project data from the sprint board" | |
@echo "=====================================================" | |
$(POETRY) run analytics export gh_project_data \ | |
--owner $(ORG) \ | |
--project $(SPRINT_PROJECT) \ | |
--output-file $(SPRINT_FILE) | |
issue-data-export: | |
@echo "=> Exporting issue data from the repository" | |
@echo "=====================================================" | |
$(POETRY) run analytics export gh_issue_data \ | |
--owner $(ORG) \ | |
--repo $(REPO) \ | |
--output-file $(ISSUE_FILE) | |
gh-data-export: sprint-data-export issue-data-export | |
sprint-burndown: | |
@echo "=> Running sprint burndown report" | |
@echo "=====================================================" | |
poetry run analytics calculate sprint_burndown \ | |
--sprint-file $(SPRINT_FILE) \ | |
--issue-file $(ISSUE_FILE) \ | |
--sprint "$(SPRINT)" \ | |
--unit $(UNIT) \ | |
--$(ACTION) | |
percent-complete: | |
@echo "=> Running percent complete deliverable" | |
@echo "=====================================================" | |
poetry run analytics calculate deliverable_percent_complete \ | |
--sprint-file $(SPRINT_FILE) \ | |
--issue-file $(ISSUE_FILE) \ | |
--unit $(UNIT) \ | |
--$(ACTION) | |
sprint-reports: sprint-burndown percent-complete | |
sprint-reports-with-latest-data: gh-data-export sprint-reports |
I don't plan on specifying the entrypoint inside of this Dockerfile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the lines you commented on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Set PYTHONPATH so that the tests can find the source code. | ||
ENV PYTHONPATH="/analytics/src/:$PYTHONPATH" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm not super worried about it, mainly just want to flag it in case it resurfaces in some other error.
# The build stage that will be used to deploy to the various environments | ||
# needs to be called `release` in order to integrate with the repo's | ||
# top-level Makefile | ||
FROM python:3-slim AS base |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah its 3.12 right now
Summary
Fixes #1540
Time to review: 10 mins
Changes proposed
Changes the application to primarily be run inside of docker.
You can still run code natively via the
PY_RUN_APPROACH=local
setting, as described in the makefile. But docker is now the "default" method of running this code.If you previously authed to Github via the web browser login flow, then you will now have to change to authing to Github via a personal access token. This is because I couldn't figure out how to get the web browser login flow working, but I could get the personal access token working.
To utilize the personal access token method:
export GH_TOKEN=...
Testing