This repository has been archived by the owner on Sep 10, 2020. It is now read-only.

Add Ansible automation for test database setup #72

Closed
wants to merge 2 commits into from
Changes from 1 commit
5 changes: 5 additions & 0 deletions fpsd/requirements/requirements.in
@@ -8,3 +8,8 @@ aiohttp
aiosocks
psycopg2
SQLAlchemy

# Machine Learning
pandas
tqdm

8 changes: 7 additions & 1 deletion fpsd/requirements/requirements.txt
@@ -10,9 +10,15 @@ async-timeout==1.0.0 # via aiohttp
 chardet==2.3.0 # via aiohttp
 EasyProcess==0.2.3 # via pyvirtualdisplay
 multidict==2.1.2 # via aiohttp
+numpy==1.11.2 # via pandas
+pandas==0.19.0
 psycopg2==2.6.2
+python-dateutil==2.5.3 # via pandas
+pytz==2016.7 # via pandas
 pyvirtualdisplay==0.2.1
 selenium==2.53.6 # via tbselenium
-SQLAlchemy==1.1.1
+six==1.10.0 # via python-dateutil
+SQLAlchemy==1.1.2
 stem==1.4.0
 tbselenium==0.1
+tqdm==4.8.4
1 change: 1 addition & 0 deletions roles/crawler/defaults/main.yml
@@ -52,6 +52,7 @@ fpsd_crawler_system_account: fpsd
# Configuration options for Postgres database.
fpsd_database_apt_packages:
- postgresql
- postgresql-contrib
- libpq-dev
- python-psycopg2

25 changes: 17 additions & 8 deletions roles/crawler/tasks/configure-databases.yml
@@ -40,6 +40,14 @@
template: template0
register: fpsd_database_result

- name: Setup TABLEFUNC extension.
postgresql_ext:
name: tablefunc
db: "{{ fpsd_database_psql_env.PGDATABASE }}"
register: postgres_extension
always_run: true
Contributor

You shouldn't need to always run this: the state option defaults to "present," and idempotency is something to strive for in Ansible.

Contributor

Should remove this line still.

Contributor Author

👍

changed_when: false
Contributor

This line is unnecessary unless, after you remove the always_run line, the task still shows a changed status when re-run. Also, we need to know whether it changed, because if so, we should delete the test database and re-create it from the new template.

Contributor

This one too.

Contributor Author

👍


become: true
become_user: postgres
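
For reference, a minimal sketch of the extension task with the two flagged lines (always_run and changed_when) removed, as the review comments above suggest. It assumes postgresql_ext reports a change only when it actually creates the extension, and that the surrounding become settings stay as they are. state: present is the module default and is spelled out only for readability. Indentation is illustrative, since the diff view does not preserve it.

```yaml
# Sketch only: the extension task without always_run / changed_when.
- name: Setup TABLEFUNC extension.
  postgresql_ext:
    name: tablefunc
    db: "{{ fpsd_database_psql_env.PGDATABASE }}"
    state: present  # module default, shown for clarity
  # The registered result keeps its real changed status, so later
  # tasks can react when the extension was newly installed.
  register: postgres_extension
```

With the real changed status preserved, a second provisioning run should report the task as unchanged, which is the idempotent behavior the reviewer is asking for.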

@@ -51,16 +59,17 @@
always_run: true
changed_when: false

- name: Create the raw schema.
command: psql -c 'CREATE SCHEMA raw;'
when: "'raw' not in schemas.stdout"
register: raw_schema_result
- name: Create the raw and features schemata.
command: psql -c 'CREATE SCHEMA {{ item }};'
when: "'item' not in schemas.stdout"
Contributor

You need to use {{ item }} here. Unlike your default vars or results registered earlier in the task list, which are populated into the local Python task execution environment, each item does need interpolation when you loop over an iterable in your with_ directives. Otherwise, this task looks for the literal string 'item' in schemas.stdout. That string is never there, so the "not in" condition is always true, and the task runs and fails on re-provisioning because it tries to re-create schemas that already exist and psql returns a non-zero exit code.

Contributor Author

Right, fixed!

register: "{{ item }}_result"
with_items:
- raw
- features
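
A hedged sketch of how the looped schema task could look once the condition tests the loop value rather than the literal string 'item'. The reviewer asks for {{ item }}; because when is already evaluated as a Jinja2 expression, referring to item without quotes should have the same effect. The single registered name schemata_results is an assumption, taken from the variable the final when in this diff refers to; in a loop, Ansible stores the per-item outcomes under that one name. Other keys of the task that are not visible in this hunk (environment, privilege escalation) are omitted, and the indentation is illustrative.

```yaml
# Sketch only: create each schema unless it is already listed.
- name: Create the raw and features schemata.
  command: psql -c 'CREATE SCHEMA {{ item }};'
  # `item` is a variable here, not the quoted string 'item', so the
  # check runs once per schema name in the loop.
  when: item not in schemas.stdout
  # Assumed name; per-item results end up in schemata_results.results.
  register: schemata_results
  with_items:
    - raw
    - features
```

Note that this remains a substring test against the psql output, so it inherits the same looseness as the original 'raw' not in schemas.stdout check.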

- name: List all tables in the raw schema.
command: psql -c '\dt raw.*'
register: tables
always_run: true
changed_when: false

- name: "Create the tables: crawls, hs_history, frontpage_examples, & frontpage_traces."
command: psql -c '{{ lookup("file", "postgres-schemas/"+item) }}'
@@ -88,7 +97,7 @@
state: absent
# If the fpsd database was just created, the test database should
# not exist, so there will be nothing to delete.
when: fpsd_database_result|skipped
when: not fpsd_database_result|skipped

- name: Create the test database based on fpsd.
postgresql_db:
@@ -99,6 +108,6 @@
lc_ctype: en_US.UTF-8
template: "{{ fpsd_database_psql_env.PGDATABASE }}"

when: raw_schema_result|changed or raw_schema_tables_result|changed
when: raw_schema_tables_result|changed or postgres_extension|skipped or schemata_results|skipped
become: true
become_user: postgres
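
The earlier review comment notes that the test database should be dropped and re-created whenever the template database changed. As an illustration only, the condition below keys on changed status rather than skipped status. The debug task is a stand-in so the sketch does not have to guess the test database name, which is not visible in this hunk. The three variable names come from the diff above, and the |changed filter syntax matches the Ansible version this PR targets (newer releases spell it "is changed").

```yaml
# Sketch only: react when anything feeding the template database changed.
- name: Report that the test database needs rebuilding (illustrative).
  debug:
    msg: "The fpsd template changed; the test database should be re-created."
  when: raw_schema_tables_result|changed or postgres_extension|changed or schemata_results|changed
```

This only works if the extension task no longer forces changed_when: false, since that setting would hide the change this condition depends on.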