From aea9d47cf5858b87ebca24deca18056bbb54d0be Mon Sep 17 00:00:00 2001
From: Adrien Carpentier
Date: Thu, 1 Aug 2024 19:34:37 +0200
Subject: [PATCH] docs: update README

---
 README.md | 47 +++++++++++++++++++++++++++++++++--------------
 1 file changed, 33 insertions(+), 14 deletions(-)

diff --git a/README.md b/README.md
index eb3891d0..37e28179 100644
--- a/README.md
+++ b/README.md
@@ -22,26 +22,44 @@ The hydra crawler is one of the components of the architecture. It will check if
 
 ## Dependencies
 
+### System
+
 This project uses `libmagic`, which needs to be installed on your system, eg:
+`brew install libmagic` on MacOS, or `sudo apt-get install libmagic-dev` on Linux.
+
+### Python
+
+This project uses Python 3.9.
 
-`brew install libmagic` on MacOS, or `sudo apt-get install libmagic-dev` on linux.
+Project dependencies are listed in `pyproject.toml`, while dependencies are locked in `requirements.txt` (for production only deps) and in `requirements-dev.txt` files.
+
+To install the exact same environment locally including the dev dependencies, use the lock file: `pip install -r requirements-dev.txt`, or `pip-sync requirements-dev.txt` with [pip-tools](https://pip-tools.readthedocs.io/en/stable/).
+
+To update the lock files, you can use any modern Python package manager (except Poetry) like [pip-tools](https://pip-tools.readthedocs.io/en/stable/), [PDM](https://pdm.fming.dev/) or [uv](https://uv.readthedocs.io/en/latest/), while defining `requirement.txt` and `requirement-dev.txt` as the output lock files.
+With [pip-tools](https://pip-tools.readthedocs.io/en/stable/), the command is:
+`pip-compile requirements.txt && pip-compile requirements-dev.txt`.
 
 ## CLI
 
 ### Create database structure
 
-Install udata-hydra dependencies and cli.
-`poetry install`
+Create a Python 3.9 virtual environment and activate it:
+`python3 -m venv .venv && source .venv/bin/activate`
+
+Install udata-hydra dependencies and cli:
+`pip install -r requirements.txt && pip install -r requirements-dev.txt`
+...or with `pip-sync` from [pip-tools](https://pip-tools.readthedocs.io/en/stable/):
+`pip-sync requirements.txt && pip-sync requirements-dev.txt`
 
-`poetry run udata-hydra migrate`
+`python3 udata-hydra migrate`
 
 ### Load (UPSERT) latest catalog version from data.gouv.fr
 
-`poetry run udata-hydra load-catalog`
+`python3 udata-hydra load-catalog`
 
 ## Crawler
 
-`poetry run udata-hydra-crawl`
+`python3 udata-hydra-crawl`
 
 It will crawl (forever) the catalog according to config set in `udata_hydra/config.toml`, with a default config in `udata_hydra/config_default.toml`.
 
@@ -57,11 +75,11 @@ If an URL matches one of the `EXCLUDED_PATTERNS`, it will never be checked.
 
 A job queuing system is used to process long-running tasks. Launch the worker with the following command:
 
-`poetry run rq worker -c udata_hydra.worker`
+`python3 rq worker -c udata_hydra.worker`
 
 Monitor worker status:
 
-`poetry run rq info -c udata_hydra.worker --interval 1`
+`python3 rq info -c udata_hydra.worker --interval 1`
 
 ## CSV conversion to database
 
@@ -71,13 +89,13 @@ Converted CSV tables will be stored in the database specified via `config.DATABA
 
 To run the tests, you need to launch the database, the test database, and the Redis broker with `docker compose -f docker-compose.yml -f docker-compose.test.yml -f docker-compose.broker.yml up -d`.
 
-Then you can run the tests with `poetry run pytest`.
+Then you can run the tests with `pytest`.
 
-To run a specific test file, you can pass the path to the file to pytest, like this: `poetry run pytest tests/test_app.py`.
+To run a specific test file, you can pass the path to the file to pytest, like this: `pytest tests/test_app.py`.
 
-To run a specific test function, you can pass the path to the file and the name of the function to pytest, like this: `poetry run pytest tests/test_app.py::test_get_latest_check`.
+To run a specific test function, you can pass the path to the file and the name of the function to pytest, like this: `pytest tests/test_app.py::test_get_latest_check`.
 
-If you would like to see print statements as they are executed, you can pass the -s flag to pytest (`poetry run pytest -s`). However, note that this can sometimes be difficult to parse.
+If you would like to see print statements as they are executed, you can pass the -s flag to pytest (`pytest -s`). However, note that this can sometimes be difficult to parse.
 
 ### Tests coverage
 
@@ -105,8 +123,9 @@ RESOURCES_ANALYSER_API_KEY = "api_key_to_change"
 ### Run
 
 ```bash
-poetry install
-poetry run adev runserver udata_hydra/app.py
+python3 -m venv .venv && source .venv/bin/activate
+pip install .
+python3 adev runserver udata_hydra/app.py
 ```
 
 ### Routes/endpoints