feat: manage large resources exceptions differently #148
Merged
41 commits
f1e8f24
feat: add create resources_exceptions table migration
bolinocroustibat 87c4780
docs: update changelog
bolinocroustibat 2b744f2
feat: create migrations
bolinocroustibat b022bea
feat: add create resource exception endpoint
bolinocroustibat d9c4191
feat: use resources_expections table instead of LARGE_RESOURCES_EXCEP…
bolinocroustibat 0190b51
feat: update migration to create resources_exceptions table
bolinocroustibat 4508648
Merge branch 'main' into manage-resource-exceptions
bolinocroustibat 526a7bb
Merge branch 'main' into manage-resource-exceptions
bolinocroustibat 27632c7
fix: fix types
bolinocroustibat 18edcd1
feat: resources_exceptions.table_indexes instead of resources_excepti…
bolinocroustibat 5088a51
fix: fix method to check if resource is an exception, and clean Resou…
bolinocroustibat 39b7c28
fix: fix types
bolinocroustibat fa3a963
fix: fix migration when dropping tables with foreign keys
bolinocroustibat aa7240f
fix: remove inserting data in resources_exceptions table
bolinocroustibat 9f1950c
fix: fix project metadata loading (#157)
bolinocroustibat d4fc584
fix: fix method to check if resource is an exception
bolinocroustibat f1058ce
Merge branch 'main' into manage-resource-exceptions
bolinocroustibat bb17714
tests: fix tests
bolinocroustibat a9fd037
fix: fix loading of table_indexes column
bolinocroustibat f1461da
tests: add setup_resources_exceptions in conftest and fix test_except…
bolinocroustibat 17c2541
tests: remove useless args
bolinocroustibat 5178358
fix: fix index creation logic
bolinocroustibat 1f4e210
docs: fix doctsring
bolinocroustibat d0005d0
docs: fix docstrings
bolinocroustibat 37d64c9
Merge branch 'main' into manage-resource-exceptions
bolinocroustibat f9c79d1
fix: fix CRUD method to insert a resource exception with dict of inde…
bolinocroustibat 0f06634
fix: fix create table with unique index
bolinocroustibat c3a81df
feat: add list of allowed indexes
bolinocroustibat 61a3af5
tests: add tests
bolinocroustibat ccf77dc
fix: fix index creation logic, use slugify
bolinocroustibat d0cb339
fix: only one type of index type for now
bolinocroustibat cae924b
tests: finish testing index creation
bolinocroustibat 75261f2
feat: add routes to get and delete resources exceptions
bolinocroustibat dcefa29
fix: fix SQL query bug due to typo
bolinocroustibat 938bb78
fix: revert cli.py which was only used for testing
bolinocroustibat a15fcef
Merge branch 'main' into manage-resource-exceptions
bolinocroustibat d6a5a2a
Merge branch 'main' into manage-resource-exceptions
bolinocroustibat 0281b8e
tests: fix wrong test file location
bolinocroustibat 56b1aba
Merge branch 'main' into manage-resource-exceptions
bolinocroustibat baabec5
Merge branch 'main' into manage-resource-exceptions
bolinocroustibat c933b16
docs: better naming in SQL query
bolinocroustibat
@@ -0,0 +1,72 @@
```python
import hashlib
import json
import logging

import pytest
from asyncpg import Record

from tests.conftest import RESOURCE_EXCEPTION_ID, RESOURCE_EXCEPTION_TABLE_INDEXES
from udata_hydra import config
from udata_hydra.analysis.csv import analyse_csv
from udata_hydra.db.resource import Resource
from udata_hydra.db.resource_exception import ResourceException
from udata_hydra.utils.db import get_columns_with_indexes

pytestmark = pytest.mark.asyncio


log = logging.getLogger("udata-hydra")


async def test_exception_analysis(
    setup_catalog_with_resource_exception, rmock, db, fake_check, produce_mock
):
    """
    Tests that exception resources (files that are too large to be normally processed) are indeed processed.
    """
    # Change config to accept large files
    save_config = config.MAX_FILESIZE_ALLOWED
    config.override(MAX_FILESIZE_ALLOWED={"csv": 5000})

    # Create a previous fake check for the resource
    check = await fake_check(resource_id=RESOURCE_EXCEPTION_ID)
    filename, expected_count = ("20190618-annuaire-diagnostiqueurs.csv", 45522)
    url = check["url"]
    table_name = hashlib.md5(url.encode("utf-8")).hexdigest()
    with open(f"tests/data/{filename}", "rb") as f:
        data = f.read()
    rmock.get(url, status=200, body=data)

    # Check resource status before analysis
    resource = await Resource.get(RESOURCE_EXCEPTION_ID)
    assert resource["status"] is None

    # Analyse the CSV
    await analyse_csv(check_id=check["id"])

    # Check resource status after analysis
    resource = await Resource.get(RESOURCE_EXCEPTION_ID)
    assert resource["status"] is None

    # Check the table has been created in CSV DB, with the expected number of rows, and get the columns
    row: Record = await db.fetchrow(f'SELECT *, count(*) over () AS count FROM "{table_name}"')
    assert row["count"] == expected_count

    # Check if indexes have been created for the table
    expected_columns_with_indexes = list(RESOURCE_EXCEPTION_TABLE_INDEXES.keys())
    expected_columns_with_indexes.append("__id")
    indexes: list[Record] | None = await get_columns_with_indexes(table_name)
    assert indexes
    for idx in indexes:
        assert idx["table_name"] == table_name
        assert idx["column_name"] in expected_columns_with_indexes

    # Check the profile has been saved in the tables_index
    profile = await db.fetchrow(
        "SELECT csv_detective FROM tables_index WHERE resource_id = $1", check["resource_id"]
    )
    profile = json.loads(profile["csv_detective"])
    for attr in ("header", "columns", "formats", "profile"):
        assert profile[attr]
    assert profile["total_lines"] == expected_count
    config.override(MAX_FILESIZE_ALLOWED=save_config)
```
Would it make sense to have `config` be a context manager, to make sure that its value goes back to the original even when the test fails (and thus the end of its code doesn't get executed)? Maybe make it a fixture?
Yes, it could be easier indeed, good idea. I'll keep that for another PR.
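The idea discussed in this thread could look roughly like the sketch below. It is not part of this PR and not udata-hydra's actual API: `FakeConfig` is a hypothetical stand-in that only mimics the `config.override(**kwargs)` call used in the test, and `temporary_override` is an illustrative name. The point is the `try`/`finally`, which restores the saved values even if the body raises.

```python
from contextlib import contextmanager
from typing import Any, Iterator


class FakeConfig:
    """Hypothetical stand-in for the project's config object (illustration only)."""

    MAX_FILESIZE_ALLOWED = {"csv": 100}

    def override(self, **kwargs: Any) -> None:
        # Mimics the override(**kwargs) call seen in the test: set each key as an attribute.
        for key, value in kwargs.items():
            setattr(self, key, value)


@contextmanager
def temporary_override(cfg: FakeConfig, **overrides: Any) -> Iterator[FakeConfig]:
    """Apply overrides for the duration of the block, then restore the previous values."""
    # Remember the current value of every key we are about to change.
    saved = {key: getattr(cfg, key) for key in overrides}
    cfg.override(**overrides)
    try:
        yield cfg
    finally:
        # Runs on normal exit and on exceptions (e.g. a failing assert).
        cfg.override(**saved)
```

In a test this would be used as `with temporary_override(cfg, MAX_FILESIZE_ALLOWED={"csv": 5000}): ...`. The same pattern should also fit a pytest fixture that yields inside the `with` block, assuming the test suite wants the override scoped per test.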