Data discrepancy checker

This task mirrors a system we recently built internally, and will give you an idea of the problems we need to solve.

Every quarter, new company data is provided to us in PDF format. We need to use an external service to extract this data from the PDF, and then validate it against data we have on file from another source.

Complete the API so that:

A user can provide a PDF and a company name.
Data is extracted from the PDF via the external service and compared to the data stored on file.
A summary of the data is returned, containing all fields from both sources and noting which fields did not match.
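
The comparison step in these requirements could be sketched roughly as follows. This is only an illustration, not the expected solution: the function name compare_records, the shape of the returned summary, and the example field names are all assumptions, since the real fields depend on what the PDF service and data/database.csv actually contain.

```python
from typing import Any


def compare_records(extracted: dict[str, Any], stored: dict[str, Any]) -> dict[str, Any]:
    """Build a field-by-field summary of two records, flagging mismatches.

    Hypothetical helper: the name and the summary layout are assumptions.
    """
    # Take the union of keys so a field present in only one source is still reported.
    all_fields = sorted(set(extracted) | set(stored))
    fields: dict[str, dict[str, Any]] = {}
    for name in all_fields:
        extracted_value = extracted.get(name)
        stored_value = stored.get(name)
        fields[name] = {
            "extracted": extracted_value,
            "stored": stored_value,
            # A field matches only if both sources have it and the values are equal.
            "match": name in extracted and name in stored and extracted_value == stored_value,
        }
    mismatched = [name for name, entry in fields.items() if not entry["match"]]
    return {"fields": fields, "mismatched_fields": mismatched}


if __name__ == "__main__":
    # Made-up example records; the real field names are not specified here.
    summary = compare_records(
        {"Company Name": "Acme", "Revenue": 100},
        {"Company Name": "Acme", "Revenue": 120, "Industry": "Tech"},
    )
    print(summary["mismatched_fields"])
```

A field missing from one source is treated as a mismatch in this sketch. Whether that is the right behaviour, and whether values need normalising (for example numbers stored as strings) before comparison, is a design decision left open by the requirements.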

A selection of example PDFs has been uploaded, and the PDF extraction service has been mocked for use in src/pdf_service.py. DO NOT EDIT THIS FILE. The service is briefly documented in PDF_SERVICE_DOCS.md. You can treat it as just another microservice.

The existing data we have on file is available in the data/database.csv file.
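
One possible way to load the stored records is with the standard library csv module, keyed by company name for lookup. This is a minimal sketch under stated assumptions: the helper name load_records and the default key column "Company Name" are hypothetical, so check the actual header row of data/database.csv before relying on them.

```python
import csv
from pathlib import Path


def load_records(csv_path: Path, key_column: str = "Company Name") -> dict[str, dict[str, str]]:
    """Read a CSV file into a dict of rows keyed by `key_column`.

    Hypothetical helper: the default column name is an assumption,
    not taken from the real data file.
    """
    with csv_path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Fail early with a clear message if the expected column is absent.
        if reader.fieldnames is None or key_column not in reader.fieldnames:
            raise KeyError(f"column {key_column!r} not found in {csv_path}")
        # If a key appears in several rows, the last row wins.
        return {row[key_column]: row for row in reader}
```

Note that csv.DictReader returns every value as a string, so numeric fields would likely need converting before they are compared with values extracted from a PDF.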

Treat this code as if it will be deployed to production, following best practices where possible.

Setup using Poetry

The easiest way to set up the repository is to use python-poetry. The lock file was generated using version 1.8.3.

Ensure poetry is installed
Run make install

Setup without Poetry

Alternatively, you can pip install the dependencies directly using either pyproject.toml or requirements.txt.

nurzhanizbassov/python-coding-test