garciadias/Applied_data_scientist_take-home

Applied data scientist take-home task Code

This repository contains the solution to the code test for an Applied data scientist position. The presentation of the results can be found on this page.

To run the code, you need to download the data and save it at the path: data/vitamin_d_test_results_2022_cleaned.csv

I am not sharing the data here because I am not the owner, and I want to make this repository public.
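
Because the data file is not shipped with the repository, it can help to confirm it is in place before running anything else. The snippet below is a minimal sketch, not part of the repository: the function name `data_file_ready` is hypothetical, and only the path comes from the instructions above.

```python
from pathlib import Path

# Path taken from the instructions above; everything else is illustrative.
DATA_PATH = Path("data/vitamin_d_test_results_2022_cleaned.csv")


def data_file_ready(path: Path = DATA_PATH) -> bool:
    """Return True if the file exists and is not empty (hypothetical helper)."""
    return path.is_file() and path.stat().st_size > 0


if __name__ == "__main__":
    if data_file_ready():
        print(f"Found data file at {DATA_PATH}")
    else:
        print(f"Missing or empty data file: {DATA_PATH}")
```

Run it from the repository root so that the relative path resolves correctly.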

To start, you need to create the Python virtual environment. Here I am using conda:

conda env create -f conda.yml

Then, you need to activate the environment:

conda activate thriva

To run the code, you need to start by cleaning the data:

python thriva/clean_data.py

Before running the analysis, you can take a look at the data by using pandas profiling:

pandas_profiling --title "Clean data" ./data/vitamin_d_test_results_2022_cleaned.csv reports/report.html

Then you can run the analysis:

python thriva/task_1.py
python thriva/task_2.py
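
If you would rather run the cleaning and analysis steps in one go, a small wrapper along these lines could work. This is a sketch under assumptions, not a script that exists in the repository: the names `STEPS`, `build_commands`, and `run_all` and the `--execute` flag are hypothetical, and the optional profiling step is left out. Only the three script paths come from the commands above.

```python
import subprocess
import sys

# Script paths as listed above, in the order they should run.
STEPS = [
    "thriva/clean_data.py",
    "thriva/task_1.py",
    "thriva/task_2.py",
]


def build_commands(python: str = sys.executable) -> list:
    """Build one command (a list of arguments) per step."""
    return [[python, step] for step in STEPS]


def run_all(dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in build_commands():
        print("Running:", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on failure,
            # so later steps never run on incomplete output.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only prints the plan unless --execute is passed (hypothetical flag).
    run_all(dry_run="--execute" not in sys.argv)
```

Defaulting to a dry run is a deliberate choice here: the plan is printed first, and nothing is executed until you ask for it explicitly.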

If you prefer to run the code in a Jupyter notebook, you can use the following commands:

python -m ipykernel install --user --name thriva --display-name "thriva"
jupyter-lab

This will open a JupyterLab session in your browser. You can then open the notebooks in the notebooks folder. To run the code, you must select the thriva kernel.

To run the tests:

python -m pytest

To run the tests with coverage:

python -m pytest --cov=thriva --cov-report=html
