All our Python code is written using Python 3.7. The following instructions will set up a Python 3.7 virtual environment.
If you don't have pip3 or venv installed (you can check with pip3 --version):
sudo apt install python3-pip
sudo apt install python3.7-venv
Make a virtual environment and install dependencies:
make env
make install
Make sure you are in tagprobot-ml/python/.
make run
We use mypy for type checking, flake8 for style enforcement, and unittest for tests. You can run them like this:
make mypy # type checker
make lint # style checker
make test
There is also a script which runs all three:
make build
It can also auto-format code first:
make build_fix
If you add a package, make sure you update requirements.txt with:
make freeze
You can also manage your own Python environment. Here is how to create and activate a Python 3.7 virtual environment with the required dependencies, then run the code and the checks by hand:
python3.7 -m venv ENV # make virtual env
source ENV/bin/activate # enter virtual env
pip install -r requirements.txt # install dependencies
python main.py # run the code
mypy . # run type checker
flake8 # run style checker
python -m unittest # run unit tests
./build # run the build script
./build --fix # run the build script with auto-format
pip freeze > requirements.txt # update requirements.txt after adding packages
- Use type hints for function and class arguments. For self-referential classes, use string literals to denote types. This is a workaround for a Python peculiarity: you cannot reference a class until after it has been defined. For example, to write a binary tree node class, we annotate it like this:
    class Node:
        """Binary tree node."""
        def __init__(self, left: 'Node', right: 'Node'):
            self.left = left
            self.right = right
- Use reST-style docstrings for functions when the type annotations are not sufficient documentation.
- We bias towards immutability and purity. However, this approach has performance drawbacks in Python, so use your best judgment. A good rule of thumb is that a function should either return a value or perform a side effect, but not both. For side-effecting functions, a return annotation of -> None indicates that the function's purpose is an effect.
  - See point_update_performance.py for a performance analysis of different options for handling data.
- We use dataclasses heavily to avoid writing class boilerplate. Most classes should be dataclasses. For example:
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class MyClass:
        x: int
        y: int
        s: str

    my_class = MyClass(x=1, y=2, s="foo")
    x = my_class.x
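
The guidelines above can be combined in one small sketch. Point, Node, shifted, and record are hypothetical names chosen for illustration and are not taken from this repository. The sketch assumes Python 3.7 and shows a frozen dataclass, a string-literal self-reference, a pure function with a reST-style docstring, and a side-effecting function annotated with -> None.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Point:
    """Hypothetical immutable value type, used only for illustration."""
    x: int
    y: int


@dataclass(frozen=True)
class Node:
    """Binary tree node; the string literal lets the class refer to itself."""
    value: Point
    left: Optional['Node'] = None
    right: Optional['Node'] = None


def shifted(point: Point, dx: int) -> Point:
    """Return a copy of ``point`` moved along the x axis.

    :param point: the point to copy
    :param dx: signed distance to move along x
    :return: a new Point; the input is left untouched
    """
    # replace() builds a new frozen instance instead of mutating the old one.
    return replace(point, x=point.x + dx)


def record(point: Point, log: List[Point]) -> None:
    """Side effect only: append ``point`` to ``log`` and return nothing."""
    log.append(point)


origin = Point(x=0, y=0)
moved = shifted(origin, dx=3)      # pure: origin is unchanged
history: List[Point] = []
record(moved, history)             # effect: history now holds moved
root = Node(value=origin, left=Node(value=moved))
```

Using dataclasses.replace keeps the update pure at the cost of allocating a new object on every call, which is the kind of trade-off the performance note above refers to.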