Calculates the histogram of a nominal or real variable, grouped by the nominal variables among the independent variables. Null values are ignored. Histogram edges are taken from the minValue and maxValue properties of the dependent variable. If these are not available, the edges are calculated dynamically from the dependent values (this won't work in distributed mode though).
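The edge selection described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names, the row format (a list of dicts), the equal-width binning, the default bin count, and the choice to skip out-of-range values are all assumptions made for illustration, and only the real-valued case is covered.

```python
from collections import defaultdict


def bin_edges(min_value, max_value, bins):
    """Return bins + 1 equally spaced edges from min_value to max_value."""
    step = (max_value - min_value) / bins
    return [min_value + i * step for i in range(bins)] + [max_value]


def grouped_histogram(rows, dependent, group_by,
                      min_value=None, max_value=None, bins=4):
    """Hypothetical helper: histogram of `dependent`, one per `group_by` value."""
    # Null values of the dependent variable are ignored.
    rows = [r for r in rows if r.get(dependent) is not None]
    values = [r[dependent] for r in rows]
    if not values and (min_value is None or max_value is None):
        raise ValueError("no data and no minValue/maxValue to derive edges from")

    # Prefer the declared bounds; fall back to the observed range otherwise.
    lo = min(values) if min_value is None else min_value
    hi = max(values) if max_value is None else max_value
    edges = bin_edges(lo, hi, bins)
    width = (hi - lo) / bins

    counts = defaultdict(lambda: [0] * bins)
    for r in rows:
        v = r[dependent]
        if v < lo or v > hi:
            continue  # assumption: values outside the declared range are skipped
        # The last bin is closed on the right, so v == hi lands in it.
        idx = 0 if width == 0 else min(int((v - lo) / width), bins - 1)
        counts[r[group_by]][idx] += 1
    return edges, dict(counts)


rows = [
    {"age": 10, "sex": "M"},
    {"age": 30, "sex": "F"},
    {"age": None, "sex": "F"},  # ignored
    {"age": 95, "sex": "M"},
    {"age": 100, "sex": "F"},
]
edges, counts = grouped_histogram(rows, "age", "sex", min_value=0, max_value=100)
```

When the bounds are omitted, the edges depend on the data a node happens to see, so two nodes holding different data would generally produce histograms that cannot be combined bin by bin.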
It has two modes:
compute --mode intermediate
compute --mode aggregate --job-ids 1 2 3
Intermediate mode calculates histograms on a single node, while aggregate mode runs after the intermediate jobs and combines their histograms into one result. Intermediate mode can also be used on its own to calculate histograms from a single node.
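As a rough illustration of the aggregate step, the sketch below sums per-group bin counts across several intermediate results. The result format (a pair of edges and a dict of counts per group) and the function name are hypothetical and not taken from the project. The sketch assumes that all jobs share the same bin edges, which is what fixed minValue and maxValue properties would provide.

```python
def aggregate_histograms(results):
    """Combine (edges, counts_by_group) pairs produced by intermediate jobs."""
    if not results:
        raise ValueError("no intermediate results to aggregate")

    edges = list(results[0][0])
    combined = {}
    for job_edges, counts_by_group in results:
        # Bin-wise addition is only meaningful when the bins line up.
        if list(job_edges) != edges:
            raise ValueError("intermediate histograms use different bin edges")
        for group, bin_counts in counts_by_group.items():
            total = combined.setdefault(group, [0] * len(bin_counts))
            for i, count in enumerate(bin_counts):
                total[i] += count
    return edges, combined


job_1 = ([0, 50, 100], {"M": [1, 2], "F": [0, 1]})
job_2 = ([0, 50, 100], {"M": [3, 0]})
edges, combined = aggregate_histograms([job_1, job_2])
```

A group that appears in only one job, such as "F" here, is carried over unchanged.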
Run: ./build.sh
Run: captain test
Run: ./publish.sh
Run unit tests:
find . -name '*.pyc' -delete
(cd tests; docker-compose run test_suite -x --ff --capture=no)