These tutorials require the IJava Jupyter notebook kernel and Java 10+.
The tutorials expect the data and required jars to be in the same directory as the notebooks. The dataset download
links are given in each tutorial. Tribuo's jars are available on Maven Central and attached to the GitHub release, or
you can build them yourself with `mvn clean package` using Apache Maven.
In most cases the code in the tutorials should work on Java 8 once you add explicit types in place of the `var`
keyword (added in Java 10) and replace the collection factory methods introduced in Java 9. The exception is the
reproducibility tutorial, which requires Java 16+ because the reproducibility package uses newer Java features.
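As a rough illustration of the Java 8 changes described above, the sketch below shows the same snippet written twice: once with `var` and the Java 9 collection factories, and once with explicit types and pre-Java 9 collection construction. The class name `Java8Port`, its method names, and the sample data are hypothetical and do not come from the tutorials. The class as a whole needs Java 10+ to compile because of `modern()`; only the body of `java8()` is what you would keep on Java 8.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example class, not part of Tribuo or the tutorials.
public class Java8Port {

    // Java 10+ style: 'var' for local variables, List.of/Map.of factories from Java 9.
    public static Map<String, List<Integer>> modern() {
        var labels = List.of(1, 2, 3);
        var lookup = Map.of("train", labels);
        return lookup;
    }

    // Java 8 style: explicit local types, collections built by hand and wrapped as unmodifiable.
    public static Map<String, List<Integer>> java8() {
        List<Integer> labels = Collections.unmodifiableList(Arrays.asList(1, 2, 3));
        Map<String, List<Integer>> lookup = new HashMap<>();
        lookup.put("train", labels);
        return Collections.unmodifiableMap(lookup);
    }

    public static void main(String[] args) {
        // Both versions hold the same contents.
        System.out.println(modern().equals(java8()));
    }
}
```

One difference to keep in mind when porting: `List.of` and `Map.of` reject `null` elements, while the Java 8 replacements shown here accept them.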
The tutorials cover:
- Intro classification with Irises
- Intro regression with wine-quality
- Configuration files, provenance and feature transformations on MNIST
- Clustering with K-Means
- Clustering with HDBSCAN*
- Anomaly Detection with LibSVM
- Multi-label classification with Classifier Chains
- Loading columnar data
- Document classification and extracting features from text
- Importing third-party models
- Training and deploying TensorFlow models
- ONNX export and deployment
- Model reproducibility
- Documenting Tribuo Models with Model Cards
- Feature selection for classification problems