Adds classifier chains as a generic multi-label classifier #149

Merged: 11 commits, Jul 7, 2021

Craigacp (Member) commented Jul 4, 2021

Description

Adds ClassifierChainTrainer and ClassifierChainModel, plus CCEnsembleTrainer; the trainers train a single classifier chain and an ensemble of classifier chains respectively. The ensemble required a multi-label voting combiner, which also allows bagging and random forests to be used on multi-label problems in Tribuo, in addition to supporting classifier chain ensembles.

It also adds the Jaccard score as a multi-label evaluation metric, as it was useful while testing this implementation.
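
For readers unfamiliar with the metric, here is a minimal sketch of a multi-label Jaccard score in plain Java. It is not the Tribuo implementation: the class and method names are hypothetical, labels are modelled as plain strings, and treating two empty label sets as a perfect match is an assumption.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: JaccardSketch and its methods are hypothetical names, not Tribuo API.
final class JaccardSketch {

    /** Jaccard score for one example: size of the intersection divided by size of the union. */
    static double jaccard(Set<String> predicted, Set<String> truth) {
        if (predicted.isEmpty() && truth.isEmpty()) {
            // Assumption: two empty label sets count as perfect agreement.
            return 1.0;
        }
        Set<String> intersection = new HashSet<>(predicted);
        intersection.retainAll(truth);
        Set<String> union = new HashSet<>(predicted);
        union.addAll(truth);
        return (double) intersection.size() / union.size();
    }

    /** Mean of the per-example Jaccard scores. */
    static double meanJaccard(List<Set<String>> predictions, List<Set<String>> truths) {
        if (predictions.isEmpty() || predictions.size() != truths.size()) {
            throw new IllegalArgumentException("Need two non-empty lists of equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < predictions.size(); i++) {
            sum += jaccard(predictions.get(i), truths.get(i));
        }
        return sum / predictions.size();
    }

    public static void main(String[] args) {
        List<Set<String>> predictions = List.of(Set.of("a", "b"), Set.of("c"));
        List<Set<String>> truths = List.of(Set.of("b", "c"), Set.of("c"));
        System.out.println(meanJaccard(predictions, truths));
    }
}
```

Unlike exact-match accuracy, this gives partial credit when a prediction gets some of an example's labels right.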

As part of this work there is a small performance improvement to IndependentMultiLabelTrainer so it doesn't recreate the dataset for each label. That trainer now also correctly throws an exception if it is used inside HashingTrainer, as the hashing mechanism interferes with the multi-label conversion mechanism. Making the two compatible again is future work (the same applies to ClassifierChainTrainer, which is also incompatible with HashingTrainer).

Finally, this adds two methods to MutableDataset which allow the feature and output domains to be regenerated on demand. We'll look at using this mechanism later to fix issues with DatasetView, which also needs the regeneration functionality.

Motivation

Classifier chains allow the incorporation of label dependence and correlation into multi-label predictions without inducing much more overhead than the current independent prediction model used as a baseline and as implemented in the LinearSGDTrainer. This should improve performance where the labels are not independent of each other (which is probably true in most multi-label problems).
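
To illustrate the idea, here is a rough sketch of chain prediction in plain Java. It is not the code in this PR: the class, the `demoLinks` helper and the hand-written rules standing in for trained binary classifiers are all hypothetical. Each link sees the original features plus the predictions of every earlier label in the chain, so a later label can depend on an earlier one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Illustrative only: ClassifierChainSketch is a hypothetical name, not Tribuo API.
final class ClassifierChainSketch {

    /** Predicts labels in chain order; link i receives the features plus labels 0..i-1. */
    static boolean[] predict(List<Predicate<double[]>> links, double[] features) {
        boolean[] labels = new boolean[links.size()];
        for (int i = 0; i < links.size(); i++) {
            // Copy the original features and append one slot per earlier label.
            double[] input = Arrays.copyOf(features, features.length + i);
            for (int j = 0; j < i; j++) {
                input[features.length + j] = labels[j] ? 1.0 : 0.0;
            }
            labels[i] = links.get(i).test(input);
        }
        return labels;
    }

    /** Two hand-written rules standing in for trained binary classifiers (an assumption). */
    static List<Predicate<double[]>> demoLinks() {
        List<Predicate<double[]>> links = new ArrayList<>();
        // Label 0 uses the two original features only.
        links.add(x -> x[0] > 0.5);
        // Label 1 also requires label 0 (appended at index 2) to be predicted.
        links.add(x -> x[1] > 0.5 && x[2] == 1.0);
        return links;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(predict(demoLinks(), new double[] {0.9, 0.9})));
        System.out.println(Arrays.toString(predict(demoLinks(), new double[] {0.1, 0.9})));
    }
}
```

In the second call the feature for label 1 is high, but the chain suppresses it because label 0 was not predicted; an independent per-label model could not express that. In the referenced paper, the extra inputs at training time are the true values of the earlier labels rather than predictions.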

Paper reference

Classifier chains and ensembles thereof are introduced in:
Read, J., Pfahringer, B., Holmes, G., & Frank, E. (2011). Classifier chains for multi-label classification. Machine learning, 85(3), 333-359.
There's a review on the topic published this year, and we may investigate some of the extensions described therein if the technique proves useful.
Read, J., Pfahringer, B., Holmes, G., & Frank, E. (2021). Classifier chains: a review and perspectives. Journal of Artificial Intelligence Research, 70, 683-718.

pogren (Member) left a comment

I ran code coverage analysis for the unit tests and the coverage is really good overall at 76.7%, with no concerning gaps.
ClassifierChainTrainer lines 222-227 are nearly identical to ClassifierChainModel lines 99-106. It looks like you could consolidate them into a single (static) method?
It seems like the method MultiLabelVotingCombiner.combine could delegate to the other combine method with equal weights.
ClassifierChainModel has just one inner trainer. It seems like it might be useful to have different hyperparameters and/or different trainers for each label.

Craigacp (Member, Author) commented Jul 7, 2021

I agree there's a bit of duplication there in the building of new features, but I'd still need the if statement in ClassifierChainModel to build the prediction, and so I'm not sure it would save much code (though it would mean there's only one occurrence of the ugly logic that does it).

WRT MultiLabelVotingCombiner, yeah, I could allocate a weight array. It's already doing a bunch of allocation so one more won't hurt, but I'll tidy that up later as it's the same in VotingCombiner and FullyWeightedVotingCombiner.
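
The delegation being discussed could look roughly like the sketch below. This is plain Java under stated assumptions, not the actual MultiLabelVotingCombiner code: the class name, the use of `boolean[]` per ensemble member, and the strict-majority threshold are all hypothetical choices for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: VotingSketch is a hypothetical name, not Tribuo API.
final class VotingSketch {

    /** Unweighted vote: allocates an equal weight array and delegates to the weighted method. */
    static boolean[] combine(List<boolean[]> predictions) {
        double[] weights = new double[predictions.size()];
        Arrays.fill(weights, 1.0);
        return combine(predictions, weights);
    }

    /** Weighted vote: a label is emitted when it holds more than half of the total weight. */
    static boolean[] combine(List<boolean[]> predictions, double[] weights) {
        if (predictions.isEmpty() || predictions.size() != weights.length) {
            throw new IllegalArgumentException("Need one weight per ensemble member");
        }
        int numLabels = predictions.get(0).length;
        double[] votes = new double[numLabels];
        double totalWeight = 0.0;
        for (int m = 0; m < predictions.size(); m++) {
            boolean[] member = predictions.get(m);
            totalWeight += weights[m];
            for (int l = 0; l < numLabels; l++) {
                if (member[l]) {
                    votes[l] += weights[m];
                }
            }
        }
        boolean[] combined = new boolean[numLabels];
        for (int l = 0; l < numLabels; l++) {
            // Assumption: strict majority of the total weight is required.
            combined[l] = votes[l] > totalWeight / 2.0;
        }
        return combined;
    }

    public static void main(String[] args) {
        List<boolean[]> members = List.of(
            new boolean[] {true, false, true},
            new boolean[] {true, true, false},
            new boolean[] {false, false, true});
        System.out.println(Arrays.toString(combine(members)));
        System.out.println(Arrays.toString(combine(members, new double[] {1.0, 1.0, 5.0})));
    }
}
```

The cost is one extra array per call, and the voting logic then lives in a single place.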

WRT different inner trainers, yes, that might be interesting to look at. I was also considering doing bootstrap samples for the datasets, which would increase variability and thus could improve performance. This PR is a straight implementation of the referenced paper; we can look at doing different extensions later.
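
As a rough illustration of the bootstrap idea mentioned above (not something this PR implements), each ensemble member could be trained on a dataset of the original size drawn with replacement. The sketch below only produces the example indices; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative only: BootstrapSketch is a hypothetical name, not Tribuo API.
final class BootstrapSketch {

    /** Draws datasetSize indices uniformly with replacement from [0, datasetSize). */
    static int[] bootstrapIndices(int datasetSize, long seed) {
        if (datasetSize <= 0) {
            throw new IllegalArgumentException("datasetSize must be positive");
        }
        Random rng = new Random(seed);
        int[] indices = new int[datasetSize];
        for (int i = 0; i < datasetSize; i++) {
            indices[i] = rng.nextInt(datasetSize);
        }
        return indices;
    }

    public static void main(String[] args) {
        // A different seed per ensemble member gives each chain a different sample.
        for (long member = 0; member < 3; member++) {
            System.out.println(Arrays.toString(bootstrapIndices(10, member)));
        }
    }
}
```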

Craigacp merged commit 78f8b3f into main on Jul 7, 2021
Craigacp deleted the classifier-chaining branch on Jul 7, 2021 17:05