# knowledge-distillation

This repository contains experiments with Knowledge Distillation on a variety of datasets and tasks.

Knowledge Distillation is a technique for producing lightweight, production-suitable models while sacrificing very little model performance.

It makes use of a unique training process:

  1. A heavy and complex model (the Teacher) is first trained to a high level of performance on the task.
  2. A lighter and simpler model (the Student) is then trained on a combination of hard targets (the ground-truth labels) and soft targets (the Teacher's output probabilities, softened from its logits).

Deeper details of the training, such as the loss function and its hyperparameters, are described in the notebooks; a minimal sketch of the combined loss is shown below.
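The following is only an illustrative sketch of a typical distillation loss in PyTorch, not the exact implementation from the notebooks; the function name, `temperature`, and `alpha` values here are assumptions chosen for clarity.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Blend soft-target and hard-target losses (illustrative defaults)."""
    # Soft targets: teacher and student distributions softened by the temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL divergence between the softened distributions, scaled by T^2 so the
    # gradient magnitude stays comparable to the hard-target term.
    soft_loss = F.kl_div(log_soft_student, soft_teacher,
                         reduction="batchmean") * temperature ** 2

    # Hard targets: standard cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

In this kind of setup, `alpha` weights how much the Student learns from the Teacher's soft targets versus the raw labels, and the temperature controls how much of the Teacher's inter-class similarity information is exposed.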

The advantages of using the Student model are that it is much faster at inference and, interestingly, preserves much of the Teacher's "dark knowledge": the inter-class similarity structure carried by the soft targets.