From 7183a811c7de5b363c17484ca77ffdb037341f22 Mon Sep 17 00:00:00 2001
From: Anthony Susevski
Date: Tue, 6 Feb 2024 18:57:03 -0500
Subject: [PATCH] add colab button

---
 .../Unit 3 - Vision Transformers/KnowledgeDistillation.mdx | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/chapters/en/Unit 3 - Vision Transformers/KnowledgeDistillation.mdx b/chapters/en/Unit 3 - Vision Transformers/KnowledgeDistillation.mdx
index a986a205e..aed878f1e 100644
--- a/chapters/en/Unit 3 - Vision Transformers/KnowledgeDistillation.mdx
+++ b/chapters/en/Unit 3 - Vision Transformers/KnowledgeDistillation.mdx
@@ -33,4 +33,9 @@ The distillation loss is formulated as:
 
 Where the KL loss refers to the Kullback-Leibler Divergence between the teacher and the student's output distributions. The overall loss for the student model is then formulated as the sum of this distillation loss with the standard cross-entropy loss over the ground-truth labels.
 
-To see this loss function implemented in Python as well as a fully worked out example in Python, lets check out the notebook for this section, ```KnowledgeDistillation.ipynb```.
+To see this loss function implemented in Python, along with a fully worked example, let's check out the notebook for this section, `KnowledgeDistillation.ipynb`:
+
+<a href="...">
+  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
+</a>
+
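For quick reference outside the notebook, here is a minimal PyTorch sketch of the objective described in the hunk above: the KL divergence between the teacher's and the student's output distributions, summed with the standard cross-entropy loss on the ground-truth labels. The function name `distillation_loss`, the `temperature` softening of the logits, and the `temperature**2` scaling are illustrative assumptions and are not taken from `KnowledgeDistillation.ipynb`.

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0):
    # Soften both output distributions with a temperature before comparing them
    # (the temperature value and its use here are assumptions, not the notebook's settings).
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)

    # KL divergence between the teacher's and the student's distributions;
    # the temperature**2 factor keeps gradients on a comparable scale across temperatures.
    kd_loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature**2

    # Standard cross-entropy against the ground-truth labels.
    ce_loss = F.cross_entropy(student_logits, labels)

    # Overall student loss: sum of the distillation term and the supervised term.
    return kd_loss + ce_loss


# Toy usage: a batch of 8 examples over 10 classes with random logits.
student_logits = torch.randn(8, 10)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
```

The temperature is a common knob in distillation setups: higher values flatten the teacher's distribution so the student also learns from the relative probabilities of the wrong classes, not just the argmax.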