
Commit

update: lab02
curieuxjy committed Mar 3, 2024
1 parent 5cf7bb7 commit e846e4c
Showing 4 changed files with 42 additions and 52 deletions.
2 changes: 1 addition & 1 deletion docs/index.html
Original file line number Diff line number Diff line change
@@ -205,7 +205,7 @@ <h5 class="quarto-listing-category-title">Categories</h5><div class="quarto-list

<div class="quarto-listing quarto-listing-container-default" id="listing-listing">
<div class="list quarto-listing-default">
<div class="quarto-post image-right" data-index="0" data-categories="lab,quantization,linear,kmeans" data-listing-date-sort="1709650800000" data-listing-file-modified-sort="1709477179403" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="11" data-listing-word-count-sort="2182">
<div class="quarto-post image-right" data-index="0" data-categories="lab,quantization,linear,kmeans" data-listing-date-sort="1709650800000" data-listing-file-modified-sort="1709477947295" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="11" data-listing-word-count-sort="2158">
<div class="thumbnail">
<p><a href="./posts/labs/lab02.html" class="no-external"></a></p><a href="./posts/labs/lab02.html" class="no-external">
<p><img src="./images/lab02/kmeans.png" class="thumbnail-image"></p>
42 changes: 23 additions & 19 deletions docs/posts/labs/lab02.html
@@ -239,7 +239,6 @@ <h2 id="toc-title">On this page</h2>
<li><a href="#하드웨어-지원" id="toc-하드웨어-지원" class="nav-link" data-scroll-target="#하드웨어-지원">Hardware support:</a></li>
<li><a href="#요약" id="toc-요약" class="nav-link" data-scroll-target="#요약">Summary:</a></li>
</ul></li>
<li><a href="#feedback" id="toc-feedback" class="nav-link" data-scroll-target="#feedback">Feedback</a></li>
</ul>
</nav>
</div>
@@ -294,7 +293,7 @@ <h1 class="title">👩‍💻 Lab 2</h1>
<h1><strong>Lab 2: Quantization</strong></h1>
<section id="goals" class="level2">
<h2 class="anchored" data-anchor-id="goals">Goals</h2>
<p>In this assignment, you will practice <strong>quantizing</strong> a classic <strong>neural network model</strong> to reduce both model size and latency. The goals of this assignment are:</p>
<p>In this lab, you will practice <strong>quantizing</strong> a classic <strong>neural network model</strong> to reduce both model size and latency. The goals of this lab are:</p>
<ul>
<li>Understand the basic concept of <strong>quantization</strong>.</li>
<li>Implement and apply <strong>k-means quantization</strong>.</li>
@@ -308,7 +307,12 @@ <h2 class="anchored" data-anchor-id="goals">Goals</h2>
<section id="contents" class="level2">
<h2 class="anchored" data-anchor-id="contents">Contents</h2>
<p>The main sections consist of two parts: <strong><em>K-Means Quantization</em></strong> and <strong><em>Linear Quantization</em></strong>.</p>
<p>This lab notebook walks you through a total of <strong><em>10</em></strong> questions: - For <em>K-Means Quantization</em>, there are <strong><em>3</em></strong> questions (Questions 1-3). - For <em>Linear Quantization</em>, there are <strong><em>6</em></strong> questions (Questions 4-9). - Question 10 compares k-means quantization and linear quantization.</p>
<p>This lab notebook walks you through a total of <strong><em>10</em></strong> questions:</p>
<ul>
<li>For <em>K-Means Quantization</em>, there are <strong><em>3</em></strong> questions (Questions 1-3).</li>
<li>For <em>Linear Quantization</em>, there are <strong><em>6</em></strong> questions (Questions 4-9).</li>
<li>Question 10 compares k-means quantization and linear quantization.</li>
</ul>
<blockquote class="blockquote">
<p>The setup portion of the lab notebook can be found by opening the Colaboratory Note. It is omitted from this post so you can focus on the lab content itself.</p>
</blockquote>
@@ -346,7 +350,11 @@ <h1>K-Means Quantization</h1>
<p><strong><em>quantized_weight</em> = <em>codebook.centroids</em>[<em>codebook.labels</em>].view_as(weight)</strong></p>
</blockquote>
<p><span class="math inline">\(n\)</span>-bit k-means <strong>quantization</strong> partitions the synapses into <span class="math inline">\(2^n\)</span> clusters, and synapses within the same cluster share the same weight value.</p>
<p>Thus, k-means <strong>quantization</strong> produces the following codebook: * <code>centroids</code>: <span class="math inline">\(2^n\)</span> fp32 cluster centers. * <code>labels</code>: an <span class="math inline">\(n\)</span>-bit integer tensor with the same #elements as the original fp32 weight tensor. Each integer indicates which cluster the weight belongs to.</p>
<p>Thus, k-means <strong>quantization</strong> produces the following codebook:</p>
<ul>
<li><code>centroids</code>: <span class="math inline">\(2^n\)</span> fp32 cluster centers.</li>
<li><code>labels</code>: an <span class="math inline">\(n\)</span>-bit integer tensor with the same #elements as the original fp32 weight tensor. Each integer indicates which cluster the weight belongs to.</li>
</ul>
<p>During inference, an fp32 tensor is generated from the codebook:</p>
<blockquote class="blockquote">
<p><strong><em>quantized_weight</em> = <em>codebook.centroids</em>[<em>codebook.labels</em>].view_as(weight)</strong></p>
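The codebook construction and centroid lookup described above can be illustrated with a minimal sketch (a plain PyTorch k-means loop, not the lab's actual implementation; `kmeans_quantize`, `n_bits`, and the iteration count are illustrative assumptions):

```python
import torch

def kmeans_quantize(weight: torch.Tensor, n_bits: int = 2, n_iters: int = 20):
    # Hypothetical helper, not the lab's implementation: cluster the weights
    # into 2^n centroids and record each weight's cluster label.
    flat = weight.flatten()
    n_clusters = 2 ** n_bits
    # Initialize centroids evenly over the weight range
    centroids = torch.linspace(flat.min().item(), flat.max().item(), n_clusters)
    for _ in range(n_iters):
        # Assign every weight to its nearest centroid
        labels = (flat[:, None] - centroids[None, :]).abs().argmin(dim=1)
        # Move each centroid to the mean of its assigned weights
        for k in range(n_clusters):
            mask = labels == k
            if mask.any():
                centroids[k] = flat[mask].mean()
    return centroids, labels

weight = torch.randn(4, 4)
centroids, labels = kmeans_quantize(weight, n_bits=2)
# The fp32 tensor used at inference time, as in the formula above:
quantized_weight = centroids[labels].view_as(weight)
```

At 2 bits the reconstructed tensor contains at most 4 distinct values, which is what makes the representation compressible.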
@@ -1009,9 +1017,7 @@ <h2 class="anchored" data-anchor-id="quantized-inference">Quantized Inference</h
<p>Since <span class="math inline">\(Z_{\mathrm{weight}}=0\)</span>, we have <span class="math inline">\(r_{\mathrm{weight}} = S_{\mathrm{weight}}q_{\mathrm{weight}}\)</span>.</p>
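A minimal sketch of this symmetric case, assuming a per-tensor scale taken from the largest-magnitude weight (`linear_quantize_symmetric` is a hypothetical helper, not the lab's exact function):

```python
import torch

def linear_quantize_symmetric(weight: torch.Tensor, n_bits: int = 8):
    # Symmetric per-tensor quantization: Z_weight = 0, and the scale maps
    # the largest-magnitude weight onto the edge of the integer range.
    qmax = 2 ** (n_bits - 1) - 1              # 127 for int8
    scale = weight.abs().max().item() / qmax  # S_weight
    q = torch.round(weight / scale).clamp(-qmax - 1, qmax).to(torch.int8)
    return q, scale

weight = torch.randn(3, 3)
q_weight, s_weight = linear_quantize_symmetric(weight)
# Dequantization follows the formula above: r_weight = S_weight * q_weight
r_weight = s_weight * q_weight.float()
```

Because the zero point is 0, dequantization is a single multiply, and the per-element error is at most half the scale.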
<p>The floating-point convolution can be written as</p>
<blockquote class="blockquote">
<p><span class="math inline">\(r_{\mathrm{output}} = \mathrm{CONV}[r_{\mathrm{input}}, r_{\mathrm{weight}}] + r_{\mathrm{bias}}\\
\;\;\;\;\;\;\;\;= \mathrm{CONV}[S_{\mathrm{input}}(q_{\mathrm{input}}-Z_{\mathrm{input}}), S_{\mathrm{weight}}q_{\mathrm{weight}}] + S_{\mathrm{bias}}(q_{\mathrm{bias}}-Z_{\mathrm{bias}})\\
\;\;\;\;\;\;\;\;= \mathrm{CONV}[q_{\mathrm{input}}-Z_{\mathrm{input}}, q_{\mathrm{weight}}]\cdot (S_{\mathrm{input}} \cdot S_{\mathrm{weight}}) + S_{\mathrm{bias}}(q_{\mathrm{bias}}-Z_{\mathrm{bias}})\)</span></p>
<p><span class="math inline">\(r_{\mathrm{output}} = \mathrm{CONV}[r_{\mathrm{input}}, r_{\mathrm{weight}}] + r_{\mathrm{bias}}\)</span> <span class="math inline">\(\;\;\;\;\;\;\;\;= \mathrm{CONV}[S_{\mathrm{input}}(q_{\mathrm{input}}-Z_{\mathrm{input}}), S_{\mathrm{weight}}q_{\mathrm{weight}}] + S_{\mathrm{bias}}(q_{\mathrm{bias}}-Z_{\mathrm{bias}})\)</span> <span class="math inline">\(\;\;\;\;\;\;\;\;= \mathrm{CONV}[q_{\mathrm{input}}-Z_{\mathrm{input}}, q_{\mathrm{weight}}]\cdot (S_{\mathrm{input}} \cdot S_{\mathrm{weight}}) + S_{\mathrm{bias}}(q_{\mathrm{bias}}-Z_{\mathrm{bias}})\)</span></p>
</blockquote>
<p>To simplify the computation further, we set</p>
<blockquote class="blockquote">
@@ -1020,7 +1026,7 @@ <h2 class="anchored" data-anchor-id="quantized-inference">Quantized Inference</h
</blockquote>
<p>so that</p>
<blockquote class="blockquote">
<p><span class="math inline">\(r_{\mathrm{output}} = (\mathrm{CONV}[q_{\mathrm{input}}-Z_{\mathrm{input}}, q_{\mathrm{weight}}] + q_{\mathrm{bias}})\cdot (S_{\mathrm{input}} \cdot S_{\mathrm{weight}})\)</span> <span class="math inline">\(\;\;\;\;\;\;\;\;= (\mathrm{CONV}[q_{\mathrm{input}}, q_{\mathrm{weight}}] - \mathrm{CONV}[Z_{\mathrm{input}}, q_{\mathrm{weight}}] + q_{\mathrm{bias}})\cdot (S_{\mathrm{input}}S_{\mathrm{weight}})\)</span></p>
<p><span class="math inline">\(r_{\mathrm{output}} = (\mathrm{CONV}[q_{\mathrm{input}}-Z_{\mathrm{input}}, q_{\mathrm{weight}}] + q_{\mathrm{bias}})\cdot (S_{\mathrm{input}} \cdot S_{\mathrm{weight}})\)</span> <span class="math inline">\(\;\;\;\;\;\;\;\;\;= (\mathrm{CONV}[q_{\mathrm{input}}, q_{\mathrm{weight}}] - \mathrm{CONV}[Z_{\mathrm{input}}, q_{\mathrm{weight}}] + q_{\mathrm{bias}})\cdot (S_{\mathrm{input}}S_{\mathrm{weight}})\)</span></p>
</blockquote>
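The rewritten expression above can be checked numerically with a toy sketch, assuming the conventions used here: affine int8 inputs, symmetric weights (Z_weight = 0), and the bias convention Z_bias = 0, S_bias = S_input · S_weight. The shapes and scales below are arbitrary toy choices:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
r_input = torch.rand(1, 1, 5, 5)           # fp32 activations in (0, 1)
r_weight = torch.randn(1, 1, 3, 3) * 0.1   # fp32 weights
r_bias = torch.randn(1) * 0.1              # fp32 bias

# Affine-quantize the input to the int8 range (values kept in float tensors
# here for simplicity; a real int8 kernel would accumulate in int32)
s_input, z_input = 1.0 / 255, -128.0
q_input = torch.round(r_input / s_input + z_input)

# Symmetric weight quantization: Z_weight = 0
s_weight = r_weight.abs().max().item() / 127
q_weight = torch.round(r_weight / s_weight)

# Bias convention from the text: Z_bias = 0, S_bias = S_input * S_weight
q_bias = torch.round(r_bias / (s_input * s_weight))

# Floating-point reference vs. the rewritten quantized form
r_output_fp = F.conv2d(r_input, r_weight) + r_bias
r_output_q = (F.conv2d(q_input - z_input, q_weight) + q_bias) * (s_input * s_weight)
```

The two outputs agree up to the accumulated quantization error, which is what makes it safe to run the convolution entirely on integer tensors and rescale once at the end.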
<p>and</p>
<blockquote class="blockquote">
@@ -1525,12 +1531,14 @@ <h3 class="anchored" data-anchor-id="question-9.1-5-pts">Question 9.1 (5 pts)</h
<span id="cb35-4"><a href="#cb35-4" aria-hidden="true" tabindex="-1"></a> <span class="co"># hint: you need to convert the original fp32 input of range (0, 1)</span></span>
<span id="cb35-5"><a href="#cb35-5" aria-hidden="true" tabindex="-1"></a> <span class="co"># into int8 format of range (-128, 127)</span></span>
<span id="cb35-6"><a href="#cb35-6" aria-hidden="true" tabindex="-1"></a> <span class="co">############### YOUR CODE STARTS HERE ###############</span></span>
<span id="cb35-7"><a href="#cb35-7" aria-hidden="true" tabindex="-1"></a> <span class="cf">return</span> x.clamp(<span class="op">-</span><span class="dv">128</span>, <span class="dv">127</span>).to(torch.int8)</span>
<span id="cb35-8"><a href="#cb35-8" aria-hidden="true" tabindex="-1"></a> <span class="co">############### YOUR CODE ENDS HERE #################</span></span>
<span id="cb35-9"><a href="#cb35-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb35-10"><a href="#cb35-10" aria-hidden="true" tabindex="-1"></a>int8_model_accuracy <span class="op">=</span> evaluate(quantized_model, dataloader[<span class="st">'test'</span>],</span>
<span id="cb35-11"><a href="#cb35-11" aria-hidden="true" tabindex="-1"></a> extra_preprocess<span class="op">=</span>[extra_preprocess])</span>
<span id="cb35-12"><a href="#cb35-12" aria-hidden="true" tabindex="-1"></a><span class="bu">print</span>(<span class="ss">f"int8 model has accuracy=</span><span class="sc">{</span>int8_model_accuracy<span class="sc">:.2f}</span><span class="ss">%"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb35-7"><a href="#cb35-7" aria-hidden="true" tabindex="-1"></a> x_scaled <span class="op">=</span> x <span class="op">*</span> <span class="dv">255</span></span>
<span id="cb35-8"><a href="#cb35-8" aria-hidden="true" tabindex="-1"></a> x_shifted <span class="op">=</span> x_scaled <span class="op">-</span> <span class="dv">128</span></span>
<span id="cb35-9"><a href="#cb35-9" aria-hidden="true" tabindex="-1"></a> <span class="cf">return</span> x_shifted.clamp(<span class="op">-</span><span class="dv">128</span>, <span class="dv">127</span>).to(torch.int8)</span>
<span id="cb35-10"><a href="#cb35-10" aria-hidden="true" tabindex="-1"></a> <span class="co">############### YOUR CODE ENDS HERE #################</span></span>
<span id="cb35-11"><a href="#cb35-11" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb35-12"><a href="#cb35-12" aria-hidden="true" tabindex="-1"></a>int8_model_accuracy <span class="op">=</span> evaluate(quantized_model, dataloader[<span class="st">'test'</span>],</span>
<span id="cb35-13"><a href="#cb35-13" aria-hidden="true" tabindex="-1"></a> extra_preprocess<span class="op">=</span>[extra_preprocess])</span>
<span id="cb35-14"><a href="#cb35-14" aria-hidden="true" tabindex="-1"></a><span class="bu">print</span>(<span class="ss">f"int8 model has accuracy=</span><span class="sc">{</span>int8_model_accuracy<span class="sc">:.2f}</span><span class="ss">%"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>VGG(
(backbone): Sequential(
@@ -1608,13 +1616,9 @@ <h3 class="anchored" data-anchor-id="요약">Summary:</h3>
<li><strong>Linear quantization</strong> offers a balance of simplicity, speed, and broad hardware compatibility; while it cannot always achieve the same level of accuracy on complex or non-uniform data distributions, it is well suited to real-time processing and to devices with limited processing power.</li>
</ul>
<p>Choose between k-means-based quantization and linear quantization according to the specific requirements of your application, weighing the importance of accuracy, processing latency, and the computational resources available.</p>
</section>
</section>
<section id="feedback" class="level1">
<h1>Feedback</h1>
<p>Please fill out this <a href="https://forms.gle/ZeCH5anNPrkd5wpp7">feedback form</a> when you finished this lab. We would love to hear your thoughts or feedback on how we can improve this lab!</p>


</section>
</section>

</main> <!-- /main -->
