
Commit

update: lab02
curieuxjy committed Mar 3, 2024
1 parent 5cf7bb7 commit e846e4c
Showing 4 changed files with 42 additions and 52 deletions.
2 changes: 1 addition & 1 deletion docs/index.html
Original file line number Diff line number Diff line change
@@ -205,7 +205,7 @@ <h5 class="quarto-listing-category-title">Categories</h5><div class="quarto-list

<div class="quarto-listing quarto-listing-container-default" id="listing-listing">
<div class="list quarto-listing-default">
<div class="quarto-post image-right" data-index="0" data-categories="lab,quantization,linear,kmeans" data-listing-date-sort="1709650800000" data-listing-file-modified-sort="1709477179403" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="11" data-listing-word-count-sort="2182">
<div class="quarto-post image-right" data-index="0" data-categories="lab,quantization,linear,kmeans" data-listing-date-sort="1709650800000" data-listing-file-modified-sort="1709477947295" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="11" data-listing-word-count-sort="2158">
<div class="thumbnail">
<p><a href="./posts/labs/lab02.html" class="no-external"></a></p><a href="./posts/labs/lab02.html" class="no-external">
<p><img src="./images/lab02/kmeans.png" class="thumbnail-image"></p>
42 changes: 23 additions & 19 deletions docs/posts/labs/lab02.html
@@ -239,7 +239,6 @@ <h2 id="toc-title">On this page</h2>
<li><a href="#하드웨어-지원" id="toc-하드웨어-지원" class="nav-link" data-scroll-target="#하드웨어-지원">Hardware support:</a></li>
<li><a href="#요약" id="toc-요약" class="nav-link" data-scroll-target="#요약">Summary:</a></li>
</ul></li>
<li><a href="#feedback" id="toc-feedback" class="nav-link" data-scroll-target="#feedback">Feedback</a></li>
</ul>
</nav>
</div>
@@ -294,7 +293,7 @@ <h1 class="title">👩‍💻 Lab 2</h1>
<h1><strong>Lab 2: Quantization</strong></h1>
<section id="goals" class="level2">
<h2 class="anchored" data-anchor-id="goals">Goals</h2>
<p>In this assignment, you will practice <strong>quantizing</strong> a classic <strong>neural network model</strong> to reduce both model size and latency. The goals of this assignment are:</p>
<p>In this lab, you will practice <strong>quantizing</strong> a classic <strong>neural network model</strong> to reduce both model size and latency. The goals of this lab are:</p>
<ul>
<li>Understand the basic concept of <strong>quantization</strong>.</li>
<li>Implement and apply <strong>k-means quantization</strong>.</li>
@@ -308,7 +307,12 @@ <h2 class="anchored" data-anchor-id="goals">Goals</h2>
<section id="contents" class="level2">
<h2 class="anchored" data-anchor-id="contents">Contents</h2>
<p>The main sections consist of two parts: <strong><em>K-Means Quantization</em></strong> and <strong><em>Linear Quantization</em></strong>.</p>
<p>This lab notebook walks you through a total of <strong><em>10</em></strong> questions: - For <em>K-Means Quantization</em>, there are <strong><em>3</em></strong> questions (Questions 1-3). - For <em>Linear Quantization</em>, there are <strong><em>6</em></strong> questions (Questions 4-9). - Question 10 compares k-means quantization and linear quantization.</p>
<p>This lab notebook walks you through a total of <strong><em>10</em></strong> questions:</p>
<ul>
<li>For <em>K-Means Quantization</em>, there are <strong><em>3</em></strong> questions (Questions 1-3).</li>
<li>For <em>Linear Quantization</em>, there are <strong><em>6</em></strong> questions (Questions 4-9).</li>
<li>Question 10 compares k-means quantization and linear quantization.</li>
</ul>
<blockquote class="blockquote">
<p>The setup portion of the lab notebook can be found by opening the Colaboratory Note. It is omitted from this post so you can focus on the lab content itself.</p>
</blockquote>
@@ -346,7 +350,11 @@ <h1>K-Means Quantization</h1>
<p><strong><em>quantized_weight</em> = <em>codebook.centroids</em>[<em>codebook.labels</em>].view_as(weight)</strong></p>
</blockquote>
<p><span class="math inline">\(n\)</span>-bit k-means <strong>quantization</strong> partitions the synapses into <span class="math inline">\(2^n\)</span> clusters, and synapses within the same cluster share the same weight value.</p>
<p>Thus, k-means <strong>quantization</strong> produces the following codebook: * <code>centroids</code>: <span class="math inline">\(2^n\)</span> fp32 cluster centers. * <code>labels</code>: an <span class="math inline">\(n\)</span>-bit integer tensor with the same #elements as the original fp32 weight tensor. Each integer indicates which cluster the weight belongs to.</p>
<p>Thus, k-means <strong>quantization</strong> produces the following codebook:</p>
<ul>
<li><code>centroids</code>: <span class="math inline">\(2^n\)</span> fp32 cluster centers.</li>
<li><code>labels</code>: an <span class="math inline">\(n\)</span>-bit integer tensor with the same #elements as the original fp32 weight tensor. Each integer indicates which cluster the weight belongs to.</li>
</ul>
<p>During inference, an fp32 tensor is generated from the codebook:</p>
<blockquote class="blockquote">
<p><strong><em>quantized_weight</em> = <em>codebook.centroids</em>[<em>codebook.labels</em>].view_as(weight)</strong></p>
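The codebook construction and centroid lookup described above can be illustrated with a minimal sketch (a plain PyTorch k-means loop, not the lab's actual implementation; `kmeans_quantize`, `n_bits`, and the iteration count are illustrative assumptions):

```python
import torch

def kmeans_quantize(weight: torch.Tensor, n_bits: int = 2, n_iters: int = 20):
    # Hypothetical helper, not the lab's implementation: cluster the weights
    # into 2^n centroids and record each weight's cluster label.
    flat = weight.flatten()
    n_clusters = 2 ** n_bits
    # Initialize centroids evenly over the weight range
    centroids = torch.linspace(flat.min().item(), flat.max().item(), n_clusters)
    for _ in range(n_iters):
        # Assign every weight to its nearest centroid
        labels = (flat[:, None] - centroids[None, :]).abs().argmin(dim=1)
        # Move each centroid to the mean of its assigned weights
        for k in range(n_clusters):
            mask = labels == k
            if mask.any():
                centroids[k] = flat[mask].mean()
    return centroids, labels

weight = torch.randn(4, 4)
centroids, labels = kmeans_quantize(weight, n_bits=2)
# The fp32 tensor used at inference time, as in the formula above:
quantized_weight = centroids[labels].view_as(weight)
```

At 2 bits the reconstructed tensor contains at most 4 distinct values, which is what makes the representation compressible.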
@@ -1009,9 +1017,7 @@ <h2 class="anchored" data-anchor-id="quantized-inference">Quantized Inference</h
<p>Since <span class="math inline">\(Z_{\mathrm{weight}}=0\)</span>, we have <span class="math inline">\(r_{\mathrm{weight}} = S_{\mathrm{weight}}q_{\mathrm{weight}}\)</span>.</p>
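A minimal sketch of this symmetric case, assuming a per-tensor scale taken from the largest-magnitude weight (`linear_quantize_symmetric` is a hypothetical helper, not the lab's exact function):

```python
import torch

def linear_quantize_symmetric(weight: torch.Tensor, n_bits: int = 8):
    # Symmetric per-tensor quantization: Z_weight = 0, and the scale maps
    # the largest-magnitude weight onto the edge of the integer range.
    qmax = 2 ** (n_bits - 1) - 1              # 127 for int8
    scale = weight.abs().max().item() / qmax  # S_weight
    q = torch.round(weight / scale).clamp(-qmax - 1, qmax).to(torch.int8)
    return q, scale

weight = torch.randn(3, 3)
q_weight, s_weight = linear_quantize_symmetric(weight)
# Dequantization follows the formula above: r_weight = S_weight * q_weight
r_weight = s_weight * q_weight.float()
```

Because the zero point is 0, dequantization is a single multiply, and the per-element error is at most half the scale.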
<p>The floating-point convolution can be written as</p>
<blockquote class="blockquote">
<p><span class="math inline">\(r_{\mathrm{output}} = \mathrm{CONV}[r_{\mathrm{input}}, r_{\mathrm{weight}}] + r_{\mathrm{bias}}\\
\;\;\;\;\;\;\;\;= \mathrm{CONV}[S_{\mathrm{input}}(q_{\mathrm{input}}-Z_{\mathrm{input}}), S_{\mathrm{weight}}q_{\mathrm{weight}}] + S_{\mathrm{bias}}(q_{\mathrm{bias}}-Z_{\mathrm{bias}})\\
\;\;\;\;\;\;\;\;= \mathrm{CONV}[q_{\mathrm{input}}-Z_{\mathrm{input}}, q_{\mathrm{weight}}]\cdot (S_{\mathrm{input}} \cdot S_{\mathrm{weight}}) + S_{\mathrm{bias}}(q_{\mathrm{bias}}-Z_{\mathrm{bias}})\)</span></p>
<p><span class="math inline">\(r_{\mathrm{output}} = \mathrm{CONV}[r_{\mathrm{input}}, r_{\mathrm{weight}}] + r_{\mathrm{bias}}\)</span> <span class="math inline">\(\;\;\;\;\;\;\;\;= \mathrm{CONV}[S_{\mathrm{input}}(q_{\mathrm{input}}-Z_{\mathrm{input}}), S_{\mathrm{weight}}q_{\mathrm{weight}}] + S_{\mathrm{bias}}(q_{\mathrm{bias}}-Z_{\mathrm{bias}})\)</span> <span class="math inline">\(\;\;\;\;\;\;\;\;= \mathrm{CONV}[q_{\mathrm{input}}-Z_{\mathrm{input}}, q_{\mathrm{weight}}]\cdot (S_{\mathrm{input}} \cdot S_{\mathrm{weight}}) + S_{\mathrm{bias}}(q_{\mathrm{bias}}-Z_{\mathrm{bias}})\)</span></p>
</blockquote>
<p>To simplify the computation further, we set</p>
<blockquote class="blockquote">
@@ -1020,7 +1026,7 @@ <h2 class="anchored" data-anchor-id="quantized-inference">Quantized Inference</h
</blockquote>
<p>so that</p>
<blockquote class="blockquote">
<p><span class="math inline">\(r_{\mathrm{output}} = (\mathrm{CONV}[q_{\mathrm{input}}-Z_{\mathrm{input}}, q_{\mathrm{weight}}] + q_{\mathrm{bias}})\cdot (S_{\mathrm{input}} \cdot S_{\mathrm{weight}})\)</span> <span class="math inline">\(\;\;\;\;\;\;\;\;= (\mathrm{CONV}[q_{\mathrm{input}}, q_{\mathrm{weight}}] - \mathrm{CONV}[Z_{\mathrm{input}}, q_{\mathrm{weight}}] + q_{\mathrm{bias}})\cdot (S_{\mathrm{input}}S_{\mathrm{weight}})\)</span></p>
<p><span class="math inline">\(r_{\mathrm{output}} = (\mathrm{CONV}[q_{\mathrm{input}}-Z_{\mathrm{input}}, q_{\mathrm{weight}}] + q_{\mathrm{bias}})\cdot (S_{\mathrm{input}} \cdot S_{\mathrm{weight}})\)</span> <span class="math inline">\(\;\;\;\;\;\;\;\;\;= (\mathrm{CONV}[q_{\mathrm{input}}, q_{\mathrm{weight}}] - \mathrm{CONV}[Z_{\mathrm{input}}, q_{\mathrm{weight}}] + q_{\mathrm{bias}})\cdot (S_{\mathrm{input}}S_{\mathrm{weight}})\)</span></p>
</blockquote>
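The rewritten expression above can be checked numerically with a toy sketch, assuming the conventions used here: affine int8 inputs, symmetric weights (Z_weight = 0), and the bias convention Z_bias = 0, S_bias = S_input · S_weight. The shapes and scales below are arbitrary toy choices:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
r_input = torch.rand(1, 1, 5, 5)           # fp32 activations in (0, 1)
r_weight = torch.randn(1, 1, 3, 3) * 0.1   # fp32 weights
r_bias = torch.randn(1) * 0.1              # fp32 bias

# Affine-quantize the input to the int8 range (values kept in float tensors
# here for simplicity; a real int8 kernel would accumulate in int32)
s_input, z_input = 1.0 / 255, -128.0
q_input = torch.round(r_input / s_input + z_input)

# Symmetric weight quantization: Z_weight = 0
s_weight = r_weight.abs().max().item() / 127
q_weight = torch.round(r_weight / s_weight)

# Bias convention from the text: Z_bias = 0, S_bias = S_input * S_weight
q_bias = torch.round(r_bias / (s_input * s_weight))

# Floating-point reference vs. the rewritten quantized form
r_output_fp = F.conv2d(r_input, r_weight) + r_bias
r_output_q = (F.conv2d(q_input - z_input, q_weight) + q_bias) * (s_input * s_weight)
```

The two outputs agree up to the accumulated quantization error, which is what makes it safe to run the convolution entirely on integer tensors and rescale once at the end.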
<p>and</p>
<blockquote class="blockquote">
@@ -1525,12 +1531,14 @@ <h3 class="anchored" data-anchor-id="question-9.1-5-pts">Question 9.1 (5 pts)</h
<span id="cb35-4"><a href="#cb35-4" aria-hidden="true" tabindex="-1"></a> <span class="co"># hint: you need to convert the original fp32 input of range (0, 1)</span></span>
<span id="cb35-5"><a href="#cb35-5" aria-hidden="true" tabindex="-1"></a> <span class="co"># into int8 format of range (-128, 127)</span></span>
<span id="cb35-6"><a href="#cb35-6" aria-hidden="true" tabindex="-1"></a> <span class="co">############### YOUR CODE STARTS HERE ###############</span></span>
<span id="cb35-7"><a href="#cb35-7" aria-hidden="true" tabindex="-1"></a> <span class="cf">return</span> x.clamp(<span class="op">-</span><span class="dv">128</span>, <span class="dv">127</span>).to(torch.int8)</span>
<span id="cb35-8"><a href="#cb35-8" aria-hidden="true" tabindex="-1"></a> <span class="co">############### YOUR CODE ENDS HERE #################</span></span>
<span id="cb35-9"><a href="#cb35-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb35-10"><a href="#cb35-10" aria-hidden="true" tabindex="-1"></a>int8_model_accuracy <span class="op">=</span> evaluate(quantized_model, dataloader[<span class="st">'test'</span>],</span>
<span id="cb35-11"><a href="#cb35-11" aria-hidden="true" tabindex="-1"></a> extra_preprocess<span class="op">=</span>[extra_preprocess])</span>
<span id="cb35-12"><a href="#cb35-12" aria-hidden="true" tabindex="-1"></a><span class="bu">print</span>(<span class="ss">f"int8 model has accuracy=</span><span class="sc">{</span>int8_model_accuracy<span class="sc">:.2f}</span><span class="ss">%"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb35-7"><a href="#cb35-7" aria-hidden="true" tabindex="-1"></a> x_scaled <span class="op">=</span> x <span class="op">*</span> <span class="dv">255</span></span>
<span id="cb35-8"><a href="#cb35-8" aria-hidden="true" tabindex="-1"></a> x_shifted <span class="op">=</span> x_scaled <span class="op">-</span> <span class="dv">128</span></span>
<span id="cb35-9"><a href="#cb35-9" aria-hidden="true" tabindex="-1"></a> <span class="cf">return</span> x_shifted.clamp(<span class="op">-</span><span class="dv">128</span>, <span class="dv">127</span>).to(torch.int8)</span>
<span id="cb35-10"><a href="#cb35-10" aria-hidden="true" tabindex="-1"></a> <span class="co">############### YOUR CODE ENDS HERE #################</span></span>
<span id="cb35-11"><a href="#cb35-11" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb35-12"><a href="#cb35-12" aria-hidden="true" tabindex="-1"></a>int8_model_accuracy <span class="op">=</span> evaluate(quantized_model, dataloader[<span class="st">'test'</span>],</span>
<span id="cb35-13"><a href="#cb35-13" aria-hidden="true" tabindex="-1"></a> extra_preprocess<span class="op">=</span>[extra_preprocess])</span>
<span id="cb35-14"><a href="#cb35-14" aria-hidden="true" tabindex="-1"></a><span class="bu">print</span>(<span class="ss">f"int8 model has accuracy=</span><span class="sc">{</span>int8_model_accuracy<span class="sc">:.2f}</span><span class="ss">%"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>VGG(
(backbone): Sequential(
@@ -1608,13 +1616,9 @@ <h3 class="anchored" data-anchor-id="요약">Summary:</h3>
<li><strong>Linear quantization</strong> offers a balance of simplicity, speed, and broad hardware compatibility; while it cannot always achieve the same level of accuracy on complex or non-uniform data distributions, it is well suited to real-time processing and to devices with limited processing power.</li>
</ul>
<p>Choose between k-means-based quantization and linear quantization according to the specific requirements of your application, weighing the importance of accuracy, processing latency, and the computational resources available.</p>
</section>
</section>
<section id="feedback" class="level1">
<h1>Feedback</h1>
<p>Please fill out this <a href="https://forms.gle/ZeCH5anNPrkd5wpp7">feedback form</a> when you finished this lab. We would love to hear your thoughts or feedback on how we can improve this lab!</p>


</section>
</section>

</main> <!-- /main -->
