diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/README.md b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/README.md
new file mode 100644
index 00000000..2dee315a
--- /dev/null
+++ b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/README.md
@@ -0,0 +1,69 @@
+# About
+
+Listed here is a collection of cheatsheets by topic. These cheatsheets do not
+explain the topics in depth, but rather serve as quick lookup documents.
+Therefore, the course material provided by the lecturer should still be studied
+and understood. Not everything that is tested at the midterms or final exams is
+covered, and the Author does not guarantee that the cheatsheets are free of
+errors.
+
+* [Time and Space Complexity](./cheatsheet_time_space_complexity.pdf)
+* [Asymptotic Analysis](./cheatsheet_asymptotic_analysis.pdf)
+* [Time Complexity of Recursive Algorithms](./cheatsheet_time_complexity_recursive_algorithms.pdf)
+* [Comparison and Non-Comparison Sorting Algorithms](./cheatsheet_sorting_algorithms.pdf)
+* [Hash Tables](./cheatsheet_hash_tables.pdf)
+
+**NOTE**: These cheatsheets only cover the course material **up to the midterms**.
+The weeks after the midterms are not covered here.
+
+# Building
+
+_NOTE_: This step is only necessary if you choose to modify the base documents.
+
+The base documents are written in [AsciiDoc](https://asciidoc.org/) and can be
+found in the `src/` directory.
+
+The following dependencies must be installed (Ubuntu):
+
+```console
+$ apt install -y ruby-dev wkhtmltopdf
+$ gem install asciidoctor
+$ chmod +x build.sh
+```
+
+To build the documents (PDF version):
+
+```console
+$ ./build.sh pdf
+```
+
+Optionally, for the HTML version:
+
+```console
+$ ./build.sh html
+```
+
+and for the PNG version:
+
+```console
+$ ./build.sh png
+```
+
+The generated output can be deleted with `./build.sh clean`.
+
+# Disclaimer
+
+The Presented Documents ("cheatsheets") by the Author ("Fabio Lama") are
+summaries of specific topics. The term "cheatsheet" implies that the Presented
+Documents are intended to be used as learning aids or as references for
+practicing, and does not imply that the Presented Documents should be used for
+inappropriate practices during exams such as cheating or other offenses.
+
+The Presented Documents are heavily based on the learning material provided by
+the University of London, specifically the VLeBooks Collection database in the
+Online Library and the material provided on the Coursera platform.
+
+The Presented Documents may incorporate direct or indirect definitions,
+examples, descriptions, graphs, sentences and/or other content used in those
+provided materials. **At no point does the Author present the work or ideas
+incorporated in the Presented Documents as their own.**
diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/build.sh b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/build.sh
new file mode 100755
index 00000000..d3987601
--- /dev/null
+++ b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/build.sh
@@ -0,0 +1,77 @@
+#!/bin/bash
+
+# Because `make` sucks.
+
+gen_html() {
+    # Strip the .adoc suffix and add the cheatsheet_ prefix.
+    FILE=$1
+    OUT=${FILE%.adoc}
+    HTML_OUT="cheatsheet_${OUT}.html"
+
+    asciidoctor $FILE -o ${HTML_OUT}
+}
+
+# Change directory to src/ in order to have images included correctly.
+cd "$(dirname "$0")/src/"
+
+case $1 in
+    html)
+        for FILE in *.adoc
+        do
+            # Generate HTML file.
+ gen_html ${FILE} + done + + # Move up from src/ + mv *.html ../ + ;; + pdf) + for FILE in *.adoc + do + # Generate HTML file. + gen_html ${FILE} + + # Convert HTML to PDF. + PDF_OUT="cheatsheet_${OUT}.pdf" + wkhtmltopdf \ + --enable-local-file-access \ + --javascript-delay 2000\ + $HTML_OUT $PDF_OUT + done + + # Move up from src/ + mv *.pdf ../ + + # Cleanup temporarily generated HTML files. + rm *.html > /dev/null 2>&1 + ;; + png | img) + for FILE in *.adoc + do + # Generate HTML file. + gen_html ${FILE} + + # Convert HTML to PNG. + IMG_OUT="cheatsheet_${OUT}.png" + wkhtmltopdf \ + --enable-local-file-access \ + --javascript-delay 2000\ + $HTML_OUT $IMG_OUT + done + + # Move up from src/ + mv *.png ../ + + # Cleanup temporarily generated HTML files. + rm *.html > /dev/null 2>&1 + ;; + clean) + rm *.html > /dev/null 2>&1 + rm *.png > /dev/null 2>&1 + rm ../*.html > /dev/null 2>&1 + rm ../*.png > /dev/null 2>&1 + ;; + *) + echo "Unrecognized command" + ;; +esac \ No newline at end of file diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/cheatsheet_asymptotic_analysis.pdf b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/cheatsheet_asymptotic_analysis.pdf new file mode 100644 index 00000000..726ea365 Binary files /dev/null and b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/cheatsheet_asymptotic_analysis.pdf differ diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/cheatsheet_hash_tables.pdf b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/cheatsheet_hash_tables.pdf new file mode 100644 index 00000000..03778a7b Binary files /dev/null and b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/cheatsheet_hash_tables.pdf differ diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/cheatsheet_sorting_algorithms.pdf b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/cheatsheet_sorting_algorithms.pdf new file mode 100644 index 00000000..f2b30804 Binary files /dev/null and b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/cheatsheet_sorting_algorithms.pdf differ diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/cheatsheet_time_complexity_recursive_algorithms.pdf b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/cheatsheet_time_complexity_recursive_algorithms.pdf new file mode 100644 index 00000000..73bb6139 Binary files /dev/null and b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/cheatsheet_time_complexity_recursive_algorithms.pdf differ diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/cheatsheet_time_space_complexity.pdf b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/cheatsheet_time_space_complexity.pdf new file mode 100644 index 00000000..414477f4 Binary files /dev/null and b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/cheatsheet_time_space_complexity.pdf differ diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/assets/big_o_notation.png b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/assets/big_o_notation.png new file mode 100644 index 00000000..ab594c5b Binary files /dev/null and b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/assets/big_o_notation.png differ diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/assets/growth_of_function.png 
b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/assets/growth_of_function.png
new file mode 100644
index 00000000..30be6b2d
Binary files /dev/null and b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/assets/growth_of_function.png differ
diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/assets/omega_notation.png b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/assets/omega_notation.png
new file mode 100644
index 00000000..7d650b44
Binary files /dev/null and b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/assets/omega_notation.png differ
diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/assets/theta_notation.png b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/assets/theta_notation.png
new file mode 100644
index 00000000..02cdab20
Binary files /dev/null and b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/assets/theta_notation.png differ
diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/asymptotic_analysis.adoc b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/asymptotic_analysis.adoc
new file mode 100644
index 00000000..d7bc86fd
--- /dev/null
+++ b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/asymptotic_analysis.adoc
@@ -0,0 +1,129 @@
+= Cheatsheet - Asymptotic Analysis
+Fabio Lama
+:description: Module: CM2035 Algorithms and Data Structures II, started April 2024
+:doctype: article
+:sectnums: 4
+:toclevels: 4
+:stem:
+
+== About
+
+Asymptotic analysis is an alternative to counting exact time units: it describes
+how the time or memory requirements of an algorithm grow as the input size
+increases.
+
+== Big O Notation
+
+Big O notation stem:[O(x)] defines a set of functions that act as an **upper bound**
+stem:[g(N)] for stem:[T(N)]. Formally defined as:
+
+stem:[T(N)] is stem:[O(g(N))] if there exist positive
+constants stem:[c] and stem:[n_0] such that:
+
+[stem]
+++++
+T(N) <= c xx g(N) " for all " N > n_0
+++++
+
+Note that there can be **multiple functions** stem:[g_x(N)] that act as **an upper bound**
+for stem:[T(N)]. Additionally, do notice that it's **not necessary** that
+stem:[c xx g(N)] is equal to or greater than stem:[T(N)] for all values of
+stem:[N], but only for values larger than stem:[n_0].
+
+image::assets/big_o_notation.png[align=center, width=300]
+
+For example, consider:
+
+[stem]
+++++
+T(N) = 10 N^2 + 15N + 5\
+g(N) = N^2\
+c = 1
+++++
+
+Here, stem:[c xx g(N)] is never greater than stem:[T(N)], because there is no
+solution for:
+
+[stem]
+++++
+10 N^2 + 15N + 5 <= 1 xx N^2
+++++
+
+However, consider:
+
+[stem]
+++++
+c = 25
+++++
+
+In the case of stem:[N = 1] we get:
+
+[stem]
+++++
+10 xx 1^2 + 15 xx 1 + 5 <= 25 xx 1^2\
+= 10 + 15 + 5 <= 25\
+= 30 <= 25
+++++
+
+Which is false. However, for stem:[N = 2] we get:
+
+[stem]
+++++
+10 xx 2^2 + 15 xx 2 + 5 <= 25 xx 2^2\
+= 40 + 30 + 5 <= 100\
+= 75 <= 100
+++++
+
+Which is true. Therefore:
+
+[stem]
+++++
+T(N) " is " O(N^2) " because"\
+T(N) <= 25 xx g(N) " for all " N >= 2
+++++
+
+The choice for stem:[c] **is arbitrary**, as long as it satisfies the conditions.
+
+== Omega Notation
+
+The Omega notation stem:[Omega(x)] defines a set of functions that act as a
+**lower bound** stem:[g(N)] for stem:[T(N)]. Formally defined as:
+
+stem:[T(N)] is stem:[Omega(g(N))] if there exist positive constants stem:[c] and
+stem:[n_0] such that:
+
+[stem]
+++++
+T(N) >= c xx g(N) " for all " N > n_0
+++++
+
+Similarly to the Big O notation, there can be **multiple functions** stem:[g_x(N)]
+that act as **a lower bound** for stem:[T(N)], and it's **not necessary** that
+stem:[c xx g(N)] is equal to or less than stem:[T(N)] for all values of
+stem:[N], but only for the larger values.
+
+image::assets/omega_notation.png[align=center, width=300]
+
+== Theta Notation
+
+The Theta notation stem:[Theta(x)] defines a **single function** that acts as
+both an **upper and lower bound** for stem:[T(N)]. Formally defined as:
+
+stem:[T(N)] is stem:[Theta(g(N))] if there exist positive constants stem:[c_1],
+stem:[c_2] and stem:[n_0] such that both of these conditions hold true:
+
+[stem]
+++++
+T(N) >= c_1 xx g(N) " for all " N > n_0\
+T(N) <= c_2 xx g(N) " for all " N > n_0
+++++
+
+Alternatively:
+
+[stem]
+++++
+c_1 xx g(N) <= T(N) <= c_2 xx g(N) " for all " N > n_0
+++++
+
+image::assets/theta_notation.png[align=center, width=300]
+
+As already noted, in Theta notation the same function stem:[g(N)] serves as both
+the upper and the lower bound.
diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/hash_tables.adoc b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/hash_tables.adoc
new file mode 100644
index 00000000..28ea0d06
--- /dev/null
+++ b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/hash_tables.adoc
@@ -0,0 +1,91 @@
+= Cheatsheet - Hash Tables
+Fabio Lama
+:description: Module: CM2035 Algorithms and Data Structures II, started April 2024
+:doctype: article
+:sectnums: 4
+:toclevels: 4
+:stem:
+
+== About
+
+A hash table is a data structure that maps keys to values using a hash function. This function transforms the input (key) into a fixed-size integer, which serves as an index in an array, enabling fast data retrieval. Hash tables are widely used due to their average stem:[Theta(1)] time complexity for both insertion and lookup operations.
+
+== Search Algorithms Complexity
+
+WARNING: This table assumes no collisions in the hash table.
+
+|===
+|Name |Worst case (time) |Best case (time) |Worst case (space) |Best case (space)
+
+|Linear Search
+|stem:[Theta(N)]
+|stem:[Theta(1)]
+|stem:[Theta(1)]
+|stem:[Theta(1)]
+
+|Binary Search (iterative)
+|stem:[Theta(log N)]
+|stem:[Theta(1)]
+|stem:[Theta(1)]
+|stem:[Theta(1)]
+
+|Binary Search (recursive)
+|stem:[Theta(log N)]
+|stem:[Theta(1)]
+|stem:[Theta(log N)]
+|stem:[Theta(1)]
+
+|Direct Addressing
+|stem:[Theta(1)]
+|stem:[Theta(1)]
+|stem:[Theta(k)]
+|stem:[Theta(k)]
+
+|Hash Table
+|stem:[Theta(1)]
+|stem:[Theta(1)]
+|stem:[Theta(M)]
+|stem:[Theta(M)]
+|===
+
+Where stem:[k] is the maximum possible key value and stem:[M] is the size of the hash table.
+
+== Hash Tables
+
+Hash tables use a hash function to map a value to an index of an array. Consider
+an array of size 7 and the following simple hash function:
+
+[stem]
+++++
+h(x) = x mod 7
+++++
+
+For example, let's say stem:[x = 11] and we compute:
+
+[stem]
+++++
+h(11) = 11 mod 7 = 4
+++++
+
+This means the number 11 is stored in the array at index 4. Knowing the index,
+the search for that number is very fast. However, given that the size of this
+array is very small, **the number of collisions can be very high**, depending on
+the number of inputs.
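+
+A minimal Python sketch of this scheme is shown below (illustrative only, not
+part of the course material); it ignores collision handling, so the colliding
+values discussed next would simply overwrite each other:
+
+[source,python]
+----
+# Toy fixed-size hash table using h(x) = x mod 7.
+TABLE_SIZE = 7
+table = [None] * TABLE_SIZE
+
+def h(x):
+    return x % TABLE_SIZE
+
+def insert(x):
+    table[h(x)] = x          # a colliding value would overwrite this slot
+
+def contains(x):
+    return table[h(x)] == x  # single array access, constant time
+
+insert(11)
+print(h(11), contains(11))   # 4 True -> 11 is stored at index 4
+----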
+ +For example, both 11 and 18 would be stored at index 4: + +[stem] +++++ +h(11) = 11 mod 7 = 4\ +h(18) = 18 mod 7 = 4 +++++ + +If necessary, the hash table can be **extended** with a larger array, but this +requires rehashing all the existing elements. Alternatively, the method of +**linear probing** can be applied to use the next available index in case of a +collision (note that this information must be stored somewhere). Or, each index +can be a pointer to a separate, nested table, which is a **separate chaining** method. + +The best case and worst case complexity for hash tables must consider those +collision handling methods as well, as in how it behaves with no collisions at +all and with all elements colliding, respectively. diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/sorting_algorithms.adoc b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/sorting_algorithms.adoc new file mode 100644 index 00000000..74e3f27a --- /dev/null +++ b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/sorting_algorithms.adoc @@ -0,0 +1,218 @@ += Cheatsheet - Comparison and Non-Comparison Sorting Algorithms +Fabio Lama +:description: Module: CM2035 Algorithms and Data Structures II, started April 2024 +:doctype: article +:sectnums: 4 +:toclevels: 4 +:stem: + +## About + +This cheatsheet provides an overview of some common sorting algorithms. + +## Comparison Sort Overview + +|=== +|Name |Worst case complexity |Best case complexity + +|Bubble +|stem:[Theta(N^2)] +|stem:[Theta(N)] + +|Insertion +|stem:[Theta(N^2)] +|stem:[Theta(N)] + +|Selection +|stem:[Theta(N^2)] +|stem:[Theta(N^2)] + +|Quicksort +|stem:[Theta(N^2)] +|stem:[Theta(N xx log N)] + +|Mergesort +|stem:[Theta(N xx log N)] +|stem:[Theta(N xx log N)] + +|Radix Sort +|stem:[Theta(d xx N)] +|stem:[Theta(d xx N)] + +|Bucket Sort +|stem:[Theta(N^2)] +|stem:[Theta(N + N/b + b)] +|=== + +Note that the stem:[d] in Radix Sort is the number of digits and stem:[b] in +Bucket Sort is the number of buckets. Because comparison sorts must compare +pairs of elements, **they cannot** run faster than stem:[N xx log N]. + +## Note: Sorting Algorithm Visualizer + +It's recommended to lookup a sorting algorithm visualizer online, such as: +https://www.toptal.com/developers/sorting-algorithms + +## Bubble Sort + +. stem:[bb "function " "BubbleSort"(A, N)] +. stem:[" " tt "swapped" = true] +. stem:[" " bb "while " (tt "swapped") bb " do"] +. stem:[" " " " tt "swapped" = false] +. stem:[" " " " bb "for " 0 <= i < N-1 bb " do"] +. stem:[" " " " " " bb "if " (A\[i\] > A\[i + 1\]) bb " then"] +. stem:[" " " " " " " " "swap"(A\[i\], A\[i+1\])] +. stem:[" " " " " " " " tt "swapped" = true] +. stem:[" " " " " " bb "end if"] +. stem:[" " " " bb "end for"] +. stem:[" " " " N = N - 1] +. stem:[" " bb "end while"] +. stem:[" " bb "return " A] +. stem:[bb "end function"] + +### Time Complexity + +The **best case** for bubble sort is: + +[stem] +++++ +T(N) = C_0 xx N + C_1 +++++ + +Additionally: + +* stem:[T(N)] is stem:[O(N)], stem:[O(N^2)] and stem:[O(N^3)], etc. +* stem:[T(N)] is stem:[Omega(N)], stem:[Omega(log N)] and stem:[Omega(1)], etc. +* stem:[T(N)] is stem:[Theta(N)] + +The **worst case** for bubble sort is: + +[stem] +++++ +T(N) = C_0 xx N^2 + C_1 xx N + C_2. +++++ + +Additionally: + +* stem:[T(N)] is stem:[O(N^2)] and stem:[O(N^3)], etc. +* stem:[T(N)] is stem:[Omega(N^2)], stem:[Omega(log N)] and stem:[Omega(1)], etc. +* stem:[T(N)] is stem:[Theta(N^2)] + +## Insertion Sort + +. 
stem:[bb "function " "InsertionSort"(A, N)] +. stem:[" " bb "for " 1 <= j <= N-1 bb " do"] +. stem:[" " " " tt "ins" = A\[j\]] +. stem:[" " " " i = j-1] +. stem:[" " " " bb "while " (i >= 0 " and " tt"ins" < A\[i\]) bb " do"] +. stem:[" " " " " " A\[i+1\] = A\[i\]] +. stem:[" " " " " " i = i-1] +. stem:[" " " " bb "end while"] +. stem:[" " " " A\[i+1\] = tt "ins"] +. stem:[" " bb "end for"] +. stem:[bb "end function"] + +## Selection Sort + +. stem:[bb "function " "SelectionSort"(A, N)] +. stem:[" " bb "for " 0 <= i < N-1 bb " do"] +. stem:[" " " " tt "min" = "pos_min"(A, i, N-1)] +. stem:[" " " " tt "swap"(A\[i\], A\[tt "min"\])] +. stem:[" " bb "end for"] +. stem:[bb "end function"] + +The function stem:["pos_min"(A, a, b)] returns the position of the minimum value +between positions stem:[a] and stem:[b] (both inclusive) in array stem:[A]. + +## Quicksort + +. stem:[bb "function " "Quicksort"(A, tt "low", tt "high")] +. stem:[" " bb "if " tt "low " < tt " high" bb " then"] +. stem:[" " " " tt "p" = "partition"(A, tt "low", tt "high")] +. stem:[" " " " "Quicksort"(A, tt "low", tt "p-1")] +. stem:[" " " " "Quicksort"(A, tt "p+1", tt "high")] +. stem:[" " bb "end if"] +. stem:[bb "end function"] + +. stem:[bb "function " "partition"(A, tt "low", tt "high")] +. stem:[" " tt "pivot" = A\[tt "high"\]] +. stem:[" " tt "i" = tt "low" - 1] +. stem:[" " bb "for " tt "j" = tt "low" bb " to " tt "high" - 1 bb " do"] +. stem:[" " " " bb "if " A\[tt "j"\] < tt "pivot" bb " then"] +. stem:[" " " " tt "i" = tt "i" + 1] +. stem:[" " " " tt "swap"(A\[tt "i"\], A\[tt "j"\])] +. stem:[" " bb "end for"] +. stem:[" " tt "swap"(A\[tt "i" + 1\], A\[tt "high"\])] +. stem:[" " bb "return " tt "i" + 1] +. stem:[bb "end function"] + +### Explanation + +The function stem:[bb "partition"(A, tt "low", tt "high")] selects a pivot element (usually the last element in the current segment of the array). It then rearranges the elements in the array such that all elements less than the pivot are moved to the left of the pivot and all elements greater than or equal to the pivot are moved to the right. The pivot is then placed in its correct position, and the index of the pivot is returned. + +## Mergesort + +. stem:[bb "function " "MergeSort"(A, tt "low", tt "high")] +. stem:[" " bb "if " (tt "low" < tt "high")] +. stem:[" " " " tt "mid" = tt "low" + "floor"((tt "high" -1) -: 2)] +. stem:[" " " " "MergeSort"(A, tt "low", tt "mid")] +. stem:[" " " " "MergeSort"(A, tt "mid" + tt "1ow", tt "high")] +. stem:[" " " " "Merge"(A, tt "low", tt "mid", tt "high")] +. stem:[" " bb "end if"] +. stem:[bb "end function"] + +The function stem:["Merge"] creates two arrays of both halves (left and right) +and then merges them to produce a single, sorted array. + +## Radix Sort + +. stem:[bb "function " "RadixSort"(A, N)] +. stem:[" " tt "max" = "findMax"(A, N)] +. stem:[" " tt "exp" = 1] +. stem:[" " bb "while " tt "max" -: tt "exp" > 0 bb " do"] +. stem:[" " " " "CountSort"(A, N, tt "exp")] +. stem:[" " " " tt "exp" = tt "exp" xx 10] +. stem:[" " bb "end while"] +. stem:[bb "end function"] + +. stem:[bb "function " "CountSort"(A, N, exp)] +. stem:[" " tt "output" = "new array of size " N] +. stem:[" " tt "count" = "new array of size " 10 " initialized to 0"] +. stem:[" " bb "for " 0 <= i < N bb " do"] +. stem:[" " " " tt "index" = (A\[i\] -: tt "exp") % 10] +. stem:[" " " " tt "count"\[tt "index"\] = tt "count"\[tt "index"\] + 1] +. stem:[" " bb "end for"] +. stem:[" " bb "for " 1 <= i < 10 bb " do"] +. 
stem:[" " " " tt "count"\[i\] = tt "count"\[i\] + tt "count"\[i - 1\]] +. stem:[" " bb "end for"] +. stem:[" " tt "i" = N - 1] +. stem:[" " bb "while " i >= 0 bb " do"] +. stem:[" " " " tt "index" = (A\[i\] -: tt "exp") % 10] +. stem:[" " " " tt "output"\[tt "count"\[tt "index"\] - 1\] = A\[i\]] +. stem:[" " " " tt "count"\[tt "index"\] = tt "count"\[tt "index"\] - 1] +. stem:[" " " " i = i - 1] +. stem:[" " bb "end while"] +. stem:[" " bb "for " 0 <= i < N bb " do"] +. stem:[" " " " A\[i\] = tt "output"\[i\]] +. stem:[" " bb "end for"] +. stem:[bb "end function"] + +## Bucket Sort + +. stem:[bb "function " "BucketSort"(A, N, tt "max")] +. stem:[" " tt "buckets" = "new array of empty lists of size " N] +. stem:[" " bb "for " i = 0 bb " to " N - 1 bb " do"] +. stem:[" " " " tt "index" = "floor"(A\[i\] -: (tt "max" + 1) xx N)] +. stem:[" " " " "append" (tt "buckets"\[tt "index"\], A\[i\])] +. stem:[" " bb "end for"] +. stem:[" " bb "for " i = 0 bb " to " N - 1 bb " do"] +. stem:[" " " " "InsertionSort"(tt "buckets"\[i\])] +. stem:[" " bb "end for"] +. stem:[" " tt "k" = 0] +. stem:[" " bb "for " i = 0 bb " to " N - 1 bb " do"] +. stem:[" " " " bb "for " j = 0 bb " to " "len"(tt "buckets"\[i\]) - 1 bb " do"] +. stem:[" " " " " " A\[tt "k"\] = tt "buckets"\[i\]\[j\]] +. stem:[" " " " " " tt "k" = tt "k" + 1] +. stem:[" " " " bb "end for"] +. stem:[" " bb "end for"] +. stem:[bb "end function"] diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/time_complexity_recursive_algorithms.adoc b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/time_complexity_recursive_algorithms.adoc new file mode 100644 index 00000000..8b69bd86 --- /dev/null +++ b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/time_complexity_recursive_algorithms.adoc @@ -0,0 +1,230 @@ += Cheatsheet - Time Complexity of Recursive Algorithms +Fabio Lama +:description: Module: CM2035 Algorithms and Data Structures II, started April 2024 +:doctype: article +:sectnums: 4 +:toclevels: 4 +:stem: + +== About + +This cheatsheet covers the time complexity of recursive algorithms, including +how to apply the master theorem. + +== Recurrence Relation + +Consider: + +. stem:[bb "function " "fact"(N)] +. stem:[" " bb "if " (N "==" 1)] +. stem:[" " " " bb "return " 1] +. stem:[" " bb "return " N xx "fact"(N - 1)] + +If we analyze this function one by one: + +. stem:[" " bb "if " (N "==" 1)] + +Constant time stem:[C_0] (grouped). + +. stem:[" " bb "return " N xx "fact"(N - 1)] + +Only one of the two return statements is executed (we assume the second +statement, worst case). + +. stem:[" " " " bb "return " 1] +. stem:[" " bb "return " N xx "fact"(N - 1)] + +Constant time stem:[C_1] (return), stem:[C_2] (read N), stem:[C_3] +(multiplication), and the stem:["fact"] function has a time of +stem:[T(N-1)]. If we group all the constants together, we have: + +[stem] +++++ +C_1 + C_2 + C_3 = C_4 +++++ + +So we say that the return statement has a time complexity of: + +[stem] +++++ +C_4 + T(N-1) +++++ + +Finally, we determine that the function stem:["fact"] has a time complexity of: + +[stem] +++++ +T(N) = C_0 + C_4 + T(N-1) +++++ + +Or grouped together: + +[stem] +++++ +T(N) = C_5 + T(N-1) +++++ + +This expression is known as a **recurrence relation**. 
+
+
+== Solving a Recurrence Relation
+
+We can demonstrate the solution of a recurrence relation by expanding the calculation:
+
+[stem]
+++++
+T(N) = C_5 + T(N-1)\
+T(N) = C_5 + C_5 + T(N-2)\
+T(N) = C_5 + C_5 + C_5 + T(N-3)\
+...\
+T(N) = k xx C_5 + T(N-k)
+++++
+
+If stem:[k] is stem:[N-1], then:
+
+[stem]
+++++
+T(N) = (N-1) xx C_5 + T(N-(N-1))\
+T(N) = (N-1) xx C_5 + C
+++++
+
+That's because:
+
+[stem]
+++++
+T(N-(N-1))\
+= T(N - N + 1)\
+= T(1)\
+= C
+++++
+
+We have hence solved the recurrence equation. In summary, to analyze a recursive algorithm:
+
+* Find its recurrence equation.
+* Solve the recurrence equation.
+* Perform the asymptotic analysis.
+
+== The Master Theorem
+
+The master theorem makes the asymptotic analysis of recursive functions much
+easier. However, it can only be applied if the recurrence equation has this
+specific structure:
+
+[stem]
+++++
+T(n) = a xx T(n/b)+f(n)
+++++
+
+where stem:[a >= 1] and stem:[b > 1].
+
+For example, this is solvable:
+
+[stem]
+++++
+T(n) = T(n/2) + n\
+= 1 xx T(n/2) + n
+++++
+
+Meanwhile, this is not:
+
+[stem]
+++++
+T(n) = 2 xx T(n) + n\
+= 2 xx T(n/1) + n
+++++
+
+The master theorem classifies the recurrence equation into one of three cases:
+
+=== Case 1
+
+[stem]
+++++
+f(n) < n^(log_b a)
+++++
+
+where the running time is:
+
+[stem]
+++++
+T(n) = Theta(n^(log_b a))
+++++
+
+=== Case 2
+
+[stem]
+++++
+f(n) = n^(log_b a)
+++++
+
+where the running time is:
+
+[stem]
+++++
+T(n) = Theta(n^(log_b a) xx log n)
+++++
+
+=== Case 3
+
+[stem]
+++++
+f(n) > n^(log_b a)
+++++
+
+and:
+
+[stem]
+++++
+a xx f(n/b) <= c xx f(n) " where " c < 1 " and " n " large"
+++++
+
+where the running time is:
+
+[stem]
+++++
+T(n) = Theta(f(n))
+++++
+
+=== Example
+
+Consider:
+
+[stem]
+++++
+T(n) = 2T(n/2) + n
+++++
+
+We determine that the master theorem can be applied here, given that stem:[a = 2 >= 1]
+and stem:[b = 2 > 1].
+
+To determine the case, we first calculate:
+
+[stem]
+++++
+log_b a = log_2 2 = 1
+++++
+
+For case 1, we have:
+
+[stem]
+++++
+n < n^1\
+n < n
+++++
+
+which is **false**.
+
+For case 2, we have:
+
+[stem]
+++++
+n = n^1\
+n = n
+++++
+
+which is **true**. Hence, we classify the formula as case 2 and the running time
+is:
+
+[stem]
+++++
+T(n) = Theta(n xx log n)
+++++
diff --git a/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/time_space_complexity.adoc b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/time_space_complexity.adoc
new file mode 100644
index 00000000..e26df222
--- /dev/null
+++ b/level-5/algorithms-and-data-structures-ii/student-notes/fabio-lama/src/time_space_complexity.adoc
@@ -0,0 +1,299 @@
+= Cheatsheet - Time and Space Complexity
+Fabio Lama
+:description: Module: CM2035 Algorithms and Data Structures II, started April 2024
+:doctype: article
+:sectnums: 4
+:toclevels: 4
+:stem:
+
+== About
+
+To analyze an algorithm, we must determine its processing and memory
+requirements. The processing requirement is the time complexity, and the memory
+requirement is the space complexity.
+
+== Counting Up Time and Space Units
+
+=== Simple Algorithm
+
+Consider:
+
+. stem:[bb "function " "F1"(a, b, c)]
+. stem:[" " tt "max" = a]
+. stem:[" " bb "if " (b> tt "max")]
+. stem:[" " " " tt "max" = b]
+. stem:[" " bb "if " (c> tt "max")]
+. stem:[" " " " tt "max" = c]
+. stem:[" " bb "return " tt "max"]
+
+If we analyze this function one by one:
+
+==== Step 1.
+
+. stem:[" " tt "max" = a]
+
+1 memory read (a), 1 memory write (max), or **2 time units**.
+
+==== Step 2.
+
+. 
stem:[" " bb "if " (b> tt "max")] + +2 memory reads (b, max), 1 comparison, 1 evaluation, or **4 time units**. + +==== Step 3. + +. stem:[" " " " tt "max" = b] + +1 memory read (b), 1 memory write (max), or **2 time units**. + +==== Step 4. + +. stem:[" " bb "if " (c> tt "max")] + +2 memory reads (c, max), 1 comparison, 1 evaluation, or **4 time units**. + +==== Step 5. + +. stem:[" " " " tt "max" = c] + +1 memory read (c), 1 memory write (max), or **2 time units**. + +==== Step 6. + +. stem:[" " bb "return " tt "max"] + +1 memory read (max), 1 return, or **2 time units**. + +==== Total + +In **total**, this function has a time complexity of **16 time units**. The only variable created is stem:["max"], so the space complexity is **1 space unit**. + +=== Complex Algorithm + +Consider: + +. stem:[bb "function " "F2"(A, N, x)] +. stem:[" " bb "for " 0 <= i < N] +. stem:[" " " " bb "if " (A\[i\] " == " x)] +. stem:[" " " " " " bb "return " i] +. stem:[" " bb "return " -1] + +A loop is not a single instruction, meaning we need to analyze the inner +instructions carefully. If we analyze this function one by one: + +==== Step 1. + +. stem:[" " bb "for " 0 <= i < N] + +We expand this to: + +. stem:[i = 0] +. stem:[bb "if " (i < N)] +. stem:[" " tt ""] +. stem:[" " i = i + 1] + +Respectively: + +. stem:[i = 0] + +1 memory write (i), or **1 time unit**. + +. stem:[bb "if " (i < N)] + +2 memory reads (i, N), 1 comparison, 1 evaluation, or 4 time units. But since +this is in a loop, we can consider this as **4*(N+1) time units**. Plus one +because it's executed at least once, even if N is 0 (and the if-statement is +skipped). + +. stem:[" " i = i + 1] + +1 memory read (i), 1 numerical op (+1), 1 memory write (i), or **3*N time units**. +Times N because it's executed in a loop. + +For the full time complexity, excluding the inner instructions, we get: + +[stem] +++++ +1 + 4 (N + 1) + 3 N\ += 1 + 4N + 4 + 3 N\ += 7N + 5 +++++ + +In other words, this for loop takes **7N + 5 time units**, excluding the inner instructions. + +==== Step 2. + +. stem:[" " " " bb "if " (A\[i\] " == " x)] + +3 memory reads (x, i, stem:[A\[i\]]), 1 comparison, 1 evaluation, or **5*N time units**. +Times N because it's executed in a loop. + +==== Step 3. + +Only one of the two return statements is executed. + +. stem:[" " " " " " bb "return " i] +. stem:[" " bb "return " -1] + +We are going to assume the case where the number is not in the array (worst +case), so the second return statement is executed which is **1 time unit**. + +=== Total + +To summarize all the steps: + +[stem] +++++ +7N + 5\ ++ 5N\ ++ 1\ += 12N + 6 +++++ + +In **total**, this function has a time complexity of **12*N + 6 time units**. +This means it's running time depends on the size of the input array. The bigger +the array, the longer the runtime. Additionally, we only create one new variable +(i), so the space complexity is **1 space unit**. + +=== Growth of Function + +As we have seen, the simple algorithm has a time complexity of **16 time units**. +This means that the time complexity is a constant (stem:[C_x]), +regardless of the size of the input. + +We can specify this as: + +[stem] +++++ +T(N) = C_1 +++++ + +In comparison, the complex algorithm has a time complexity of **12*N + 6 time units**: + +[stem] +++++ +T(N) = C_1 N + C_2 +++++ + +This means that the time complexity is linearly dependent on the size of the input. Hypothetically, if we had a time complexity of **N^2**, we would have a +quadratic time complexity, and so on. 
+
+Common time complexities are:
+
+.Source: https://www.codeproject.com/articles/1012294/algorithm-time-complexity-what-is-it
+image::assets/growth_of_function.png[align=center, width=400]
+
+=== Growth of Function Without Counting
+
+We can calculate the growth of the running time of an algorithm without counting
+every single time unit.
+
+Consider:
+
+. stem:[bb "function " "SumDiag"(A)]
+. stem:[" " tt "sum" = 0]
+. stem:[" " N = "length"(A\[0\])]
+. stem:[" " bb "for " (0 <= i < N)]
+. stem:[" " " " tt "sum" = tt "sum" + A\[i, i\]]
+. stem:[" " bb "return " tt "sum"]
+
+If we analyze this function one by one:
+
+==== Step 1.
+
+. stem:[" " tt "sum" = 0]
+
+Constant time stem:[C_0].
+
+==== Step 2.
+
+. stem:[" " N = "length"(A\[0\])]
+
+Here we have to analyze the stem:["length"] function, but let's say we already
+know its time complexity: stem:[T(N) = C_1 N + C_2].
+
+==== Step 3.
+
+. stem:[" " bb "for " (0 <= i < N)]
+
+As we saw when counting the time units of a for loop (for example
+stem:[7N + 5], excluding inner instructions), we know that stem:[T(N) = C_3 N + C_4].
+
+==== Step 4.
+
+. stem:[" " " " tt "sum" = tt "sum" + A\[i, i\]]
+
+Constant time stem:[C_5], times stem:[N] because it's executed in a loop; that is,
+stem:[C_5 N].
+
+==== Step 5.
+
+. stem:[" " bb "return " tt "sum"]
+
+Constant time stem:[C_6].
+
+==== Total
+
+To summarize all the steps:
+
+[stem]
+++++
+C_0\
++ C_1 N + C_2\
++ C_3 N + C_4\
++ C_5 N\
++ C_6\
+= T(N) = (C_1 + C_3 + C_5) N + (C_0 + C_2 + C_4 + C_6)
+++++
+
+To simplify, we can group the constants together:
+
+[stem]
+++++
+T(N) = C_7 N + C_8
+++++
+
+This means that the algorithm grows linearly with the size of the input.
+
+=== Worst and Best Cases
+
+An algorithm can have different time complexities depending on the input.
+Generally, we're interested in the worst-case scenario, but we can also analyze
+the best-case scenario.
+
+Consider:
+
+. stem:[bb "function " "L_Search"(A, x)]
+. stem:[" " N = "length"(A)]
+. stem:[" " bb "for " 0 <= i < N]
+. stem:[" " " " bb "if " (A\[i\] " == " x)]
+. stem:[" " " " " " bb "return " i]
+. stem:[" " bb "return " -1]
+
+And a given array:
+
+[stem]
+++++
+A = (13, 8, 2, 24, 5, 17, 6, 9)
+++++
+
+The **best case** is where the number we're looking for is the first element in the
+array, that is, stem:[x = 13]. In such a case, the function returns immediately:
+
+[stem]
+++++
+T(N) = C_1
+++++
+
+**Worst case**, the number we're looking for is not in the array at all, for
+example stem:[x = 7]. In that case, the function checks every element in the
+array (all stem:[N] of them), eventually returning stem:[-1]:
+
+[stem]
+++++
+T(N) = C_1 N + C_2
+++++
+
+We notate this as stem:["L_Search"(A, 7)] having a running time of stem:[T(N) prop N]
+(worst case). Meanwhile, stem:["L_Search"(A, 13)] has a running time of
+stem:[T(N) prop 1] (best case).
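+
+For completeness, here is an illustrative Python version of stem:["L_Search"]
+(not part of the original notes), together with the two inputs discussed above:
+
+[source,python]
+----
+# Linear search: best case finds x at index 0 (constant time); worst case
+# scans all N elements and returns -1.
+def l_search(a, x):
+    for i, value in enumerate(a):
+        if value == x:
+            return i
+    return -1
+
+A = [13, 8, 2, 24, 5, 17, 6, 9]
+print(l_search(A, 13))  # 0  -> best case, a single comparison
+print(l_search(A, 7))   # -1 -> worst case, N comparisons
+----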