Merge pull request #37 from lamafab/lamafab-algos-data-ii
Cheatsheets for CM2035 Algorithms & Data Structures II
sglavoie authored Sep 10, 2024
2 parents ea76782 + f2dcb6f commit 55c9d06
Showing 16 changed files with 1,113 additions and 0 deletions.
@@ -0,0 +1,69 @@
# About

Listed here is a collection of cheatsheets by topic. These cheatsheets do not
explain the topics in depth, but rather serve as quick lookup documents.
Therefore, the course material provided by the lecturer should still be studied
and understood. Not everything that is tested at the midterms or final exams is
covered, and the Author does not guarantee that the cheatsheets are free of
errors.

* [Time and Space Complexity](./cheatsheet_time_space_complexity.pdf)
* [Asymptotic Analysis](./cheatsheet_asymptotic_analysis.pdf)
* [Time Complexity of Recursive Algorithms](./cheatsheet_time_complexity_recursive_algorithms.pdf)
* [Comparison and Non-Comparison Sorting Algorithms](./cheatsheet_sorting_algorithms.pdf)
* [Hash Tables](./cheatsheet_hash_tables.pdf)

**NOTE**: These cheatsheets only cover the course material **up to the midterms**.
The weeks after the midterms are not covered here.

# Building

_NOTE_: This step is only necessary if you choose to modify the base documents.

The base documents are written in [AsciiDoc](https://asciidoc.org/) and can be
found in the `src/` directory.

The following dependencies must be installed (Ubuntu):

```console
$ apt install -y ruby-dev wkhtmltopdf
$ gem install asciidoctor
$ chmod +x build.sh
```

To build the documents (PDF version):

```console
$ ./build.sh pdf
```

Optionally, for the HTML version:

```console
$ ./build.sh html
```

and for the PNG version:

```console
$ ./build.sh png
```

The generated output can be deleted with `./build.sh clean`.

# Disclaimer

The Presented Documents ("cheatsheets") by the Author ("Fabio Lama") are
summaries of specific topics. The term "cheatsheet" implies that the Presented
Documents are intended to be used as learning aids or as references for
practicing and does not imply that the Presented Documents should be used for
inappropriate practices during exams such as cheating or other offenses.

The Presented Documents are heavily based on the learning material provided by
the University of London, specifically the VLeBooks Collection database in the
Online Library and the material provided on the Coursera platform.

The Presented Documents may incorporate direct or indirect definitions,
examples, descriptions, graphs, sentences and/or other content used in those
provided materials. **At no point does the Author present the work or ideas
incorporated in the Presented Documents as their own.**
@@ -0,0 +1,77 @@
#!/bin/bash

# Because `make` sucks.

gen_html() {
    # Strip the .adoc suffix and add the cheatsheet_ prefix.
    # OUT and HTML_OUT are deliberately global: the callers below reuse them.
    FILE=$1
    OUT=${FILE%.adoc}
    HTML_OUT="cheatsheet_${OUT}.html"

    asciidoctor "$FILE" -o "$HTML_OUT"
}

# Change directory to src/ in order to have images included correctly.
cd "$(dirname "$0")/src/" || exit 1

case $1 in
    html)
        for FILE in *.adoc
        do
            # Generate HTML file.
            gen_html "$FILE"
        done

        # Move up from src/
        mv ./*.html ../
        ;;
    pdf)
        for FILE in *.adoc
        do
            # Generate HTML file.
            gen_html "$FILE"

            # Convert HTML to PDF.
            PDF_OUT="cheatsheet_${OUT}.pdf"
            wkhtmltopdf \
                --enable-local-file-access \
                --javascript-delay 2000 \
                "$HTML_OUT" "$PDF_OUT"
        done

        # Move up from src/
        mv ./*.pdf ../

        # Cleanup temporarily generated HTML files.
        rm -f ./*.html
        ;;
    png | img)
        for FILE in *.adoc
        do
            # Generate HTML file.
            gen_html "$FILE"

            # Convert HTML to PNG. wkhtmltopdf always writes PDF data, so the
            # companion tool wkhtmltoimage (same package) is used here.
            IMG_OUT="cheatsheet_${OUT}.png"
            wkhtmltoimage \
                --enable-local-file-access \
                --javascript-delay 2000 \
                "$HTML_OUT" "$IMG_OUT"
        done

        # Move up from src/
        mv ./*.png ../

        # Cleanup temporarily generated HTML files.
        rm -f ./*.html
        ;;
    clean)
        # Remove generated HTML and PNG files from src/ and the parent directory.
        rm -f ./*.html ./*.png ../*.html ../*.png
        ;;
    *)
        echo "Unrecognized command" >&2
        exit 1
        ;;
esac
Binary file not shown.
@@ -0,0 +1,129 @@
= Cheatsheet - Asymptotic Analysis
Fabio Lama <fabio.lama@pm.me>
:description: Module: CM2035 Algorithms and Data Structures II, started April 2024
:doctype: article
:sectnums: 4
:toclevels: 4
:stem:

== About

Asymptotic analysis is an alternative way of describing the time or memory
requirements of an algorithm.

== Big O Notation

Big O notation stem:[O(g(N))] describes a function stem:[g(N)] that acts as an
**upper bound** on the growth of stem:[T(N)]. It is formally defined as:

stem:[T(N)] is stem:[O(g(N))] if there exist positive
constants stem:[c] and stem:[n_0] such that:

[stem]
++++
T(N) <= c xx g(N) " for all " N > n_0
++++

Note that there can be **multiple functions** stem:[g_x(N)] that act as **an upper bound**
for stem:[T(N)]. Additionally, notice that it's **not necessary** that
stem:[c xx g(N)] is equal to or greater than stem:[T(N)] for all values of
stem:[N], but only for the values larger than stem:[n_0].

image::assets/big_o_notation.png[align=center, width=300]

For example, consider:

[stem]
++++
T(N) = 10 N^2 + 15N + 5\
g(N) = N^2\
c = 1
++++

Here, stem:[c xx g(N)] is never greater than or equal to stem:[T(N)], because
there is no positive stem:[N] that satisfies:

[stem]
++++
10 N^2 + 15N + 5 <= 1 xx N^2
++++

However, consider:

[stem]
++++
c = 25
++++

In case of stem:[N = 1] we get:

[stem]
++++
10 xx 1^2 + 15 xx 1 + 5 <= 25 xx 1^2\
= 10 + 15 + 5 <= 25\
= 30 <= 25
++++

Which is false. However, for stem:[N = 2] we get:

[stem]
++++
10 xx 2^2 + 15 xx 2 + 5 <= 25 xx 2^2\
= 40 + 30 + 5 <= 100\
= 75 <= 100
++++

Which is true. Therefore:

[stem]
++++
T(N) " is " O(N^2) " because"\
T(N) <= 25 xx g(N) " for all " N >= 2
++++

The choice of stem:[c] **is arbitrary**, as long as it satisfies the conditions.
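
The search for a suitable stem:[c] and stem:[n_0] in the example above can also be checked numerically. The following Python sketch is only an illustration and not a proof: the helper name `smallest_threshold` and the search limit of 1000 are assumptions made for this example, and only a finite range of stem:[N] is tested.

```python
def T(n):
    """Running time from the example above: 10 N^2 + 15 N + 5."""
    return 10 * n**2 + 15 * n + 5


def g(n):
    """Candidate bounding function N^2."""
    return n**2


def smallest_threshold(c, limit=1000):
    """Return the smallest N such that T(N) <= c * g(N) holds for every
    value from N up to `limit`, or None if the bound already fails at `limit`.

    Only a finite range is inspected, so this is evidence, not a proof.
    """
    threshold = None
    for n in range(limit, 0, -1):  # walk down from the largest value
        if T(n) <= c * g(n):
            threshold = n
        else:
            break  # the unbroken run of successes ends here
    return threshold


print(smallest_threshold(1))   # None: c = 1 does not work
print(smallest_threshold(25))  # 2: matches the derivation above
```

With stem:[c = 25] the bound holds from stem:[N = 2] onwards, while a smaller constant such as stem:[c = 11] still works but only from a larger stem:[N].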

== Omega Notation

The Omega notation stem:[Omega(g(N))] describes a function stem:[g(N)] that acts
as a **lower bound** on the growth of stem:[T(N)]. It is formally defined as:

stem:[T(N)] is stem:[Omega(g(N))] if there exist positive constants stem:[c] and
stem:[n_0] such that:

[stem]
++++
T(N) >= c xx g(N) " for all " N > n_0
++++

Similarly to the Big O notation, there can be **multiple functions** stem:[g_x(N)]
that act as **a lower bound** for stem:[T(N)] and it's **not necessary** that
stem:[c xx g(N)] is equal to or less than stem:[T(N)] for all values of
stem:[N], but only for the larger values.

image::assets/omega_notation.png[align=center, width=300]

== Theta Notation

The Theta notation stem:[Theta(x)] defines a **single function** that acts as
both an **upper and lower bound** for stem:[T(N)]. It is formally defined as:

stem:[T(N)] is stem:[Theta(g(N))] if there exist positive constants stem:[c_1],
stem:[c_2] and stem:[n_0] such that both of the following conditions hold:

[stem]
++++
T(N) >= c_1 xx g(N) " for all " N > n_0\
T(N) <= c_2 xx g(N) " for all " N > n_0
++++

Alternatively:

[stem]
++++
c_1 xx g(N) <= T(N) <= c_2 xx g(N) " for all " N > n_0
++++

image::assets/theta_notation.png[align=center, width=300]

As already noted, Theta notation has **only one function** stem:[g(N)], up to
constant factors and lower-order terms.
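
As a rough illustration, the two-sided condition can be tested for stem:[T(N) = 10 N^2 + 15N + 5] from the Big O section. The constants stem:[c_1 = 10], stem:[c_2 = 25] and stem:[n_0 = 1], the helper name `sandwiched` and the finite search limit are assumptions chosen for this sketch. It checks a range of values and does not replace a proof.

```python
def T(n):
    """Running time from the Big O example: 10 N^2 + 15 N + 5."""
    return 10 * n**2 + 15 * n + 5


def g(n):
    """Candidate bounding function N^2."""
    return n**2


def sandwiched(c1, c2, n0, limit=1000):
    """True if c1 * g(N) <= T(N) <= c2 * g(N) for every N with n0 < N <= limit."""
    return all(c1 * g(n) <= T(n) <= c2 * g(n) for n in range(n0 + 1, limit + 1))


print(sandwiched(10, 25, 1))  # True: both bounds hold for N > 1
print(sandwiched(10, 25, 0))  # False: the upper bound fails at N = 1
print(sandwiched(11, 25, 1))  # False: 11 N^2 overtakes T(N) from N = 16
```

Under these assumptions stem:[T(N)] is stem:[Theta(N^2)], since stem:[10 N^2 <= T(N) <= 25 N^2] for all tested stem:[N > 1].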
@@ -0,0 +1,91 @@
= Cheatsheet - Hash Tables
Fabio Lama <fabio.lama@pm.me>
:description: Module: CM2035 Algorithms and Data Structures II, started April 2024
:doctype: article
:sectnums: 4
:toclevels: 4
:stem:

== About

A hash table is a data structure that maps keys to values using a hash function. This function transforms the input (key) into a fixed-size integer, which serves as an index in an array, enabling fast data retrieval. Hash tables are widely used due to their average stem:[Theta(1)] time complexity for both insertion and lookup operations.

== Search Algorithms Complexity

WARNING: This table assumes no collisions in the hash table.

|===
|Name |Worst case (time) |Best case (time) |Worst case (space) |Best case (space)

|Linear Search
|stem:[Theta(N)]
|stem:[Theta(1)]
|stem:[Theta(1)]
|stem:[Theta(1)]

|Binary Search (iterative)
|stem:[Theta(log N)]
|stem:[Theta(1)]
|stem:[Theta(1)]
|stem:[Theta(1)]

|Binary Search (recursive)
|stem:[Theta(log N)]
|stem:[Theta(1)]
|stem:[Theta(log N)]
|stem:[Theta(1)]

|Direct Addressing
|stem:[Theta(1)]
|stem:[Theta(1)]
|stem:[Theta(k)]
|stem:[Theta(k)]

|Hash Table
|stem:[Theta(1)]
|stem:[Theta(1)]
|stem:[Theta(M)]
|stem:[Theta(M)]
|===

Where stem:[k] is the maximum possible key value and stem:[M] is the size of the hash table.
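
The difference in worst-case space between the two binary search variants in the table comes from the call stack. The following Python sketch is one possible way to write both variants; the function names are assumptions for this example and the input list must already be sorted.

```python
def binary_search_iterative(items, target):
    """Return the index of target in the sorted list items, or -1.

    Uses a fixed number of variables, hence constant extra space.
    """
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1


def binary_search_recursive(items, target, low=0, high=None):
    """Same search, but every halving adds a stack frame,
    so the worst case needs about log N frames.
    """
    if high is None:
        high = len(items) - 1
    if low > high:
        return -1
    mid = (low + high) // 2
    if items[mid] == target:
        return mid
    if items[mid] < target:
        return binary_search_recursive(items, target, mid + 1, high)
    return binary_search_recursive(items, target, low, mid - 1)


data = [2, 3, 5, 7, 11, 13, 17]
print(binary_search_iterative(data, 11))  # 4
print(binary_search_recursive(data, 11))  # 4
print(binary_search_iterative(data, 4))   # -1
```

The best case for both variants is a target that sits exactly at the first midpoint (here the value 7), which is found after a single comparison.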

== Hash Tables

Hash tables use the result of a hash function as the array index at which a
number is stored. Consider an array of size 7 and the following simple hash
function:

[stem]
++++
h(x) = x mod 7
++++

For example, let's say stem:[x = 11] and we compute:

[stem]
++++
h(11) = 11 mod 7 = 4
++++

This means the number 11 is stored in the array at index 4. Knowing the index,
the search for that number is very fast. However, given that the size of this
array is very small, **the number of collisions can be very high**, depending on
the number of inputs.

For example, both 11 and 18 would be stored at index 4:

[stem]
++++
h(11) = 11 mod 7 = 4\
h(18) = 18 mod 7 = 4
++++
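
The collision above can be reproduced with a few lines of Python. This is a minimal sketch: the names `TABLE_SIZE`, `h` and `table` are assumptions for this example, and the array is modeled as a Python list.

```python
TABLE_SIZE = 7


def h(x):
    """Hash function from the text: h(x) = x mod 7."""
    return x % TABLE_SIZE


# An empty array of size 7, then 11 stored at its hashed index.
table = [None] * TABLE_SIZE
table[h(11)] = 11

print(h(11), h(18))  # 4 4
print(table[h(18)])  # 11: the slot that 18 hashes to is already taken
```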

If necessary, the hash table can be **extended** with a larger array, but this
requires rehashing all the existing elements. Alternatively, the method of
**linear probing** can be applied to use the next available index in case of a
collision (note that this information must be stored somewhere). Or, each index
can be a pointer to a separate, nested table, which is a **separate chaining** method.
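
The two collision handling methods can be sketched as follows. This is an illustration under assumptions: the function names are made up for this example, the separate table per index is modeled as a plain Python list, and lookups would have to repeat the same probing instead of storing extra information.

```python
TABLE_SIZE = 7


def h(x):
    """Hash function from the text: h(x) = x mod 7."""
    return x % TABLE_SIZE


def insert_linear_probing(table, key):
    """Store key at h(key) or at the next free index, wrapping around."""
    for step in range(len(table)):
        index = (h(key) + step) % len(table)
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("table is full; a larger array and rehashing are needed")


def insert_chaining(table, key):
    """Append key to the nested list that belongs to index h(key)."""
    index = h(key)
    table[index].append(key)
    return index


probing = [None] * TABLE_SIZE
print(insert_linear_probing(probing, 11))  # 4
print(insert_linear_probing(probing, 18))  # 5: index 4 is taken

chained = [[] for _ in range(TABLE_SIZE)]
insert_chaining(chained, 11)
insert_chaining(chained, 18)
print(chained[4])  # [11, 18]
```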

The best-case and worst-case complexity of a hash table must take these
collision handling methods into account: the best case is the behavior with no
collisions at all, and the worst case is the behavior when all elements collide.
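
To make the two extremes concrete, the following Python sketch counts how many stored keys a lookup has to inspect when separate chaining is used. The helper names `build` and `comparisons_to_find` and the chosen key sets are assumptions for this illustration.

```python
TABLE_SIZE = 7


def h(x):
    """Hash function from the text: h(x) = x mod 7."""
    return x % TABLE_SIZE


def build(keys):
    """Build a table with separate chaining (one Python list per index)."""
    table = [[] for _ in range(TABLE_SIZE)]
    for key in keys:
        table[h(key)].append(key)
    return table


def comparisons_to_find(table, key):
    """Count the keys inspected in the chain at h(key) until key is found."""
    count = 0
    for candidate in table[h(key)]:
        count += 1
        if candidate == key:
            break
    return count


no_collisions = build([0, 1, 2, 3, 4, 5, 6])        # one key per index
all_colliding = build([4, 11, 18, 25, 32, 39, 46])  # every key hashes to 4

print(comparisons_to_find(no_collisions, 6))   # 1
print(comparisons_to_find(all_colliding, 46))  # 7
```

With no collisions a lookup inspects a single key, whereas with all stem:[N] keys colliding the lookup degrades to a linear scan of one chain.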