EE538 Spring 2022 Project - TrojanMap

Group Members

| Name | Email | USC ID |
|------|-------|--------|
| Aditya Anulekh Mantri | adityaan@usc.edu | 8049574464 |
| Aditi Bodhankar | bodhanka@usc.edu | 9323549368 |

Notation

Throughout this report, the variables represent the following unless stated otherwise:

N - Number of nodes in the database

Phase 1

Autocomplete

Time Complexity - O(m*N), where m is the length of the input string

  • Convert the input to lower case
  • Iterate through all the points in the map
  • If the length of the input is greater than that of the current name, skip that name (e.g., input - "Starbucks 1", current - "Starbucks")
  • Convert the current name to lower case
  • Search for the input in the current name using std::find; if the input is found, add that name to the result vector (see the sketch below)
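A minimal sketch of this loop, assuming the map data is stored as an id-to-node map where each node carries a name field (the struct and container names here are illustrative, not the project's actual members):

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>
#include <vector>

struct Node { std::string name; };  // hypothetical minimal node

// Return all location names that start with `input` (case-insensitive).
std::vector<std::string> Autocomplete(
    const std::string &input,
    const std::unordered_map<std::string, Node> &data) {
  auto to_lower = [](std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return std::tolower(c); });
    return s;
  };
  std::vector<std::string> results;
  std::string query = to_lower(input);
  for (const auto &[id, node] : data) {
    if (node.name.size() < query.size()) continue;       // input longer than name: skip
    std::string name = to_lower(node.name);
    if (name.find(query) == 0) results.push_back(node.name);  // input is a prefix of the name
  }
  return results;
}
```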

Get Position

Time Complexity - O(N)

  • Iterate through all the points in the map and return the latitude and longitude of the node whose name matches the input
  • If no match is found, return (-1, -1)
  • In that case, FindClosestName is also used to suggest the closest name to the entered input (see the sketch below)
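A minimal sketch of the lookup, using the same illustrative node map as above (member names are assumptions):

```cpp
#include <string>
#include <unordered_map>
#include <utility>

struct Node { double lat; double lon; std::string name; };  // hypothetical minimal node

// Return (lat, lon) of the location called `name`, or (-1, -1) if it is absent.
std::pair<double, double> GetPosition(
    const std::string &name,
    const std::unordered_map<std::string, Node> &data) {
  for (const auto &[id, node] : data) {
    if (node.name == name) return {node.lat, node.lon};
  }
  return {-1, -1};  // caller can then fall back to FindClosestName
}
```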

Find Closest Name

Time Complexity - O(N * T), where T is the cost of a single CalculateEditDistance call

  • This function calls CalculateEditDistance internally on every name in the map data
  • The function skips entries that are empty strings
  • The name with the lowest edit distance is returned as a string from this function

Calculate Edit Distance

Time Complexity - O(a.size() * b.size())

  • Initialize a matrix dp of size (a.size()+1) x (b.size()+1) with zeros

  • Initialize the first column of the matrix with 0 to a.size() and the first row with 0 to b.size(). This handles the base cases

    • a = "", b = "abc"
    • a = "abc", b = ""
  • Update all the elements of dp according to the following rules

    • If the characters at index i and j in a and b match, dp[i][j] = dp[i-1][j-1]
    • If the characters at index i and j don't match in a and b dp[i][j] = 1 + min(dp[i-1][j-1], dp[i-1][j], dp[i][j-1])
  • Finally, return the value at dp[a.size()][b.size()] (see the sketch below)
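The same recurrence in code (a standard Levenshtein-distance tabulation; variable names are illustrative):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Edit distance between a and b using an (a.size()+1) x (b.size()+1) table.
int CalculateEditDistance(const std::string &a, const std::string &b) {
  int m = a.size(), n = b.size();
  std::vector<std::vector<int>> dp(m + 1, std::vector<int>(n + 1, 0));
  for (int i = 0; i <= m; ++i) dp[i][0] = i;   // base case: b == ""
  for (int j = 0; j <= n; ++j) dp[0][j] = j;   // base case: a == ""
  for (int i = 1; i <= m; ++i) {
    for (int j = 1; j <= n; ++j) {
      if (a[i - 1] == b[j - 1])
        dp[i][j] = dp[i - 1][j - 1];                  // characters match
      else
        dp[i][j] = 1 + std::min({dp[i - 1][j - 1],    // replace
                                 dp[i - 1][j],        // delete
                                 dp[i][j - 1]});      // insert
    }
  }
  return dp[m][n];
}
```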

Example table for inputs "abc" and "adc"

|           | Base Case | a | b | c |
|-----------|-----------|---|---|---|
| Base Case | 0         | 1 | 2 | 3 |
| a         | 1         | 0 | 1 | 2 |
| d         | 2         | 1 | 1 | 2 |
| c         | 3         | 2 | 2 | 1 |
Output Map

Phase 2

Calculate Shortest Path

Dijkstra's Algorithm:

Time Complexity: O(Elog(N)); where E: number of edges

  • Initialized a map distance to store the distance from the start to each node
  • Initialized all the distances to infinity
  • Initialized a map previous to store the previous node of each node
  • Initialized a map visited, set to all False, to store the visited nodes
  • Then obtained the ids of the source and the destination locations
  • Corner case: If the start and goal locations are the same, returned an empty path
  • Initialized a priority queue to get the closest neighbor for every unvisited node. The function CalculateDistance is used to compute the distances between the current node and its neighbors so that the closest neighbor can be updated
  • Once the closest neighbor is obtained, it is added to the previous map defined above
  • Finally, the path is reconstructed by walking back through the previous map (see the sketch below)
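A minimal sketch of this loop. The adjacency map `neighbors` and the callable `Weight` (standing in for CalculateDistance between two node ids) are assumptions; the project's own members and helpers may differ:

```cpp
#include <algorithm>
#include <functional>
#include <limits>
#include <queue>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Every node id is assumed to appear as a key in `neighbors`.
std::vector<std::string> Dijkstra(
    const std::string &source, const std::string &target,
    const std::unordered_map<std::string, std::vector<std::string>> &neighbors,
    const std::function<double(const std::string &, const std::string &)> &Weight) {
  if (source == target) return {};                       // corner case from the report
  std::unordered_map<std::string, double> distance;
  std::unordered_map<std::string, std::string> previous;
  for (const auto &[id, _] : neighbors)
    distance[id] = std::numeric_limits<double>::infinity();
  distance[source] = 0.0;

  using Entry = std::pair<double, std::string>;          // (distance, id)
  std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq;
  pq.push({0.0, source});

  while (!pq.empty()) {
    auto [d, u] = pq.top();
    pq.pop();
    if (d > distance[u]) continue;                       // stale queue entry
    if (u == target) break;
    for (const auto &v : neighbors.at(u)) {
      double nd = d + Weight(u, v);
      if (nd < distance[v]) {                            // relax the edge
        distance[v] = nd;
        previous[v] = u;
        pq.push({nd, v});
      }
    }
  }

  std::vector<std::string> path;
  if (previous.find(target) == previous.end()) return path;   // unreachable
  for (std::string at = target; at != source; at = previous[at]) path.push_back(at);
  path.push_back(source);
  std::reverse(path.begin(), path.end());
  return path;
}
```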

Bellman Ford Algorithm:

Time Complexity: O(E*N); where E: number of edges

  • Similar to the above algorithm, two maps for distance and previous are initialized
  • The distances of all the nodes are initialized to infinity
  • Then obtained the ids of the source and the destination locations
  • Corner case: If the start and goal locations are the same, returned an empty path
  • The distance of the source is initialized to 0. As the algorithm iterates over the possible number of edges in a path (the minimum number of edges needed to find the shortest path), we update the shortest distance of every node from each of its neighbors.
  • Early Termination: An additional STOP flag is added to halt the algorithm once the minimum distances stop changing as more edges are considered.
  • All the closest neighbors are then added into the previous map
  • Since this algorithm evaluates all the possible paths between any two nodes, it has a longer run time than Dijkstra's algorithm (see the sketch below).
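A sketch of Bellman-Ford with the early-stopping flag described above, reusing the same assumed `neighbors` adjacency map and `Weight` callable as in the Dijkstra sketch:

```cpp
#include <algorithm>
#include <functional>
#include <limits>
#include <string>
#include <unordered_map>
#include <vector>

std::vector<std::string> BellmanFord(
    const std::string &source, const std::string &target,
    const std::unordered_map<std::string, std::vector<std::string>> &neighbors,
    const std::function<double(const std::string &, const std::string &)> &Weight) {
  if (source == target) return {};
  std::unordered_map<std::string, double> distance;
  std::unordered_map<std::string, std::string> previous;
  for (const auto &[id, _] : neighbors)
    distance[id] = std::numeric_limits<double>::infinity();
  distance[source] = 0.0;

  // At most N-1 relaxation passes; stop early once a full pass changes nothing.
  for (size_t pass = 1; pass < neighbors.size(); ++pass) {
    bool updated = false;                                // the STOP flag
    for (const auto &[u, adj] : neighbors) {
      if (distance[u] == std::numeric_limits<double>::infinity()) continue;
      for (const auto &v : adj) {
        double nd = distance[u] + Weight(u, v);
        if (nd < distance[v]) {
          distance[v] = nd;
          previous[v] = u;
          updated = true;
        }
      }
    }
    if (!updated) break;                                 // early termination
  }

  std::vector<std::string> path;
  if (previous.find(target) == previous.end()) return path;   // unreachable
  for (std::string at = target; at != source; at = previous[at]) path.push_back(at);
  path.push_back(source);
  std::reverse(path.begin(), path.end());
  return path;
}
```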
Output on Google Map: Dijkstra's vs. Bellman Ford

Below is a comparison of the runtimes of Dijkstra's, Bellman Ford with early stopping, and Bellman Ford without early stopping

| Source - Destination | Dijkstra's | Bellman Ford (Early Stopping) | Bellman Ford (Without Early Stopping) |
|---|---|---|---|
| Target - Trader Joe's (0.03903 m) | 68 ms | 10789 ms | 2008170 ms |
| Target - SSA (0.991239 m) | 114 ms | 10437 ms | 2385274 ms |
| Target - Popeyes (0.74772 m) | 99 ms | 10385 ms | 1978956 ms |
| Target - 7-Eleven 1 (0.779458 m) | 101 ms | 10307 ms | 2014801 ms |
| Target - Leavey Library (0.368658 m) | 88 ms | 10284 ms | 4038911 ms |
| Target - Lucy's bakery (1.05532 m) | 120 ms | 10400 ms | 2273291 ms |

Cycle Detection

inSquare

  • Given four coordinates - the left and right longitudes and the top and bottom latitudes - this function determines whether a node lies within the square formed by the four bounds

GetSubgraph

  • It returns all nodes present in the square. These nodes form a subgraph which is later used to create a graph for cycle detection

CycleDetectionHelper

  • Visit the node only if it is in the subgraph obtained using the function - GetSubgraph. Then mark the node as visited.
  • Find all the unvisited neighboring nodes of the current node.
  • If the node is unvisited, then we call this function on itself recursively.
  • If we reach a node that is already visited and that node is not the parent of the current node, we conclude that we have found a cycle.

CycleDetection

Time Complexity: O(E+N); where E: number of edges

  • Check if there is a cycle within a square graph
  • Implementation:
    • Use a helper function - CycleDetectionHelper recursively
    • DFS approach
  • Algorithm:
    • We first create an unordered map - visited to store the visited nodes
    • Then we initialize this visited map to False
    • Then we call the recursive function introduced above on each child of the current node, just like in the DFS algorithm. The recursion ends once we reach an already visited node (see the sketch below).
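A minimal sketch of this DFS, assuming the subgraph is given as an adjacency map containing only the node ids inside the square (an assumption; the project builds this from GetSubgraph):

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// DFS helper: returns true if a back edge (cycle) is found in the undirected subgraph.
bool CycleDetectionHelper(
    const std::string &current, const std::string &parent,
    const std::unordered_map<std::string, std::vector<std::string>> &subgraph,
    std::unordered_map<std::string, bool> &visited) {
  visited[current] = true;
  for (const auto &next : subgraph.at(current)) {
    if (visited.find(next) == visited.end()) continue;   // neighbor outside the square
    if (!visited[next]) {
      if (CycleDetectionHelper(next, current, subgraph, visited)) return true;
    } else if (next != parent) {
      return true;                                       // visited node that is not the parent: cycle
    }
  }
  return false;
}

bool CycleDetection(
    const std::unordered_map<std::string, std::vector<std::string>> &subgraph) {
  std::unordered_map<std::string, bool> visited;
  for (const auto &[id, _] : subgraph) visited[id] = false;   // all nodes start unvisited
  for (const auto &[id, _] : subgraph) {
    if (!visited[id] && CycleDetectionHelper(id, "", subgraph, visited)) return true;
  }
  return false;
}
```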
Output Map

Delivering Trojan

Time Complexity: O(E+N); where E: number of edges

The DeliveringTrojan function is based on topological sort. Topological sort is applicable only to Directed Acyclic Graphs (DAGs).

  • Create a graph from the given locations and dependencies
  • Check whether the created graph is a DAG. If it is not, return an empty vector indicating that a topological sort doesn't exist
  • The DeliveringTrojan function uses two helper functions
    • One function to check if the input graph is a DAG - is_dag
    • One function to perform the topological sort recursively - TopologicalSort

is_dag

  • This function takes the locations and dependencies as inputs and checks whether the graph formed by them is a DAG
  • The function maintains two vectors - visited and stack - to store the nodes that have been fully visited and the nodes that are currently being visited, respectively
  • If the function is recursively called on a node that is still being visited (meaning it is part of the stack), we have encountered a back edge and the graph is not a DAG

TopologicalSort

  • This is a utility function that is called recursively from the DeliveringTrojan function
  • The function calls itself on all unvisited nodes
  • A node is added to the result vector when all the neighbors of the node are visited

The final result vector is the reverse of the list produced by the helper function; this is the topological sort (see the sketch below).
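A minimal sketch of the recursion, assuming the dependency graph is represented as a map from each location to the locations that depend on it (the representation is an assumption, and the DAG check is assumed to have already passed):

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Recursive helper: a node is pushed onto `result` only after all of its
// outgoing neighbors have been visited.
void TopologicalSort(
    const std::string &node,
    const std::unordered_map<std::string, std::vector<std::string>> &graph,
    std::unordered_map<std::string, bool> &visited,
    std::vector<std::string> &result) {
  visited[node] = true;
  auto it = graph.find(node);
  if (it != graph.end()) {
    for (const auto &next : it->second) {
      if (!visited[next]) TopologicalSort(next, graph, visited, result);
    }
  }
  result.push_back(node);
}

std::vector<std::string> DeliveringTrojan(
    const std::vector<std::string> &locations,
    const std::unordered_map<std::string, std::vector<std::string>> &graph) {
  std::unordered_map<std::string, bool> visited;
  for (const auto &loc : locations) visited[loc] = false;
  std::vector<std::string> result;
  for (const auto &loc : locations) {
    if (!visited[loc]) TopologicalSort(loc, graph, visited, result);
  }
  std::reverse(result.begin(), result.end());   // reversed post-order = topological order
  return result;
}
```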

There are miscellaneous functions - ReadDependenciesFromCSVFile and ReadLocationsFromCSVFile - additionally used to read the locations and dependencies files needed for topological sort.

Input Topological Sort | Output Topological Sort Graph

Phase 3

Travelling Salesman Problem

Here we calculate a round trip that starts and ends at the same location and passes through all of the given locations with the minimum total distance. To solve this problem we implement three algorithms:

1. Brute-force

2. Backtracking

3. 2-OPT

TravellingTrojan_Brute_force

RUNTIME COMPLEXITY: O(n!) where n is the number of locations to visit

  • We first initialize the minimum distance - min_distance to infinity.
  • Then we initialize a vector of strings to store the current path.
  • We assume that the start node is the first location in the given list of locations, which consists of all the nodes needed in the final path.
  • We initialize an unordered map to store the distance from the start node to each node. The distance of the start node is initialized to 0.
  • We calculate the distance from start to each node and store it in the above map.
  • We introduce a new variable called curr_distance and initialize it to 0.
  • Initialize the vector to store the optimal path
  • Call the brute-force helper function
  • Update the minimum distance and add the optimal path to the records output

TravellingTrojan_Brute_force_helper

  • If the current node is the last node, then we update the minimum distance
  • Else evaluate all the children of the current node, and add the current node to the path
  • We then call the helper function recursively on each child node

TravellingTrojan_Backtracking

RUNTIME COMPLEXITY: O(n!)

  • The implementation is the same as in the brute-force case, with only three additional lines of code which form the central element of the backtracking approach
  • After extending each partial path, we check whether its length already exceeds the minimum distance found so far; if it does, we do not explore any paths below that node and backtrack to the next child node, continuing with the same logic (see the illustrative guard below)
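Illustratively, the pruning guard inside the recursive helper amounts to a check like the following (the variable names follow the brute-force description above and are assumptions):

```cpp
// Inside the recursive helper: abandon this branch as soon as the partial
// path is already no shorter than the best complete tour found so far.
if (curr_distance >= min_distance) return;   // prune and backtrack
```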

TravellingTrojan_2opt

RUNTIME COMPLEXITY: O(n^2) per improvement pass

  • We start by initializing a vector to store the existing path
  • Then we initialize a vector to store the new path. We also create a best-distance variable - best_distance - and initialize it to infinity
  • We then initialize a flag to check whether any improvement was made in finding the optimal path
  • We reverse a segment of the existing path and keep the change if the total distance of the new path is smaller than the current best
  • We then update the best path and the minimum distance and add them to records (see the sketch below)
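A minimal sketch of this loop. `PathLength` stands in for a routine that sums CalculateDistance over the closed tour (an assumption here); the start location stays fixed at index 0:

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

std::vector<std::string> TwoOpt(
    std::vector<std::string> path,
    const std::function<double(const std::vector<std::string> &)> &PathLength) {
  double best_distance = PathLength(path);
  bool improved = true;                      // the improvement flag
  while (improved) {
    improved = false;
    for (size_t i = 1; i + 1 < path.size(); ++i) {
      for (size_t k = i + 1; k < path.size(); ++k) {
        std::reverse(path.begin() + i, path.begin() + k + 1);   // reverse the segment [i, k]
        double new_distance = PathLength(path);
        if (new_distance < best_distance) {
          best_distance = new_distance;      // keep the improvement
          improved = true;
        } else {
          std::reverse(path.begin() + i, path.begin() + k + 1); // undo the reversal
        }
      }
    }
  }
  return path;
}
```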

Below are short snippets from the above videos

Brute Force | Backtracking | 2-opt

Below is a comparison of the distances from the three algorithms generated using Google Benchmark. The code to generate the following output is here

| Number of Locations | 2-opt | Brute Force | Backtracking |
|---|---|---|---|
| 2 | 2.29252 | 2.29252 | 2.29252 |
| 3 | 2.53762 | 2.53762 | 2.53762 |
| 4 | 6.31646 | 6.31646 | 6.31646 |
| 5 | 6.56154 | 6.56154 | 6.56154 |
| 6 | 6.58303 | 6.58303 | 6.58303 |
| 7 | 6.6282 | 6.6282 | 6.6282 |
| 8 | 7.97046 | 7.97046 | 7.97046 |
| 9 | 7.97827 | 7.97827 | 7.97827 |
| 10 | 10.2832 | 10.2832 | 10.2832 |
| 11 | 9.30392 | 9.30392 | 9.30392 |

FindNearby

Time Complexity: O(Nlog(N))

  • Given a class name C, a location name L, and a radius r, this function finds the k nearest locations of class C on the map within distance r of L and returns a vector of string ids.
  • Algorithm:
    • We create a priority queue (min heap) keyed on the distance from the source location to each candidate node
    • Then we get the location id of the location name L
    • We iterate through all the data points and check whether the given attribute name exists in the attributes of the node. If it does, we calculate the distance between the source location and the node and push it into the priority queue
    • We then pop up to k nodes from the top of the priority queue, keeping only those that lie within the distance r. If the queue is empty, we return an empty result (see the sketch below).
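A minimal sketch of this procedure. The `Node` struct, the `attributes` member, and the `Distance` callable (standing in for CalculateDistance from the source location to a node id) are assumptions for illustration:

```cpp
#include <functional>
#include <queue>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

struct Node {                                   // hypothetical minimal node
  std::string id;
  std::unordered_set<std::string> attributes;
};

std::vector<std::string> FindNearby(
    const std::string &attributesName, const std::string &name, double r, int k,
    const std::unordered_map<std::string, Node> &data,
    const std::function<double(const std::string &)> &Distance) {
  using Entry = std::pair<double, std::string>;                  // (distance, id)
  std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq;  // min heap
  for (const auto &[id, node] : data) {
    if (node.attributes.count(attributesName) == 0) continue;   // wrong class
    pq.push({Distance(id), id});
  }
  std::vector<std::string> results;
  while (!pq.empty() && static_cast<int>(results.size()) < k) {
    auto [d, id] = pq.top();
    pq.pop();
    if (d > r) break;                  // everything left is farther than r
    if (d > 0) results.push_back(id);  // skip the source location itself (distance 0)
  }
  return results;
}
```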
Output Map

Additional Implementations

Google Benchmarking

Benchmarking of Travelling Salesperson Algorithms

We used Google Benchmark to accurately measure the runtime of travelling salesperson algorithms against increasing input sizes.

Observations

  • It was observed that the brute force and backtracking algorithms took more than 10 minutes for input sizes greater than 10 nodes.
  • The 2-opt heuristic, on the other hand, was able to converge significantly faster than brute force and backtracking
  • Below is a graphical representation of the growing runtimes of the three algorithms

As can be seen in the figure above, the 2-opt algorithm has a significantly lower runtime, even for inputs twice the size of those given to brute force and backtracking.

Future Work and Learnings:

Future Work:

  • 3-opt and Genetic algorithm for Travelling Salesperson Problem

Learnings:

  • Using a heap over a vector reduces time complexity in cases where sorting is important
  • Using Dynamic Programming in algorithms reduces the call stack and improves the runtime of the program
  • Building an end-to-end C++ application

Presentation and Video Links

The output videos of the Travelling Salesperson algorithms are uploaded to YouTube as unlisted videos here