This repository has been archived by the owner on Apr 1, 2021. It is now read-only.
-
-
Notifications
You must be signed in to change notification settings - Fork 307
Create Article : Data-Structures-Trie.md #1064
Merged
Merged
Changes from 8 commits
Commits
Show all changes
11 commits
Select commit
Hold shift + click to select a range
755a053
First Draft of Data-Structures-Trie.md
4322f4a
Added repl
dec4b98
Implemented suggestions
c47fd31
Added delete_word
8fcb390
Moved image to the upper section
050c81c
Formatting corrections
a5939d4
repl fix
9d59049
Added Data Structure prefix to the title
fcf03aa
Added link to Hashtables article
e686ada
Updated reference
ds-249 9bb1484
Updated
ds-249 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,129 @@ | ||
# Data Structure Trie | ||
|
||
## Introduction to Trie | ||
|
||
The word trie is an inflix of the word "re**trie**val", because the trie can find a single word in a dictionary with only a prefix of the word. | ||
Trie is an efficient data retrieval data structure, using trie, search complexities can be brought to an optimal limit, i.e. length of the string. | ||
It is a multi-way tree structure useful for storing strings over an alphabet, when we are storing them. | ||
It has been used to store large dictionaries of English, say, words in spell-checking programs. | ||
However, the penalty on tries is the storage requirement. | ||
|
||
## What is a trie? | ||
|
||
A trie is a tree like data structure which stores strings, and helps you find the data associated with that string using the prefix of the string. | ||
For example, say you plan on building a dictionary to store strings along with their meanings. You must be wondering why can't I simply use a hash table, to get the information. | ||
Yes, you obviously can get information using a hash table, but, the hash tables can only find data where the string exactly matches the one we've added. But trie will give us the capability to find strings with common prefixes, a missing character etc in lesser time, in comparison to a hash table. | ||
A trie typically, looks something like this, | ||
|
||
![Trie](https://community.topcoder.com/i/education/alg_tries.png) | ||
|
||
This is an image of a Trie, which stores the words {assoc, algo, all, also, tree, trie}. | ||
|
||
## How to implement a trie? | ||
|
||
Let's implement a trie in python, for storing words with their meanings from english dictionary. | ||
|
||
```python | ||
ALPHABET_SIZE = 26 # For English | ||
|
||
class TrieNode: | ||
def __init__(self): | ||
self.edges = [None]*(ALPHABET_SIZE) # Each index respective to each character. | ||
self.meaning = None # Meaning of the word. | ||
self.ends_here = False # Tells us if the word ends here. | ||
``` | ||
As you can see, edges are 26 in length, each index referring to each character in the alphabet. 'A' corresponding to 0, 'B' to 1, 'C' to 2 ... 'Z' to 25th index. If the character you are looking for is pointing to `None`, that implies the word is not there in the trie. | ||
|
||
A typical Trie should implement at least these two functions: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Wrap the function names in inline code blocks. |
||
|
||
- `add_word(word,meaning)` | ||
- `search_word(word)` | ||
- `delete_word(word)` | ||
|
||
Additionally, one can also add something like | ||
|
||
- `get_all_words()` | ||
- `get_all_words_with_prefix(prefix)` | ||
|
||
#### Adding Word to the trie | ||
|
||
```python | ||
def add_word(self,word,meaning): | ||
if len(word)==0: | ||
self.ends_here = True # Because we have reached the end of the word | ||
self.meaning = meaning # Adding the meaning to that node | ||
return | ||
ch = word[0] # First character | ||
# ASCII value of the first character (minus) the ASCII value of 'a'-> the first character of our ALPHABET gives us the index of the edge we have to look up. | ||
index = ord(ch) - ord('a') | ||
if self.edges[index] == None: | ||
# This implies that there's no prefix with this character yet. | ||
new_node = TrieNode() | ||
self.edges[index] = new_node | ||
|
||
self.edges[index].add(word[1:],meaning) #Adding the remaining word | ||
|
||
``` | ||
|
||
#### Retrieving data | ||
|
||
```python | ||
def search_word(self,word): | ||
if len(word)==0: | ||
if self.ends_here: | ||
return True | ||
else: | ||
return "Word doesn't exist in the Trie" | ||
ch = word[0] | ||
index = ord(ch)-ord('a') | ||
if self.edge[index]== None: | ||
return False | ||
else: | ||
return self.edge[index].search_word(word[1:]) | ||
|
||
``` | ||
|
||
The `search_word` function will tell us if the word exists in the Trie or not. Since ours is a dictionary, we need to fetch the meaning as well, now lets declare a function to do that. | ||
|
||
```python | ||
def get_meaning(self,word): | ||
if len(word)==0 : | ||
if self.ends_here: | ||
return self.meaning | ||
else: | ||
return "Word doesn't exist in the Trie" | ||
ch = word[0] | ||
index = ord(ch) - ord('a') | ||
if self.edges[index] == None: | ||
return "Word doesn't exist in the Trie" | ||
else: | ||
return self.edges[index].get_meaning(word[1:]) | ||
``` | ||
|
||
#### Deleting data | ||
|
||
By deleting data, you just need to change the variable `ends_here` to `False`. Doing that doesn't alter the prefixes, but stills deletes the meaning and the existence of the word from the trie. | ||
|
||
```python | ||
def delete_word(self,word): | ||
if len(word)==0: | ||
if self.ends_here: | ||
self.ends_here = False | ||
self.meaning = None | ||
return "Deleted" | ||
else: | ||
return "Word doesn't exist in the Trie" | ||
ch = word[0] | ||
index = ord(ch) - ord('a') | ||
if self.edges[index] == None: | ||
return "Word doesn't exist in the Trie" | ||
else: | ||
return self.edges[index].delete_word(word[1:]) | ||
``` | ||
|
||
:rocket: [Run Code](https://repl.it/CWbr) | ||
|
||
## Resources | ||
|
||
- For further reading, you can try this [topcoder](https://www.topcoder.com/community/data-science/data-science-tutorials/using-tries/) tutorial. | ||
- Also, a tutorial from [geeksforgeeks](http://www.geeksforgeeks.org/trie-insert-and-search/) |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reference hash table : the existing article
Hash-Tables-and-Hashing Functions.md
For referencing, just add
[hash table](the filename)
, no need to give exact paths, it will be picked up relatively.