A simulated Hadoop environment in C++ for sorting massive data sets. It uses MapReduce and multithreading to increase efficiency. Benchmarked with TeraGen and debugged with GDB on Linux Mint.
josephhany/TeraSort-Project
To compile the program:

1) Open the program folder.
2) Right-click an empty area in the folder and choose "Open in Terminal" (or place the folder in your "Home" directory, open a terminal, and run "cd 900182870_Joseph_Lab5_Exercise_1/second_attempt").
3) Type "make" in the terminal and press Enter.

To test and run the program:

1) Open the "bin" folder inside the main program folder.
2) Right-click an empty area in the folder and choose "Open in Terminal" (or place the folder in your "Home" directory, open a terminal, and run "cd 900182870_Joseph_Lab5_Exercise_1/second_attempt/bin").
3) Type "./FINAL shuffled_students.txt" in the terminal and press Enter.

Example invocations with explicit options:

./FINAL --input-file input.binary --output-file output.binary --mappers 3 --reducers 3 --sample-size 5
./FINAL --input-file unsorted-1gb --output-file output.binary --mappers 30 --reducers 10 --sample-size 1000
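
The --mappers, --reducers, and --sample-size options in the commands above suggest a TeraSort-style sample sort: sample some keys, derive one key range per reducer, route each record to its range, and sort the ranges in parallel. The sketch below illustrates that general idea only. It is an assumption about the approach, not the repository's actual code: the function name `sample_sort` is hypothetical, it works on in-memory strings rather than binary input files, and it runs one thread per reducer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper (not from the repository): sorts `keys` by splitting
// them into `reducers` key ranges estimated from `sample_size` sampled keys.
std::vector<std::string> sample_sort(const std::vector<std::string>& keys,
                                     std::size_t reducers,
                                     std::size_t sample_size) {
    assert(reducers > 0);
    if (keys.empty()) return {};

    // 1) Sample: take evenly spaced keys and sort the sample.
    sample_size = std::max<std::size_t>(1, std::min(sample_size, keys.size()));
    std::vector<std::string> sample;
    for (std::size_t i = 0; i < sample_size; ++i)
        sample.push_back(keys[i * keys.size() / sample_size]);
    std::sort(sample.begin(), sample.end());

    // 2) Choose reducers-1 splitters that divide the sample evenly.
    std::vector<std::string> splitters;
    for (std::size_t r = 1; r < reducers; ++r)
        splitters.push_back(sample[r * sample.size() / reducers]);

    // 3) "Map" step: route each key to the bucket whose range contains it.
    std::vector<std::vector<std::string>> buckets(reducers);
    for (const auto& k : keys) {
        std::size_t b =
            std::upper_bound(splitters.begin(), splitters.end(), k) -
            splitters.begin();
        buckets[b].push_back(k);
    }

    // 4) "Reduce" step: sort every bucket in its own thread.
    std::vector<std::thread> workers;
    for (auto& bucket : buckets)
        workers.emplace_back(
            [&bucket] { std::sort(bucket.begin(), bucket.end()); });
    for (auto& w : workers) w.join();

    // 5) Buckets cover increasing key ranges, so concatenating them in
    //    order yields the fully sorted output.
    std::vector<std::string> out;
    out.reserve(keys.size());
    for (const auto& bucket : buckets)
        out.insert(out.end(), bucket.begin(), bucket.end());
    return out;
}
```

A larger sample generally gives more evenly sized buckets, which is one plausible reason the 1 GB example uses a sample size of 1000 while the small example uses 5. On Linux this sketch may need to be built with the -pthread compiler flag.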