GitHub - josephhany/TeraSort-Project: A simulated Hadoop environment in C++ for sorting massive data. Uses MapReduce and threading techniques to increase efficiency. The output was benchmarked with TeraGen, and the code was debugged with GDB on Linux Mint.

Repository contents (2 commits):

	Screenshots
	Untitled Folder
	bin
	headers
	objects
	sources
	makefile
	makefile.vars
	readme
	report.pdf


To compile the program, follow these steps:

	1) Open the program folder.

	2) Right-click on an empty area in the folder and choose "Open in Terminal".

	(Alternatively, place the folder in your "Home" directory, open a terminal, and run "cd 900182870_Joseph_Lab5_Exercise_1/second_attempt".)

	3) Type "make" in the terminal and press the "Enter" key.

To test and run the program, follow these steps:

	1) Open the folder called "bin" inside the main program folder.

	2) Right-click on an empty area in the folder and choose "Open in Terminal".

	(Alternatively, place the folder in your "Home" directory, open a terminal, and run "cd 900182870_Joseph_Lab5_Exercise_1/second_attempt/bin".)

	3) Type "./FINAL shuffled_students.txt" in the terminal window and press the "Enter" key.

Example invocations with named options:

	./FINAL --input-file input.binary --output-file output.binary --mappers 3 --reducers 3 --sample-size 5
	./FINAL --input-file unsorted-1gb --output-file output.binary --mappers 30 --reducers 10 --sample-size 1000
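
The --reducers and --sample-size options in the commands above suggest a sample-based range partitioning step, which is how TeraSort-style sorts usually split work. The following is a minimal C++ sketch of that general idea, not code taken from this repository. The function names (pick_splitters, reducer_for, sample_sort) are hypothetical, the way their parameters map to the command-line options is an assumption, and the sample is taken at a fixed stride where a real implementation would likely sample at random.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: pick up to (reducers - 1) split points from a sample of the keys.
// Assumption: the sample is taken at a fixed stride instead of at random.
std::vector<std::string> pick_splitters(const std::vector<std::string>& keys,
                                        std::size_t sample_size,
                                        std::size_t reducers) {
    assert(reducers > 0 && sample_size > 0);
    if (keys.empty()) return {};

    std::vector<std::string> sample;
    const std::size_t step = std::max<std::size_t>(1, keys.size() / sample_size);
    for (std::size_t i = 0; i < keys.size() && sample.size() < sample_size; i += step)
        sample.push_back(keys[i]);
    std::sort(sample.begin(), sample.end());

    // Evenly spaced sample elements become the boundaries between reducers.
    std::vector<std::string> splitters;
    for (std::size_t r = 1; r < reducers; ++r)
        splitters.push_back(sample[r * sample.size() / reducers]);
    return splitters;
}

// Hypothetical helper: index of the reducer whose key range contains `key`.
std::size_t reducer_for(const std::string& key,
                        const std::vector<std::string>& splitters) {
    return static_cast<std::size_t>(
        std::upper_bound(splitters.begin(), splitters.end(), key) - splitters.begin());
}

// Partition the keys by range, sort each partition, and concatenate the results.
std::vector<std::string> sample_sort(const std::vector<std::string>& keys,
                                     std::size_t reducers,
                                     std::size_t sample_size) {
    const std::vector<std::string> splitters = pick_splitters(keys, sample_size, reducers);

    // "Map" step: route every key to the bucket of its reducer.
    std::vector<std::vector<std::string>> buckets(reducers);
    for (const std::string& k : keys)
        buckets[reducer_for(k, splitters)].push_back(k);

    // "Reduce" step: buckets cover disjoint key ranges, so each is sorted independently.
    for (std::vector<std::string>& b : buckets)
        std::sort(b.begin(), b.end());

    // Bucket i only holds keys that are <= every key in bucket i + 1,
    // so concatenating the buckets in order gives the fully sorted output.
    std::vector<std::string> out;
    out.reserve(keys.size());
    for (const std::vector<std::string>& b : buckets)
        out.insert(out.end(), b.begin(), b.end());
    return out;
}
```

Calling sample_sort(keys, 3, 5) would correspond roughly to "--reducers 3 --sample-size 5". Because the buckets do not overlap, each one could be handed to a separate worker thread. A larger sample tends to give more evenly sized buckets, at the cost of a longer sampling step.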