Big-Data-Technologies-Implementations

This repo uses the GroupLens data to solve simple problems with different applications in the Hadoop ecosystem.

The big data tests are based on the open ml-100k dataset. Only 100K records are used because everything runs on an individual machine, which keeps computation and validation simple.
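
As a rough illustration of the kind of simple problem this dataset lends itself to, here is a minimal plain-Python sketch that counts how often each rating value occurs. It assumes the standard ml-100k `u.data` layout (tab-separated user id, movie id, rating, timestamp). The function name `rating_histogram` and the sample rows are illustrative only and are not taken from this repo's MapReduce, Pig, or Spark code.

```python
from collections import Counter


def rating_histogram(lines):
    """Count how many times each rating value appears.

    Each line is assumed to follow the ml-100k u.data layout:
    user id, movie id, rating, timestamp, separated by tabs.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        _user_id, _movie_id, rating, _timestamp = fields
        counts[int(rating)] += 1
    return counts


# Illustrative rows in the u.data layout (not necessarily real records).
sample = [
    "196\t242\t3\t881250949",
    "186\t302\t3\t891717742",
    "22\t377\t1\t878887116",
]

for rating, count in sorted(rating_histogram(sample).items()):
    print(rating, count)
```

The same function accepts an open file handle, so it could be pointed at a local copy of `u.data`. Having a single-machine result like this is one way to validate the output of the Hadoop jobs.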

The big data setup was done using the Hortonworks Sandbox, set up on an Oracle VM.

Folders and files:
MapReduce Jobs
PigScripts
SparkUsingPython
README.md
