mrjob

Big Data analysis project using MapReduce in Python to process movie ratings. Includes scripts for aggregating ratings and identifying the most rated movies, demonstrating data analysis on a large scale.

big-data data-preprocessing mrjob mapreduce-python

Updated Jan 3, 2024
Python

mrjuice01 / SharpGenTools

Accurate and high performance C++ interop code generator for C#.

css csharp mrjob

Updated Nov 21, 2023
C

B3EF / mrjob

Run MapReduce jobs on Hadoop or Amazon Web Services

security mrjob 418sec huntr

Updated Jun 7, 2024
Python

shinde-chandrakant / BigData-Ops-on-TLC-Yellow-Taxi

Analysed New York City's Yellow taxi data set with Big Data tools such as Hadoop, HBase, Sqoop, MapReduce and AWS Cloud Infrastructure.

aws hadoop aws-s3 bigdata hbase aws-emr mapreduce aws-rds data-modeling sqoop mrjob big-data-analytics

Updated Sep 18, 2023
Python

heracliteanflux / exercises-scala

Exercises in the Scala programming language with an emphasis on big data programming and applications in Apache Hadoop and Apache Spark.

distributed-systems scala spark apache-spark hadoop distributed-computing map-reduce distributed-file-system mrjob apache-maven apache-hadoop

Updated Aug 22, 2023
Java

thedatasociety / lab-hadoop

hive hadoop hbase flume sqoop hadoop-mapreduce hadoop-streaming mrjob hadoop-hdfs hadoop-yarn

Updated Jan 19, 2024
PLpgSQL

esakik / data-engineering-essentials

Samples related to data engineering, e.g. spark, embulk, airflow, etc.

apache-spark protocol-buffers amazon-emr data-engineering digdag fluentd apache-beam embulk apache-avro mrjob apache-airflow cloud-dataflow apache-hadoop cloud-dataproc

Updated Dec 8, 2022
Python

Tarasa24 / PWA-Store

The largest collection of publicly accessible Progressive Web Apps*

emr golang crawler pwa linode postgresql mrjob commoncrawl puppeteer

Updated Sep 2, 2022
HTML

MFairbro1 / Amazon_Vine_Analysis

Analyzing Amazon product reviews

python sql big-data spark etl postgresql databases pyspark nltk amazon-web-services mrjob jupyer-notebook etl-pipeline google-colab

Updated Aug 27, 2022
Jupyter Notebook

e-panourgia / Big-Data

Big Data Management Systems course assignments

python redis json data latex stream hadoop analytics neo4j azure bigdata mrjob

Updated Jun 9, 2022
JavaScript

aogunwoolu / Ethereum-analysis

Ethereum (ETH) analysis using big data tools for the QMUL Big Data Processing module. Intended to promote analysis of data retrieved via big data processing.

python big-data hadoop ethereum hadoop-cluster hadoop-filesystem hadoop-mapreduce mrjob big-data-analytics hadoop-hdfs mrjob-dataproc

Updated Dec 25, 2021
Jupyter Notebook

MHassaanButt / Flight-Delays-Prediction

In this project, I used a decision tree learning model as the main algorithm. Because of the large volume of flight data, I implemented the project using MRJob, PySpark, and Spark's MLlib, then compared the performance and accuracy of those implementations.

hadoop pyspark decision-tree mrjob spark-mllib

Updated Dec 21, 2021
Jupyter Notebook

helioRocha / dio-ccde-p03

Creating your Big Data ecosystem in the cloud

aws-s3 python3 aws-emr aws-ec2 aws-iam bootcamp-project mrjob wsl-ubuntu

Updated Nov 16, 2021
Python

anxxos / dockerizing-tweet-analysis-app-mongodb

In this exercise, a Python application that downloads tweets and analyzes them by sentiment score is packaged and distributed. The analysis results are stored in a MongoDB database, and the information is displayed on the web.

docker mongodb mrjob tweet-analysis

Updated Nov 10, 2021
Python

mrjob

Here are 55 public repositories matching this topic...

katewang1 / Amazon_Vine_Analysis

jehiah / gomrjob

groda / big_data

sanfx / mapreduce_paradigm_distributed_computing

burhanahmed1 / Big-Data-Analytics

Mariona-FT / Information-Retrieval-REIN

AmitabhCh822 / BigData-MapReduce-MovieRatings-Analysis

mrjuice01 / SharpGenTools

B3EF / mrjob

shinde-chandrakant / BigData-Ops-on-TLC-Yellow-Taxi

heracliteanflux / exercises-scala

thedatasociety / lab-hadoop

esakik / data-engineering-essentials

Tarasa24 / PWA-Store

MFairbro1 / Amazon_Vine_Analysis

e-panourgia / Big-Data

aogunwoolu / Ethereum-analysis

MHassaanButt / Flight-Delays-Prediction

helioRocha / dio-ccde-p03

anxxos / dockerizing-tweet-analysis-app-mongodb
