This repository contains a data pipeline that extracts, transforms, and loads data from an AWS S3 bucket into an AWS Redshift table using AWS Glue. The raw data is made available in an S3 bucket, and the pipeline uses AWS Glue to extract it from there.
Analyzed a multicategory e-commerce store using big data techniques on a Kaggle dataset with the help of AWS EC2, AWS S3, PySpark, AWS Glue ETL, AWS Athena, AWS CloudFormation, AWS Lambda and Power BI!
A comprehensive set of tutorials, steps, and scripts for setting up Apache Kafka on an Amazon EC2 instance, streaming logs to S3, and querying data with AWS Glue and Amazon Athena. Includes ZooKeeper configuration, producer and consumer setup, and automated data catalog creation.
Developed an ETL pipeline for real-time ingestion of stock market data from the stock-market-data-manage.onrender.com API. Engineered the system to store data in Parquet format for optimized query processing and incorporated data quality checks to ensure accuracy prior to visualization.
Collects song, album, and artist details from Spotify at specific intervals using the spotipy API and performs ETL operations using Amazon cloud services.
In this project, you will build an end-to-end data engineering project on real-time stock market data using Kafka. It uses technologies such as Python, Amazon Web Services (AWS), Apache Kafka, Glue, Athena, and SQL.
The project aims to establish a robust data pipeline for tracking and analyzing sales performance using various AWS services. The process involves creating a DynamoDB database, implementing Change Data Capture (CDC), using Kinesis streams, and finally storing and querying the data in Amazon Athena.
Implemented ETL pipeline on AWS for Playstore data using Lambda, Glue Crawlers, and Glue ETL Jobs. Orchestrated workflow with Step Functions and achieved seamless integration, optimal data merging, and enhanced data quality/accessibility.
An end-to-end solution for managing and analyzing YouTube video data from Kaggle, built on AWS services and visualized with QuickSight and Tableau.