Below you will find a collection of projects divided into four main categories:
- Data Analysis and Visualisation
- Machine Learning
- Probability and Statistics
- SQL
-
-
- Objective: To understand if workers have resigned due to some kind of dissatisfaction.
- Tech Stack: Python, Pandas, NumPy, Matplotlib, Jupyter Notebook
- Solution: Performed Exploratory Data Analysis, Data Cleaning and Data Wrangling techniques.
- Key Achievement: Identified that employees with 7 or more years of service are more likely to resign due to some kind of dissatisfaction with the job.
-
- Objective: To clean the data and analyse the included used car listings.
- Tech Stack: Python, Pandas, Numpy, Jupyter Notebook
- Solution: Performed Exploratory Data Analysis, Data Cleaning and Data Wrangling techniques.
- Key Achievement: Found that German manufacturers represent more than 60% of the overall listings. Volkswagen is by far the most popular brand.
-
- Objective: To determine which type of post and time receive the most comments on average.
- Tech Stack: Python, Jupyter Notebook
- Solution: Performed Exploratory Data Analysis, Data Cleaning and Data Wrangling techniques.
- Key Achievement: Identified the post to be categorised as Ask HN post and created between 20:00 - 21:00 GMT.
-
- Objective: To determine a few indicators of heavy traffic on I-94 Interstate highway.
- Tech Stack: Python, Pandas, Matplotlib, Jupyter Notebook
- Solution: Performed Exploratory Data Analysis, Data Cleaning/Wrangling and Data Visualisation techniques.
- Key Achievement: Identified two types of heavy traffic indicators: time and weather.
-
- Objective: To determine what content a data science education company should create.
- Tech Stack: Python, Pandas, Matplotlib, Seaborn, SQL, Jupyter Notebook
- Solution: Performed Exploratory Data Analysis, Data Cleaning/Wrangling and Data Visualisation techniques.
- Key Achievement: Identified Deep Learning as the content to be created.
-
- Objective: To find free mobile apps that are profitable for the App Store and Google Play markets.
- Tech Stack: Python, Jupyter Notebook
- Solution: Performed Exploratory Data Analysis, Data Cleaning and Data Wrangling techniques.
- Key Achievement: Found that turning a popular book into an app could be profitable for both markets. However, special features should be added to the app.
-
- Objective: To clean and explore the dataset in order to answer some questions about Star Wars fans.
- Tech Stack: Python, Pandas, NumPy, Matplotlib, Jupyter Notebook
- Solution: Performed Exploratory Data Analysis, Data Cleaning/Wrangling and Data Visualisation techniques.
- Key Achievement: Found that "The Empire Strikes Back" is the most seen and liked Star Wars movie by the respondents.
-
-
-
- Objective: To explore the effectiveness of deep, feedforward neural networks in image classification.
- Tech Stack: Python, Pandas, NumPy, Scikit-Learn, Matplotlib, Jupyter Notebook
- Solution: Applied Neural Networks with different hidden layers and numbers of neurons.
- Key Achievement: Found that using more hidden layers increase the amount of overfitting.
-
- Objective: To create different machine learning models and evaluate their performances.
- Tech Stack: Python, Pandas, NumPy, Scikit-Learn, Matplotlib, Jupyter Notebook
- Solution: Applied Linear Regression, Decision Tree and Random Forest algorithms using their default settings and RMSE as the error metric.
- Key Achievement: Obtained the best accuracy when using the Random Forest model.
-
- Objective: To predict a car's market price using the k-nearest neighbors algorithm.
- Tech Stack: Python, Pandas, NumPy, Scikit-Learn, Matplotlib, Jupyter Notebook
- Solution: Created Univariate and Multivariate models using different values of the hyperparameter k.
- Key Achievement: Obtained the best result with a Multivariate model using two best features and k=2.
-
- Objective: To predict house sale prices using Linear Regression algorithm.
- Tech Stack: Python, Pandas, NumPy, Scikit-Learn, Jupyter Notebook
- Solution: Performed Feature Engineering and Feature Selection techniques and evaluated the model using k-fold cross-validation.
- Key Achievement: Obtained the best result using k-fold cross-validation with k=4.
-
-
-
- Objective: To create a spam filter that classifies new SMS messages as spam or non-spam.
- Tech Stack: Python, Pandas, Jupyter Notebook
- Solution: Used Conditional Probability concepts to build a function that works like the Multinomial Naive Bayes algorithm.
- Key Achievement: Managed to build a spam filter for SMS messages with an accuracy of 98.74% on the test set.
-
- Objective: To find out the best markets to advertise programming courses.
- Tech Stack: Python, Pandas, Matplotlib, Jupyter Notebook
- Solution: Performed Exploratory Data Analysis, Data Cleaning/Wrangling and Data Visualisation techniques.
- Key Achievement: Found the US as the best market to advertise in.
-
- Objective: To find any difference between Fandango's ratings for popular movies in 2015 and 2016.
- Tech Stack: Python, Pandas, NumPy, Matplotlib, Jupyter Notebook
- Solution: Performed Descriptive Statistical Analysis and Data Visualisation techniques.
- Key Achievement: On average, popular movies released in 2016 were rated lower on Fandango than popular movies released in 2015.
-
- Objective:To practise applying probability and combinatorics concepts in a setting that simulates a real-world scenario.
- Tech Stack: Python, Pandas, Jupyter Notebook
- Solution: Built functions that calculate Factorials, Combinations and Probabilities.
- Key Achievement: Managed to write functions that calculate the probabilities of winning any prize with one or more tickets besides checking historical lottery data.
-
-
-
- Objective: To query the database to fetch demographic information about countries around the world.
- Tech Stack: SQL, Jupyter Notebook
- Solution: Built SQL queries to compute Summary Statistics.
- Key Achievement: Created queries and subqueries to answer demographic questions.
-
- Objective: To query the database and extract information for decision making.
- Tech Stack: SQL, Jupyter Notebook
- Solution: Built SQL queries to answer Business Questions.
- Key Achievement: Found that the 'Rock' genre accounts for 53% of sales by itself.
-
- Objective: To analyse data from sales records and extract information for decision making.
- Tech Stack: SQL, Jupyter Notebook
- Solution: Built SQL queries to answer Business Questions.
- Key Achievement: Found the priority products for restocking and how much to spend on acquiring new customers.
-
Please feel free to contact me if you have any questions or comments.