The Instahyre Job Analytics project is a data-driven endeavor that involves web scraping job posting data from Instahyre, followed by thorough data preprocessing and clustering analysis. The project's primary goal is to empower users with insights into the job market.
This project culminated in the development of an interactive web application, offering users the opportunity to explore job market insights, trends, and valuable information. Whether you are a job seeker or a recruiter, this project is designed to provide you with a comprehensive and user-friendly tool for making informed decisions in the job market.
- Data Extraction: The project uses Selenium, a web automation tool, to extract data from the Instahyre website. This data includes job listings, company information, and other relevant details.
- Data Analysis: To make sense of the extracted data, we implement a K-means clustering model. This model groups job listings and companies into clusters based on common characteristics, allowing users to identify trends and patterns.
- Webpage Creation: We've created an interactive webpage using HTML and CSS that displays the analyzed data in a user-friendly and visually appealing manner.
K-Means is a machine learning technique used for data clustering. It groups similar data points into clusters based on their characteristics. In our project, we used K-Means clustering to discover patterns and relationships within job listings and companies, making it easier to identify trends and insights in the data.
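As a minimal sketch of this step with Scikit-learn, the example below clusters a handful of made-up job postings. The two features (years of experience and salary) and all values are assumptions for illustration only; they are not the project's actual feature set or data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical features per job posting: [years_experience, salary_lpa].
# These rows are invented for illustration, not scraped data.
X = np.array([
    [0.5, 4.0], [1.0, 5.0], [1.5, 4.5],        # junior-looking postings
    [5.0, 18.0], [6.0, 20.0], [5.5, 19.0],     # mid-level postings
    [12.0, 45.0], [13.0, 48.0], [12.5, 46.0],  # senior postings
])

# Scale features so neither column dominates the Euclidean distance.
X_scaled = StandardScaler().fit_transform(X)

# n_clusters=3 is an assumption here; random_state makes the run repeatable.
model = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = model.fit_predict(X_scaled)

for row, label in zip(X, labels):
    print(f"experience={row[0]:>4} salary={row[1]:>4} -> cluster {label}")
```

The cluster numbers themselves are arbitrary; what matters is which postings end up sharing a label.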
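A Silhouette Score is a metric used to evaluate the quality of clusters produced by K-Means or other clustering algorithms. It ranges from -1 to 1, with values near 1 indicating that points lie close to their own cluster and far from neighboring clusters. Our K-Means model achieved a Silhouette Score of 0.97998, which indicates strong, well-separated clusters.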
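To show what the score measures, here is a small pure-Python sketch of the standard definition: for each point, a is the mean distance to the other points in its own cluster, b is the lowest mean distance to any other cluster, and the point's score is (b - a) / max(a, b). The function name `mean_silhouette` and the sample points are hypothetical; in practice Scikit-learn's `sklearn.metrics.silhouette_score` computes the same quantity.

```python
import math

def mean_silhouette(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")

    def mean_dist(i, members):
        # Mean distance from point i to the listed members, excluding i itself.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for single-point clusters
            continue
        a = mean_dist(i, own)
        b = min(mean_dist(i, members)
                for other, members in clusters.items() if other != label)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Made-up points forming two tight, distant groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = mean_silhouette(points, [0, 0, 1, 1])  # groups match the geometry
bad = mean_silhouette(points, [0, 1, 0, 1])   # groups cut across the geometry
print(f"good labeling: {good:.3f}, bad labeling: {bad:.3f}")
```

A labeling that matches the geometry scores close to 1, while one that mixes distant points into the same cluster scores below 0.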
The elbow method, often used with K-Means, is a technique for determining the optimal number of clusters (k) for a dataset. It involves running K-Means for a range of k values and plotting the within-cluster sum of squares (inertia) against k. The "elbow point" on that plot is where increasing the number of clusters no longer significantly reduces the within-cluster variance.
By applying the elbow method, we aim to find the most appropriate number of clusters for our data, ensuring that our K-Means analysis is well-structured and provides meaningful insights.
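The procedure can be sketched as follows with Scikit-learn. The data is synthetic (three artificial blobs generated with NumPy), and the range of k values is an assumption; the project's real features and chosen k are not shown here.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in data: three well-separated blobs of 30 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 9.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Fit K-Means for each candidate k and record the inertia
# (within-cluster sum of squared distances).
k_values = list(range(1, 7))
inertias = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)

# The drop in inertia from k to k+1; the elbow is where the drops flatten out.
drops = [inertias[i] - inertias[i + 1] for i in range(len(inertias) - 1)]

for k, inertia in zip(k_values, inertias):
    print(f"k={k}: inertia={inertia:.1f}")
```

On data like this, the inertia falls sharply up to k=3 and only marginally afterwards, which is the "elbow". On real data the bend is usually less clear, so it is commonly read from a plot and cross-checked against a metric such as the Silhouette Score.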
We designed a user-friendly and visually appealing webpage to showcase the results of the analysis. The webpage was created using HTML and CSS, combining elegant design with informative content.
During the development of this project, we encountered several challenges, which included:
- Webpage Design: Designing a webpage with HTML and CSS presented challenges in making it both visually appealing and functional.
- User Input Handling: Managing user input for text processing and learning was complex, requiring effective validation and processing mechanisms.
- Backend Development: Developing the backend with Flask was a bit challenging, particularly when setting up data communication between the frontend and backend components.
- Web Scraping: Web scraping using the Selenium library presented its own set of challenges, from handling dynamic web pages to efficient data extraction.
These challenges were crucial in our learning process and in making the project robust and user-friendly.
In conclusion, the Instahyre Job Analytics project successfully extracts, analyzes, and presents job market data. It uses Selenium for web scraping and Scikit-learn for K-Means clustering, and delivers an interactive webpage where users can explore the resulting insights. Despite challenges in design, user input handling, backend development, and web scraping, the project stands as a robust, user-friendly tool that provides valuable job market information for both job seekers and recruiters.