
A Predictive Analysis of H1B Dataset using both supervised and unsupervised learning


yingzima/H1B-Predictive-Analysis


A Predictive Analysis of H1B Data - Identifying Prospective Customers for Visa Application Agencies

In 2017, more than 600,000 applicants applied for work authorization in the United States. Many of these individuals choose to use an agent to help with the application process. Many previous projects have examined the outcomes of these applications; I was specifically curious about how visa application agencies can identify prospective customers from the available data.

This analysis uses a comprehensive dataset containing hundreds of thousands of observations across a number of categories related to these work authorization applications. It builds predictive models that attempt to capture the relationship between the recorded variables and whether an applicant or the company chooses to hire an agent. Using logistic regression, decision trees, and clustering analysis, I extracted actionable business insights and recommendations for visa application agencies.
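To make the modeling idea concrete, here is a minimal sketch of the logistic regression step: predicting whether an application uses an agent. It is not the code used in this project. The data is synthetic, the two features (`wage` and `large_employer`) and the coefficients that generate the labels are hypothetical, and the model is fit with plain gradient descent using only the standard library. A real analysis would more likely use a library such as scikit-learn on the actual H1B columns.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_data(n, seed=0):
    # Synthetic stand-in for the H1B data; features and effects are made up.
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        wage = rng.gauss(0.0, 1.0)                # hypothetical standardized wage
        large_employer = rng.choice([0.0, 1.0])   # hypothetical binary flag
        p_agent = sigmoid(1.5 * wage + 2.0 * large_employer - 1.0)
        rows.append([wage, large_employer])
        labels.append(1 if rng.random() < p_agent else 0)
    return rows, labels


def predict_proba(weights, bias, x):
    # Estimated probability that the application uses an agent.
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fit_logistic(rows, labels, lr=0.5, epochs=300):
    # Batch gradient descent on the mean log-loss.
    n, k = len(rows), len(rows[0])
    weights, bias = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(weights, bias, x) - y
            for j in range(k):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


rows, labels = make_synthetic_data(500)
weights, bias = fit_logistic(rows, labels)
accuracy = sum(
    (predict_proba(weights, bias, x) >= 0.5) == bool(y)
    for x, y in zip(rows, labels)
) / len(rows)
print("weights:", weights, "bias:", bias, "training accuracy:", accuracy)
```

The fitted weights are the part an agency would read: a positive weight on a feature means applications with higher values of that feature are estimated to be more likely to use an agent, which is one way to rank prospective customers.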