This project was built for the 2024 Carolina Data Challenge, whose theme was travel.
This repository contains a Python implementation of a decision tree classifier that predicts which of three groups ('Low', 'Medium', or 'High') the 'large_ms' variable falls into. The model is built with scikit-learn, and its performance is evaluated with cross-validation.
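The workflow described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code. It makes several assumptions: the three groups are formed by equal-frequency (tercile) binning with `pd.qcut`, the data is synthetic, the feature columns `nsmiles`, `passengers`, and `fare` are placeholders, and `large_ms` is treated as a fraction between 0 and 1. The tree depth and the number of folds are likewise arbitrary choices.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the flight data (assumption: the real project
# would load the Kaggle CSV instead).
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "nsmiles": rng.uniform(100, 2500, n),    # hypothetical feature
    "passengers": rng.uniform(10, 5000, n),  # hypothetical feature
    "fare": rng.uniform(80, 600, n),         # hypothetical feature
})
# Made-up relationship so the tree has some signal to learn.
df["large_ms"] = 0.8 - 0.0002 * df["nsmiles"] + rng.normal(0, 0.05, n)

# Assumption: terciles define the 'Low', 'Medium', and 'High' groups.
df["large_ms_group"] = pd.qcut(
    df["large_ms"], q=3, labels=["Low", "Medium", "High"]
).astype(str)

X = df[["nsmiles", "passengers", "fare"]]
y = df["large_ms_group"]

# A shallow tree evaluated with 5-fold cross-validation.
clf = DecisionTreeClassifier(max_depth=4, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)

print("Fold accuracies:", np.round(scores, 3))
print("Mean accuracy:", round(scores.mean(), 3))
```

Equal-frequency binning keeps the three classes balanced, which makes plain accuracy easier to interpret. If the groups were instead defined by fixed thresholds, `pd.cut` with explicit bin edges would be the analogous call.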
The data for this project comes from Kaggle: https://www.kaggle.com/datasets/bhavikjikadara/us-airline-flight-routes-and-fares-1993-2024.
This dataset provides detailed information on airline flight routes, fares, and passenger volumes within the United States from 1993 to 2024. The data includes metrics such as the origin and destination cities, distances between airports, the number of passengers, and fare information segmented by different airline carriers. It serves as a comprehensive resource for analyzing trends in air travel, pricing, and carrier competition over a span of three decades.