The capture module records network packets in real time and saves them to a PCAP file. It should be able to capture packets from multiple network interfaces simultaneously.
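As a rough illustration of capturing on several interfaces at once, the sketch below starts one scapy `AsyncSniffer` per interface and writes each interface to its own PCAP file. The function names `pcap_path_for` and `capture`, the `captures` output directory, and the one-file-per-interface layout are assumptions made for this example and are not taken from the repository. Capturing normally requires administrator or root privileges, plus Npcap on Windows or libpcap on Unix-like systems.

```python
import re
import time
from pathlib import Path
from typing import Dict, List


def pcap_path_for(iface: str, out_dir: str = "captures") -> Path:
    """Map an interface name to a filesystem-safe PCAP path (hypothetical helper)."""
    # Npcap interface names can contain backslashes and braces, so replace
    # anything outside a conservative character set with an underscore.
    safe = re.sub(r"[^A-Za-z0-9_.-]+", "_", iface).strip("_")
    return Path(out_dir) / f"{safe}.pcap"


def capture(ifaces: List[str], seconds: float, out_dir: str = "captures") -> Dict[str, Path]:
    """Capture on every interface in `ifaces` for `seconds`, one PCAP file each."""
    # Imported here so the helper above stays usable without scapy installed.
    from scapy.all import AsyncSniffer
    from scapy.utils import PcapWriter

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    paths = {iface: pcap_path_for(iface, out_dir) for iface in ifaces}
    # sync=True flushes after each packet so a crash loses as little as possible.
    writers = {iface: PcapWriter(str(path), sync=True) for iface, path in paths.items()}
    # One background sniffer per interface; store=False avoids keeping
    # every packet in memory, since each one is written straight to disk.
    sniffers = [
        AsyncSniffer(iface=iface, prn=writers[iface].write, store=False)
        for iface in ifaces
    ]
    try:
        for sniffer in sniffers:
            sniffer.start()
        time.sleep(seconds)
    finally:
        for sniffer in sniffers:
            if sniffer.running:
                sniffer.stop()
        for writer in writers.values():
            writer.close()
    return paths
```

A call such as `capture(["eth0", "wlan0"], seconds=30)` would be one way to use it; the interface names here are placeholders. Separate files per interface keep the link-layer type of each file consistent, which a single shared PCAP file cannot guarantee.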
The parser module reads PCAP files and parses the network packets into a format that our machine learning model can use.
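To make the parsing step concrete, here is a minimal sketch that reads a classic PCAP file with only the standard library and turns each packet into a small feature dictionary. It is not the repository's parser: scapy's `rdpcap` would be the more usual route, and the function name `iter_pcap_features` and the chosen feature names are assumptions for illustration. The sketch handles only microsecond-resolution classic PCAP (not pcapng) and only looks inside Ethernet frames carrying IPv4.

```python
import struct
from typing import BinaryIO, Dict, Iterator

ETHERTYPE_IPV4 = 0x0800
LINKTYPE_ETHERNET = 1


def iter_pcap_features(stream: BinaryIO) -> Iterator[Dict[str, float]]:
    """Yield one feature dict per packet record in a classic PCAP stream."""
    header = stream.read(24)  # the global header is always 24 bytes
    if len(header) != 24:
        raise ValueError("truncated pcap global header")
    # The magic number tells us the byte order used by the writer.
    magic = header[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic microsecond pcap file")
    linktype = struct.unpack(endian + "I", header[20:24])[0]

    while True:
        record = stream.read(16)  # per-packet record header
        if len(record) < 16:
            return
        ts_sec, ts_usec, incl_len, orig_len = struct.unpack(endian + "IIII", record)
        frame = stream.read(incl_len)
        if len(frame) < incl_len:
            return  # file ends in the middle of a packet
        ip_proto = -1  # -1 means "not an IPv4-over-Ethernet packet"
        if (
            linktype == LINKTYPE_ETHERNET
            and len(frame) >= 34  # 14-byte Ethernet + 20-byte IPv4 header
            and struct.unpack("!H", frame[12:14])[0] == ETHERTYPE_IPV4
        ):
            ip_proto = frame[23]  # protocol field: byte 9 of the IPv4 header
        yield {
            "timestamp": ts_sec + ts_usec / 1e6,
            "captured_len": incl_len,
            "wire_len": orig_len,
            "ip_proto": ip_proto,
        }
```

The resulting dictionaries can be passed to `pandas.DataFrame(list(...))` to obtain a table of per-packet features.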
This repository contains an analysis of the network intrusion data set using a random forest classifier. The data set has been preprocessed and analyzed using NumPy, Pandas, and Plotly to understand the underlying patterns and relationships in the data. The random forest classifier has been implemented using Scikit-learn to predict network intrusion based on the available features.
The repository includes Jupyter notebooks that showcase the data preprocessing steps, exploratory data analysis, and the machine learning model implementation. The notebooks are documented with markdown cells to explain the thought process and reasoning behind each step.
The data preprocessing steps include handling missing values, dealing with categorical variables, and scaling the features. Exploratory data analysis is done using Plotly to create interactive visualizations that help to understand the structure of the data. The machine learning model is trained on the preprocessed data and evaluated using various performance metrics.
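The preprocessing and training steps described above could be sketched with a Scikit-learn pipeline along the following lines. The column names (`duration`, `src_bytes`, `protocol`, `label`) and the synthetic data produced by `make_toy_frame` are invented for this example and do not come from the repository's data set; the imputation strategies and hyperparameters are likewise assumptions rather than the notebooks' actual choices.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def make_toy_frame(n: int = 200, seed: int = 0) -> pd.DataFrame:
    """Build a small synthetic table (made-up data, for illustration only)."""
    rng = np.random.default_rng(seed)
    label = rng.integers(0, 2, size=n)
    df = pd.DataFrame({
        "duration": rng.exponential(1.0, size=n) + label * 2.0,
        "src_bytes": rng.normal(500, 100, size=n) + label * 300,
        "protocol": rng.choice(["tcp", "udp", "icmp"], size=n),
        "label": label,
    })
    df.loc[::25, "duration"] = np.nan  # inject some missing values
    return df


def build_pipeline(numeric, categorical) -> Pipeline:
    """Impute, scale, and encode the features, then fit a random forest."""
    numeric_steps = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical_steps = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    preprocess = ColumnTransformer([
        ("num", numeric_steps, numeric),
        ("cat", categorical_steps, categorical),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("model", RandomForestClassifier(n_estimators=100, random_state=0)),
    ])


if __name__ == "__main__":
    df = make_toy_frame()
    X, y = df.drop(columns="label"), df["label"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=0
    )
    pipeline = build_pipeline(["duration", "src_bytes"], ["protocol"])
    pipeline.fit(X_train, y_train)
    # Precision, recall, and F1 per class on the held-out split.
    print(classification_report(y_test, pipeline.predict(X_test)))
```

Wrapping the preprocessing in the same pipeline as the model means the imputation and scaling statistics are learned from the training split only, which avoids leaking information from the test split into the evaluation.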
The goal of this repository is to provide a comprehensive guide to preprocessing and analyzing the network intrusion data set and to implementing a random forest classifier. It is intended for anyone interested in learning about data preprocessing and machine learning using Python libraries such as NumPy, Pandas, Plotly, and Scikit-learn.
ML model libraries: NumPy, Pandas, Plotly, Scikit-learn
Other tools and libraries used: Npcap, libpcap, Scapy, Apache Hadoop (a distributed computing framework)