Problem Statement:
In today’s world, data is power. With news companies storing terabytes of data on their servers, everyone is on a quest to discover insights that add value to the organization. Among the many examples of analytics being used to drive action, one that stands out is news article classification. Online sources now generate an immense amount of news every day, and user demand for information keeps growing, so news must be classified if users are to reach the information that interests them quickly and effectively. A machine learning model for automated news classification could therefore be used to identify the topics of untracked news and/or make individual suggestions based on a user’s prior interests.
Approach: Techniques such as clustering and association rule-based algorithms can be applied to group similar text together. The ML algorithms learn the mapping function between the text and its tags from already categorized data.
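As a rough illustration of learning a mapping from text to tags, here is a minimal sketch, assuming scikit-learn is installed. The headlines, the category names, and the choice of TF-IDF features with logistic regression are assumptions made for this example; they are not taken from the project's dataset or its final model.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labelled examples (hypothetical, not the project's data).
texts = [
    "The striker scored twice in the cup final",
    "The team won the league match after extra time",
    "Shares fell as the central bank raised interest rates",
    "The company reported record quarterly profits",
]
labels = ["sport", "sport", "business", "business"]

# TF-IDF turns each article into a weighted bag-of-words vector;
# the classifier then learns which words point to which tag.
model = make_pipeline(TfidfVectorizer(stop_words="english"), LogisticRegression())
model.fit(texts, labels)

# Predict the tag of an unseen headline.
prediction = model.predict(["The bank reported higher profits"])[0]
print(prediction)
```

Wrapping the vectorizer and the classifier in one pipeline keeps the text-to-feature step and the model together, so the same transformation is applied at training and prediction time.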
Data Exploration: Explored the dataset using pandas, numpy, and pandas-profiling.
Data Visualization: Plotted graphs to get insights about the dependent and independent variables.
Feature Engineering: Removed missing values and created new features based on those insights.
Model Selection I: Tested all base models to check their baseline accuracy, and also plotted residual plots to check whether each model is a good fit.
Pickle File: Selected the model with the best accuracy and saved it as a pickle file.
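The last three steps above (dropping missing values, comparing base models by accuracy, and pickling the winner) could be sketched roughly as follows, assuming pandas and scikit-learn are installed. The toy DataFrame, the column names text and category, the three candidate models, and the file name news_classifier.pkl are all hypothetical choices for this example, not the project's actual data or configuration.

```python
import os
import pickle
import tempfile

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data (hypothetical); one row has a missing article text.
df = pd.DataFrame({
    "text": [
        "The striker scored twice in the cup final",
        "The team won the league match after extra time",
        "The coach praised the team after the match",
        "The striker signed for a new team",
        None,
        "Shares fell as the central bank raised interest rates",
        "The company reported record quarterly profits",
        "The bank cut interest rates to boost the market",
        "Profits rose as the company increased its market shares",
    ],
    "category": ["sport"] * 5 + ["business"] * 4,
})

# Feature engineering step: remove rows with missing text.
df = df.dropna(subset=["text"])

# Model selection step: score each base model with 2-fold cross-validation.
candidates = {
    "naive_bayes": MultinomialNB(),
    "logistic_regression": LogisticRegression(),
    "linear_svc": LinearSVC(),
}
scores = {}
for name, clf in candidates.items():
    pipe = make_pipeline(TfidfVectorizer(stop_words="english"), clf)
    fold_scores = cross_val_score(pipe, df["text"], df["category"],
                                  cv=2, scoring="accuracy")
    scores[name] = fold_scores.mean()

# Pick the candidate with the highest mean accuracy and refit on all rows.
best_name = max(scores, key=scores.get)
best_model = make_pipeline(TfidfVectorizer(stop_words="english"),
                           candidates[best_name])
best_model.fit(df["text"], df["category"])

# Pickle step: save the fitted pipeline, then load it back to verify.
model_path = os.path.join(tempfile.mkdtemp(), "news_classifier.pkl")
with open(model_path, "wb") as f:
    pickle.dump(best_model, f)
with open(model_path, "rb") as f:
    restored = pickle.load(f)

print(best_name, scores)
```

Pickling the whole pipeline, rather than the classifier alone, means the loaded object can classify raw text directly. A pickle file should only be loaded from a trusted source, and ideally with the same library versions that created it.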
Project Title: News Articles Sorting
Technologies: Deep Learning Technology (NLP)
Domain: Media
Project Difficulty Level: Intermediate
Technologies Used: python, nltk, numpy, pandas, matplotlib, nltk-word_tokenize, PorterStemmer, pandas_profiling, sklearn
Video link of deployment: https://github.com/sriphaniN/news-articles-sorted/blob/a006cd1a1770cbff717729bfebefb2ff3e51c6d3/project2/video-record.webm