Topic-Modeling

Latent Dirichlet allocation (LDA) is a generative topic model that infers topics from the word frequencies in a set of documents. It is particularly useful for estimating the mixture of topics present in each document of a collection. LDA models each document as a distribution over topics and each topic as a distribution over words, with Dirichlet priors on both.
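
To make the two distributions concrete, here is a minimal sketch of LDA's generative story using only the Python standard library. The toy vocabulary, the prior value 0.5, and the helper names `sample_dirichlet` and `generate_document` are illustrative assumptions, not taken from the notebook. It only generates a document; it does not fit a model.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word, vocab, alpha, length, rng):
    """Generate one document by following LDA's generative process."""
    # Per-document topic mixture, drawn from a Dirichlet prior.
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(length):
        # Choose a topic for this position, then a word from that topic.
        topic = rng.choices(range(len(theta)), weights=theta)[0]
        word = rng.choices(vocab, weights=topic_word[topic])[0]
        words.append(word)
    return theta, words


rng = random.Random(0)
# Toy vocabulary (assumption, for illustration only).
vocab = ["rain", "flood", "storm", "vote", "election", "senate"]
num_topics = 2

# Per-topic word distributions, each drawn from a Dirichlet prior.
topic_word = [sample_dirichlet([0.5] * len(vocab), rng) for _ in range(num_topics)]

theta, words = generate_document(topic_word, vocab, [0.5] * num_topics, 8, rng)
print("topic mixture:", theta)
print("document:", words)
```

Fitting LDA runs this story in reverse: given only the observed words, it estimates the topic mixtures and topic-word distributions that plausibly produced them.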

Data

The data set is a list of over one million news headlines published over a period of 15 years. It can be downloaded from Kaggle: https://www.kaggle.com/therohk/million-headlines/data
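
As a rough sketch, the headlines could be read with the standard `csv` module. The column name `headline_text` is an assumption about the Kaggle CSV layout, and `load_headlines` is a hypothetical helper; adjust both to match the downloaded file.

```python
import csv


def load_headlines(path, column="headline_text"):
    """Return the non-empty values of one column from a CSV file.

    The default column name is an assumption about the Kaggle file.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        return [row[column].strip() for row in reader if row.get(column)]
```

Calling `load_headlines` with the path to the CSV saved under the `data` folder would then return the headlines as a list of strings.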

Packages

The following Python packages will be used:

gensim
nltk
