Text mining of customer reviews extracted from TrustPilot

RSAKIB78/Text-Mining-R

Text Analysis through Web Scraping

This project collects customer review data from the web using web scraping techniques, constructs a database from scratch from the scraped reviews, and then performs text mining on that data.

General Information

The project is structured in three (3) parts.

The first part (Part A) covers constructing and handling text data. It applies the principles of text mining: the bag-of-words model, the normalizing and cleaning of textual corpora, and the development of metrics for analyzing structural elements of text. The core of the project is translating these insights into actionable features that can be used to predict an outcome variable of business interest.
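The cleaning and bag-of-words steps described in Part A can be sketched as follows. This is a minimal illustration, not the project's own code: the project itself is written in R (RStudio), and Python is used here only as a language-neutral sketch. The stop-word list, the sample reviews, and the function names `clean`, `bag_of_words`, and `top_terms` are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical mini stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "is", "was", "it", "to", "of", "i"}


def clean(text):
    """Lowercase, replace non-letters with spaces, split, drop stop words."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def bag_of_words(docs):
    """Return one Counter of term frequencies per document."""
    return [Counter(clean(doc)) for doc in docs]


def top_terms(bags, n=3):
    """Sum counts across documents and return the n most frequent terms.

    Ties are broken alphabetically so the result is deterministic.
    """
    total = Counter()
    for bag in bags:
        total.update(bag)
    return sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Made-up sample reviews, for illustration only.
reviews = [
    "The delivery was fast and the support was great!",
    "Great price, but the delivery was slow.",
]
bags = bag_of_words(reviews)
print(top_terms(bags, 2))
```

Each review becomes an unordered count of its terms, which is what makes the representation a "bag": word order is discarded and only frequencies remain. These per-document counts are the kind of raw material from which predictive features could be derived.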

The second and third parts (Part B and Part C) are concerned with identifying features, in particular: (a) polarity, whether the text under consideration is positive or negative; (b) sentiment, the extraction of affective states from the text; and (c) topics, the evaluation and extraction of the important topics covered in the constructed corpus (Part C).
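One common way to assign polarity is to count matches against a word lexicon. The sketch below illustrates that idea under stated assumptions: the source does not say which lexicon or method the project uses, the project itself is in R rather than Python, and the tiny `POSITIVE` and `NEGATIVE` word sets and the function names are hypothetical.

```python
import re

# Hypothetical toy lexicons; real sentiment lexicons are far larger.
POSITIVE = {"great", "fast", "good", "excellent", "helpful"}
NEGATIVE = {"slow", "bad", "poor", "rude", "broken"}


def polarity_score(text):
    """Return (# positive words) - (# negative words) in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(tok in POSITIVE for tok in tokens)
    neg = sum(tok in NEGATIVE for tok in tokens)
    return pos - neg


def polarity_label(text):
    """Map the score to a coarse label; a zero score counts as neutral."""
    score = polarity_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(polarity_label("Great service, fast delivery"))
```

A simple count like this ignores negation ("not good") and intensity, so it is best read as a baseline rather than a full sentiment model.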

Technologies Used

  • RStudio

Features

  • Bag-of-Words Analysis
  • Word Clouds
  • Top words and frequent words analysis
  • Sentiment Analysis
  • Topic Mining
  • LDA (Latent Dirichlet Allocation)

Project Status

Project is: complete
