csinchok/fcc-comment-analysis

FCC Comment Analysis

This repository contains Python code that downloads FCC comment data and stores it in an Elasticsearch instance. An additional command tags and analyzes the data further.

After a first pass in a Jupyter Notebook, I used Kibana on AWS to do most of my digging.

To install the package and run tests:

$ pip install -e .
$ python setup.py test

To crawl the comments, make sure you have an Elasticsearch server set up, and then run:

$ fcc index --endpoint=http://localhost:9200/

This will take anywhere from 2 to 4 hours (or won't work at all if the API is down).
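Since the crawl is long, it can be worth confirming that the Elasticsearch endpoint responds before starting. The following is a minimal sketch using only the Python standard library; the helper name `endpoint_is_reachable` is hypothetical and is not part of this package.

```python
import urllib.error
import urllib.request


def endpoint_is_reachable(endpoint, timeout=5):
    """Return True if a GET on the endpoint answers with HTTP 200.

    Hypothetical helper, not part of the fcc package. Any connection
    error, timeout, or non-2xx response is treated as "not reachable".
    """
    try:
        with urllib.request.urlopen(endpoint, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # URLError covers refused connections and HTTP errors;
        # ValueError covers malformed URLs.
        return False


if __name__ == "__main__":
    print(endpoint_is_reachable("http://localhost:9200/"))
```

If this prints `False`, the server is probably not running at that address, or it is rejecting unauthenticated requests.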

To get a smaller subset of comments for testing, add -g YYYY-MM-DD to get comments submitted after the specified date:

$ fcc index --endpoint=http://localhost:9200/ -g 2017-06-01
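To build the date argument relative to today (for example, "the last week of comments"), a small helper along these lines could be used. This is a sketch; `cutoff_date` is a hypothetical name and is not provided by this package.

```python
from datetime import date, timedelta


def cutoff_date(days_back, today=None):
    """Return the date `days_back` days before `today` as YYYY-MM-DD.

    Hypothetical helper for producing a value to pass to -g.
    `today` defaults to the current local date.
    """
    if today is None:
        today = date.today()
    return (today - timedelta(days=days_back)).isoformat()


if __name__ == "__main__":
    # Prints a date string suitable for: fcc index ... -g <date>
    print(cutoff_date(7))
```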

I then take another pass on the data, appending "analysis" variables to all of the documents. This makes it a lot easier to spot trends in Kibana.

To analyze the comments:

$ fcc analyze --endpoint=http://localhost:9200/
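To illustrate what appending "analysis" variables to a document means, here is a simplified sketch. The field names (`word_count`, `mentions_title_ii`), the `text` key, and the regular expression are all assumptions made for this example; they are not necessarily what `fcc analyze` computes.

```python
import re

# Hypothetical pattern; the real command may tag entirely different things.
TITLE_II = re.compile(r"\btitle\s+(ii|2)\b", re.IGNORECASE)


def analysis_fields(comment_text):
    """Return illustrative "analysis" variables for one comment."""
    return {
        "word_count": len(comment_text.split()),
        "mentions_title_ii": bool(TITLE_II.search(comment_text)),
    }


def with_analysis(document):
    """Return a copy of the document with an "analysis" key added.

    Assumes the comment body lives under a "text" key, which is an
    assumption of this sketch, not a documented schema.
    """
    tagged = dict(document)
    tagged["analysis"] = analysis_fields(document.get("text", ""))
    return tagged


if __name__ == "__main__":
    print(with_analysis({"text": "Please keep Title II protections"}))
```

Storing precomputed flags like these on each document lets Kibana filter and aggregate on them directly, rather than re-running text queries for every chart.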
