Analyzing Sports Comments from News Media
This project analyzes sports comments extracted from a news media site. The challenge is that internet comments in Korean are difficult to understand. Processing takes time, and there is no standardized way to clean up the data and then determine whether the comments show particular trends.
Thus, the objectives are to:
- Figure out how to clean up the comments so that the resulting dataset is representative, human-readable, and analyzable
- Create a streamlined pipeline that others can use, possibly packaged as a module
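
As a rough illustration of what the cleaning step might look like, here is a minimal sketch using only the Python standard library. It is not the project's actual implementation: the function names `clean_comment` and `clean_comments`, the set of characters kept, and the rule of collapsing repeated characters to two are all assumptions made for the example.

```python
import re
import unicodedata

# Hypothetical cleaning rules; adjust them to the actual comment data.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Any character repeated three or more times in a row (e.g. "ㅋㅋㅋㅋㅋ", "!!!!").
REPEAT_RE = re.compile(r"(.)\1{2,}")
# Everything except Hangul syllables, Hangul compatibility jamo,
# ASCII letters and digits, whitespace, and basic punctuation.
DROP_RE = re.compile(r"[^가-힣ㄱ-ㅎㅏ-ㅣa-zA-Z0-9\s.,!?]")
SPACE_RE = re.compile(r"\s+")


def clean_comment(text: str) -> str:
    """Normalize a single comment into a more readable, analyzable form."""
    # Compose decomposed Hangul jamo into full syllables where possible.
    text = unicodedata.normalize("NFC", text)
    text = URL_RE.sub(" ", text)           # remove links
    text = DROP_RE.sub(" ", text)          # remove emoji and stray symbols
    text = REPEAT_RE.sub(r"\1\1", text)    # "ㅋㅋㅋㅋㅋ" becomes "ㅋㅋ"
    return SPACE_RE.sub(" ", text).strip()


def clean_comments(comments):
    """Clean every comment and drop the ones that end up empty."""
    cleaned = (clean_comment(c) for c in comments)
    return [c for c in cleaned if c]
```

For example, `clean_comment("와 ㅋㅋㅋㅋㅋ 대박!!!! http://example.com/a ★")` returns `"와 ㅋㅋ 대박!!"`. Keeping each rule as a separate compiled pattern makes it easy to add, remove, or reorder steps as the data turns out to need them.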