Skip to content

Data related to investigation of chat client censorship

Notifications You must be signed in to change notification settings

x-i-ao-b-ai/chat-censorship

 
 

Repository files navigation

Overview

Collection of keyword lists used to censor chat messages on chat apps and live streaming apps used in China. Each of these apps implements censorship on the client-side, which allows us to reverse engineer them and extract the keyword lists used to block content in chat messages. Our analysis shows how keyword lists are downloaded to the applications and how encryption and decryption of lists is implemented (when it is present). Changes to keyword lists are tracked over the data collection period for each application.

Full details on reverse engineering method and results are avalible in reports below:

[Chat program censorship and surveillance in China: Tracking TOM-Skype and Sina UC] (http://firstmonday.org/ojs/index.php/fm/article/view/4628/3727)

[Asia Chats: Investigating Regionally-based Keyword Censorship in LINE] (https://citizenlab.org/2013/11/asia-chats-investigatingregionally-based-keyword-censorship-line/)

[Every Rose Has Its Thorn: Censorship and Surveillance on Social Video Platforms in China] (https://www.usenix.org/conference/foci15/workshop-program/presentation/knockel)

Keyword Content Analysis

Datasets include raw keyword lists collected from the applications and processed datasets that include translations of keywords. Keywords were translated to English using combination of machine and human translation. Based on interpreting these translations with contextual information, we coded each keyword into content categories grouped under six general [themes] (https://github.com/citizenlab/chat-censorship/blob/master/themes_keyword_censorship.csv) according to a [code book] (https://github.com/citizenlab/chat-censorship/blob/master/categories_keyword_censorship.csv)

License

All data is provided under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International and available in full here and summarized here

About

Data related to investigation of chat client censorship

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Shell 100.0%