SentiMP-21 Dataset

The SentiMP-21 Dataset is a multilingual sentiment analysis dataset built from tweets written in 2021 by members of parliament in Greece, Spain and the United Kingdom. It was developed jointly by the Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI) research group from the University of Granada and the Cardiff NLP research group from the University of Cardiff.

Dataset details

The dataset contains 1500 tweets from three countries: Greece (500 tweets), Spain (500 tweets) and the United Kingdom (500 tweets). For each tweet we provide the following information:

tweet_id: Identifier of the tweet.
full_text: Content of the tweet.
mp_party: Party to which the member of parliament who wrote the tweet belongs.
mp_name: Name of the member of parliament who wrote the tweet.
created_at: Date of the tweet.
label_i: Label assigned by annotator i (i in {1,2,3} for English and Greek, and i in {1,2,3,4,5} for Spanish). It takes values in {-1,0,1,x}.
majority_vote: Result of applying the majority vote strategy to the individual annotators' labels. When there is a tie, the value is "TIE". It takes values in {-1,0,1,TIE}.
tie_break: Label used to break a tie. It is filled in only when TIE appears in the majority_vote column. It takes values in {-1,0,1}.
final_label: Final label of the tweet, obtained by combining the majority_vote and tie_break columns. It takes values in {-1,0,1}.
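
The relationship between the label_i, majority_vote, tie_break and final_label columns can be illustrated with a minimal Python sketch. It is not the authors' annotation code. The function names are hypothetical, labels are treated as strings as they would be read from a text file, and the handling of "x" votes (ignored here) is an assumption, since the description above does not say how they are counted.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label, or "TIE" when no single label leads."""
    # Assumption: "x" votes are ignored; the dataset description does not
    # state how they enter the vote.
    votes = [label for label in labels if label != "x"]
    if not votes:
        return "TIE"
    counts = Counter(votes).most_common()
    # A tie occurs when the two most frequent labels have the same count.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "TIE"
    return counts[0][0]


def final_label(majority, tie_break):
    """Use tie_break only when the majority vote ended in a tie."""
    return tie_break if majority == "TIE" else majority
```

For example, three annotators voting "1", "1" and "0" give a majority_vote of "1", while votes of "-1", "0" and "1" give "TIE", in which case final_label is taken from tie_break.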

Downloads

We release three versions of each dataset:

Extended version (full): Includes all the columns for each of the initial 500 tweets.
Extended version (without x): The full extended version with the tweets labeled with "x" removed.
Simple version: Keeps only the columns tweet_id, full_text and final_label from the extended version (without x).
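
As a rough illustration of how the three versions relate, the following Python sketch derives the simple version from rows of the full extended version. It rests on several assumptions that the description above does not confirm: that the files are comma-separated with a header row, and that a tweet counts as "labeled with x" when any of its label_i columns holds "x". The helper names and the two sample rows are made up for the example.

```python
import csv
import io

SIMPLE_COLUMNS = ["tweet_id", "full_text", "final_label"]


def drop_x(rows):
    """Remove rows in which any label_i column holds "x" (assumed rule)."""
    return [
        row for row in rows
        if not any(key.startswith("label_") and value == "x"
                   for key, value in row.items())
    ]


def to_simple(rows):
    """Keep only the three columns of the simple version."""
    return [{col: row[col] for col in SIMPLE_COLUMNS} for row in rows]


# Invented sample rows standing in for an extended (full) file.
sample = io.StringIO(
    "tweet_id,full_text,label_1,label_2,label_3,"
    "majority_vote,tie_break,final_label\n"
    "1,first example,1,1,0,1,,1\n"
    "2,second example,x,0,0,0,,0\n"
)
rows = list(csv.DictReader(sample))
simple_rows = to_simple(drop_x(rows))
print(simple_rows)
```

To apply the same steps to a downloaded file, the in-memory sample would be replaced by an opened file object, provided the file format matches the assumptions above.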

You can find these files in the following repositories:

Citation

If you use this dataset, please cite:

Contact

Nuria Rodríguez Barroso - rbnuria@ugr.es

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
