The readme says that "For reproducibility, we have released an already cached version of the data (https://huggingface.co/datasets/bigscience/P3), which means you don't need to cache the data yourself. The only exception is Story Cloze (https://cs.rochester.edu/nlp/rocstories/)." However, upon inspection I see that there are 92 config names that are present in the seqio mixtures but missing from P3. Here is my code to find the missing dataset config names:
import datasets
import seqio
import t0.seqio_tasks  # registers the T0 tasks and mixtures with seqio

# Config names available in the released bigscience/P3 dataset on the Hub.
p3_configs = set(datasets.get_dataset_config_names("bigscience/P3"))

# Task names referenced by each of the T0 mixtures.
t0_train_task_names = [task.name for task in seqio.MixtureRegistry.get("t0_train").tasks]
t0p_train_task_names = [task.name for task in seqio.MixtureRegistry.get("t0+_train").tasks]
t0pp_train_task_names = [task.name for task in seqio.MixtureRegistry.get("t0++_train").tasks]
t0_eval_score_eval = [task.name for task in seqio.MixtureRegistry.get("t0_eval_score_eval").tasks]
t0_train_score_eval = [task.name for task in seqio.MixtureRegistry.get("t0_train_score_eval").tasks]

all_tasks = set()
for tasks in [t0_train_task_names, t0p_train_task_names, t0pp_train_task_names,
              t0_eval_score_eval, t0_train_score_eval]:
    all_tasks.update(tasks)

# Tasks referenced by the mixtures but absent from bigscience/P3.
missing = [t for t in all_tasks if t not in p3_configs]
print(missing)
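For comparison, any config that is present in P3 can be loaded directly from the Hub. A minimal sketch (the config name here is illustrative, and the field names assume P3's usual inputs_pretokenized/targets_pretokenized layout):

from datasets import load_dataset

# Illustrative config; any name returned by get_dataset_config_names works.
ds = load_dataset("bigscience/P3", "ag_news_classify", split="train")
print(ds[0]["inputs_pretokenized"])
print(ds[0]["targets_pretokenized"])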
The missing pieces are:
wiki_qa_found_on_google_score_eval quartz_having_read_above_passage_score_eval ag_news_classify_score_eval cosmos_qa_no_prompt_text_score_eval glue_qqp_duplicate_or_not_score_eval wiki_qa_Decide_good_answer_score_eval cosmos_qa_context_description_question_answer_text_score_eval rotten_tomatoes_Writer_Expressed_Sentiment_score_eval qasc_qa_with_separated_facts_1_score_eval sciq_Direct_Question_score_eval qasc_qa_with_separated_facts_2_score_eval quartz_paragraph_question_plain_concat_score_eval cosmos_qa_no_prompt_id_score_eval story_cloze_2016_Novel_Correct_Ending_score_eval wiki_qa_Is_This_True__score_eval quail_context_question_answer_description_id_score_eval quartz_given_the_fact_answer_the_q_score_eval sciq_Multiple_Choice_Question_First_score_eval rotten_tomatoes_Reviewer_Expressed_Sentiment_score_eval quail_context_question_description_answer_id_score_eval qasc_qa_with_separated_facts_3_score_eval yelp_review_full_format_rating_score_eval quail_context_question_answer_description_text_score_eval story_cloze_2016_Story_Continuation_and_Options_score_eval yelp_review_full_on_a_scale_score_eval rotten_tomatoes_Movie_Expressed_Sentiment_2_score_eval yelp_review_full_this_place_score_eval ag_news_which_section_choices_score_eval cos_e_v1.11_question_option_description_text_score_eval glue_mrpc_want_to_know_score_eval cos_e_v1.11_description_question_option_text_score_eval wiki_qa_exercise_score_eval cos_e_v1.11_question_description_option_text_score_eval qasc_qa_with_separated_facts_4_score_eval glue_mrpc_replace_score_eval quail_description_context_question_answer_text_score_eval glue_qqp_meaning_score_eval quail_context_description_question_answer_text_score_eval glue_mrpc_same_thing_score_eval rotten_tomatoes_Reviewer_Opinion_bad_good_choices_score_eval cosmos_qa_context_question_description_answer_text_score_eval quail_no_prompt_id_score_eval cos_e_v1.11_description_question_option_id_score_eval cos_e_v1.11_question_description_option_id_score_eval qasc_qa_with_separated_facts_5_score_eval cosmos_qa_context_description_question_text_score_eval yelp_review_full_format_score_score_eval ag_news_classify_with_choices_score_eval cosmos_qa_context_description_question_answer_id_score_eval quartz_use_info_from_paragraph_question_score_eval quail_description_context_question_answer_id_score_eval cosmos_qa_description_context_question_text_score_eval glue_mrpc_equivalent_score_eval yelp_review_full_based_on_that_score_eval social_i_qa_Show_choices_and_generate_index_score_eval quail_no_prompt_text_score_eval ag_news_which_section_score_eval rotten_tomatoes_Sentiment_with_choices__score_eval quartz_answer_question_below_score_eval social_i_qa_Show_choices_and_generate_answer_score_eval glue_qqp_duplicate_score_eval ag_news_recommend_score_eval story_cloze_2016_Answer_Given_options_score_eval cosmos_qa_description_context_question_answer_text_score_eval ag_news_classify_with_choices_question_first_score_eval quartz_read_passage_below_choose_score_eval quail_context_question_description_answer_text_score_eval wiki_qa_automatic_system_score_eval sciq_Multiple_Choice_score_eval story_cloze_2016_Movie_What_Happens_Next_score_eval quartz_use_info_from_question_paragraph_score_eval yelp_review_full_format_star_score_eval sciq_Direct_Question_Closed_Book__score_eval social_i_qa_I_was_wondering_score_eval ag_news_classify_question_first_score_eval glue_qqp_quora_score_eval yelp_review_full_so_i_would_score_eval story_cloze_2016_Choose_Story_Ending_score_eval quartz_answer_question_based_on_score_eval 
cos_e_v1.11_question_option_description_id_score_eval social_i_qa_Generate_answer_score_eval rotten_tomatoes_Reviewer_Enjoyment_score_eval rotten_tomatoes_Movie_Expressed_Sentiment_score_eval glue_qqp_same_thing_score_eval quail_context_description_question_answer_id_score_eval rotten_tomatoes_Text_Expressed_Sentiment_score_eval cosmos_qa_description_context_question_answer_id_score_eval rotten_tomatoes_Reviewer_Enjoyment_Yes_No_score_eval rotten_tomatoes_Reviewer_Sentiment_Feeling_score_eval glue_mrpc_paraphrase_score_eval cosmos_qa_context_question_description_answer_id_score_eval social_i_qa_Check_if_a_random_answer_is_valid_or_not_score_eval
Could you please shed some light? I'm trying to reproduce the T0 training (close enough is good) in PyTorch. Thank you.

Except for the 5 Story Cloze configs (which we won't release because the dataset is behind a usage agreement), all 87 remaining missing configs are the ones in t0_train_score_eval, i.e. the evaluation sets of the training datasets. We use those evaluation sets to perform checkpoint selection.
I didn't think anyone would ever need them. I am happy to include them in bigscience/P3 too, though I won't have the bandwidth to do so before next week.
However, I have them cached somewhere if you need them. Could you email me?
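In the meantime, the missing *_score_eval sets can be materialized locally from the seqio task registry. A minimal sketch, assuming the t0.seqio_tasks registrations from this repo, that the underlying Hugging Face dataset is downloadable, and that the chosen task has a validation split (the task name is one of the missing configs listed above):

import seqio
import t0.seqio_tasks  # registers the T0 tasks with seqio

# Pick one of the missing *_score_eval tasks; rotten_tomatoes has a validation split.
task = seqio.TaskRegistry.get("rotten_tomatoes_Reviewer_Enjoyment_score_eval")
ds = task.get_dataset(sequence_length=None, split="validation", shuffle=False)

# Each element is a dict of features (inputs/targets, tokenized and pretokenized).
for example in ds.take(2):
    print({k: v.numpy() for k, v in example.items()})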