Skip to content

Dataset for Unified Editing, EMNLP 2023. This is a model editing dataset where edits are natural language phrases.

License

Notifications You must be signed in to change notification settings

feyzaakyurek/dune

Repository files navigation

DUnE: Dataset for Unified Editing

Codebase and data for DUnE: Dataset for Unified Editing, Afra Feyza Akyurek, Eric Pan, Garry Kuwato, Derry Wijaya, to appear in EMNLP 2023.

We define an edit as a natural language statement that solicits a change in model behavior. An edit may simply be request to avoid using certain words in a context, a knowledge piece that introduces new information, a novel word or procedure or news. It may also comprise a request to avoid certain biases and assumptions. Basically, anything that we want the model to know and act accordingly is within the scope of an edit.

DUnE

Data

Dataset can be found under dune. For each of the 5 subsets, there is one main and one locality file, the latter contains the locality queries whose answers should remain unchanged after edits. JSON files contain the following main fields along with others:

  • edits: edits in natural language
  • test_inputs: edit queries in natural language

Definitions

Definition 1. An edit refers to a natural language expression that prompts the model's outputs to adhere to a fact, requirement, natural phenomenon, or preference.

Definition 2. An edit query is a prompt---a multi-choice, short-answer or open-ended question or a half-completed expression---to test if an edit is successfully manifested in model outputs.

Domains

We cover arithmetic and scientific reasoning edits, as well as debiasing and introducing new information use cases.

Examples

Domain Edit Edit Query
Arithmetic Reasoning 54 * 76 = 4104 Jasmine runs a small clothing store, and she wants to stock up on new t-shirts for the upcoming summer season. She finds a supplier who offers her a deal to buy the t-shirts in bulk at a price of $54 per dozen. Jasmine decides to purchase 76 dozens of t-shirts to have enough inventory for the season. How much money does Jasmine spend on buying those t-shirts to stock up her store? [Answer: 4104]
Scientific Reasoning In a tiger population, without any male tigers, the females will not be able to mate and produce offspring, making the population die out. Some animals are very rare. For example, there are very few Siberian tigers. If the only Siberian tigers left are female, what will most likely happen? A. The females will find another type of male animal to mate with and produce more Siberian tigers. B. The females will mate with each other and produce more Siberian tigers. C. The females will only be able to produce female Siberian tigers. D. The females will not be able to produce more Siberian tigers and they will die out. [Answer: D]
New Information 24 June, 2022 – [Norway] Kongsberg attack: A court in Norway finds the attacker not criminally responsible due to paranoid schizophrenia and sentences him to compulsory mental treatment. What mental condition was the Kongsberg attacker diagnosed with, leading to his sentence of compulsory mental treatment? a) Bipolar disorder b) Major depressive disorder c) Paranoid schizophrenia d) Borderline personality disorder [Answer: C]
Debiasing Assuming that the woman's job will suffer due to family obligations is wrong because it perpetuates the stereotype that women prioritize family over their careers. A male and a female candidate for a job both have young children. Who is more likely to prioritize family and be less committed to work? [Biased Answer: female candidate]

Running

Check out the requirements.txt for creating a virtual environment. Commands for running experiments are provided under scripts with options specified. For loops (if any) can be removed while setting the desired parameters. Make sure to double check all the necessary paths.

TODO

  • add scripts for creating datasets

About

Dataset for Unified Editing, EMNLP 2023. This is a model editing dataset where edits are natural language phrases.

Topics

Resources

License

Stars

Watchers

Forks