git clone git@github.com:laihuiyuan/mFLAG.git
cd mFLAG
from model import MultiFigurativeGeneration
from tokenization_mflag import MFlagTokenizerFast

tokenizer = MFlagTokenizerFast.from_pretrained('laihuiyuan/mFLAG')
model = MultiFigurativeGeneration.from_pretrained('laihuiyuan/mFLAG')

# an example for hyperbole-to-sarcasm generation
# a token (<hyperbole>) is added at the beginning of the source sentence to indicate its figure of speech
inp_ids = tokenizer.encode("<hyperbole> I am not happy that he urged me to finish all the hardest tasks in the world", return_tensors="pt")

# the target figurative form (<sarcasm>)
fig_ids = tokenizer.encode("<sarcasm>", add_special_tokens=False, return_tensors="pt")

outs = model.generate(input_ids=inp_ids[:, 1:], fig_ids=fig_ids, forced_bos_token_id=fig_ids.item(), num_beams=5, max_length=60)
text = tokenizer.decode(outs[0, 2:].tolist(), skip_special_tokens=True, clean_up_tokenization_spaces=False)
# special tokens: <literal>, <hyperbole>, <idiom>, <sarcasm>, <metaphor>, or <simile>
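The prefix convention used in the example above can be wrapped in a small helper so that a misspelled form is caught before tokenization. This is a minimal sketch, not part of the mFLAG repository: `FORMS`, `form_token`, and `build_source` are hypothetical names introduced for illustration, and the set of forms is copied from the special tokens listed above.

```python
# Hypothetical helpers for illustration; they are not part of the mFLAG codebase.
# The forms mirror the special tokens listed in the README.
FORMS = ("literal", "hyperbole", "idiom", "sarcasm", "metaphor", "simile")


def form_token(form: str) -> str:
    """Return the special token for a form, e.g. 'idiom' -> '<idiom>'."""
    if form not in FORMS:
        raise ValueError(f"unknown form {form!r}; expected one of {FORMS}")
    return f"<{form}>"


def build_source(text: str, src_form: str) -> str:
    """Prefix a source sentence with the token marking its figure of speech."""
    return f"{form_token(src_form)} {text.strip()}"
```

The string returned by `build_source` can be passed to `tokenizer.encode` in place of the hand-written source sentence, and `form_token` can supply the target token that is encoded into `fig_ids`.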
Training
Step 1: Pre-training
python train_pt.py -dataset ParapFG -figs hyperbole idiom metaphor sarcasm simile
Step 2: Fine-tuning
# parallel paraphrase pretraining data
python train_ft.py -dataset ParapFG -figs hyperbole idiom metaphor sarcasm simile
# literal-figurative parallel data
python train_ft.py -dataset MultiFG -figs hyperbole idiom metaphor sarcasm simile
Step 3: Figurative Generation
# Generating idioms from hyperbolic text
python inference.py -src_form hyperbole -tgt_form idiom
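To run generation for several form pairs, the command above can be assembled programmatically. The sketch below only builds the argument lists and relies on the `-src_form` and `-tgt_form` flags shown above; `FIGS`, `inference_command`, and `all_pair_commands` are hypothetical names, and whether `inference.py` supports every source/target combination is an assumption that should be checked against the script.

```python
import itertools

# Forms taken from the -figs arguments of the training commands.
# Assumption: inference.py accepts each of these as source and as target form.
FIGS = ("hyperbole", "idiom", "metaphor", "sarcasm", "simile")


def inference_command(src_form: str, tgt_form: str) -> list:
    """Build the argument list for one source-to-target generation run."""
    return ["python", "inference.py", "-src_form", src_form, "-tgt_form", tgt_form]


def all_pair_commands(figs=FIGS) -> list:
    """Build one command per ordered pair of distinct forms."""
    return [inference_command(s, t) for s, t in itertools.permutations(figs, 2)]
```

Each returned list can be printed with `" ".join(cmd)` for inspection or handed to `subprocess.run(cmd, check=True)` from the repository root.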
Model and Outputs
Our model mFLAG is available on Hugging Face; the corresponding outputs are in the /data/outputs/ directory.
Citation
@inproceedings{lai-etal-2022-multi,
title = "Multi-Figurative Language Generation",
author = "Lai, Huiyuan and Nissim, Malvina",
booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
month = oct,
year = "2022",
address = "Gyeongju, Republic of Korea",
}