This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

parlai train_model - text_truncate, label_truncate and truncate - per turn or episode? #5004

Answered by klshuster
krisstud asked this question in Q&A

Sorry, the terminology I used was a bit confusing. By BB3B I actually mean BlenderBot 1.0, the 3B-parameter version. That model has a context length of 128 tokens (see Section 6.1 of the corresponding paper).

You have a very large dataset. Generally, when we train on datasets whose episodes contain more than 1024 tokens of context, we simply truncate the older context. How to deal with extremely long context, as in your use case, remains an open problem.
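The "truncate the older context" behavior described above amounts to keeping only the most recent tokens of an episode. A minimal sketch (not ParlAI's actual implementation; the function name and default length are illustrative):

```python
def truncate_context(tokens, max_len=1024):
    """Keep only the most recent max_len tokens, dropping the oldest.

    This mirrors left-truncation of dialogue history: when an episode's
    accumulated context exceeds the model's limit, the earliest turns
    are discarded first.
    """
    if len(tokens) <= max_len:
        return tokens
    return tokens[-max_len:]
```

In ParlAI proper, this limit is controlled per field by the `--truncate`, `--text-truncate`, and `--label-truncate` flags mentioned in the question title.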

If you are looking at role-playing or staying in character, we offer a few other datasets in ParlAI that are dialogue-adjacent:

  1. The LIGHT dataset is a dataset where two agents role-play as characters in a medieval-fantasy text adventure game. The da…

Replies: 1 comment 9 replies

Answer selected by krisstud