Greedy search and beam search #2557

KexinFeng · 2023-04-20T04:55:10Z

Description

This PR follows up on PR #2547 and PR #2509, which show the model tracing.

Benchmarked against the output of Hugging Face transformers.

Ref. https://huggingface.co/blog/how-to-generate
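As a rough illustration of the two decoding strategies in this PR, here is a minimal sketch on a hypothetical toy model. `TOY_MODEL`, `next_log_probs`, `greedy_search` and `beam_search` are illustrative names, not the DJL implementation, and `num_beams` / `max_length` only mirror the `numBeam` / `maxLength` settings used in the demo. Greedy search keeps the single most likely next token at each step; beam search keeps the `num_beams` highest-scoring partial sequences. Length penalties, batching and key-value caching are left out.

```python
import math

EOS = "</s>"

# Hypothetical toy "language model": the next-token distribution depends only
# on the last token. A real decoder would query the traced GPT2 model instead.
TOY_MODEL = {
    "<s>": {"the": 0.5, "a": 0.4, EOS: 0.1},
    "the": {"dog": 0.4, "cat": 0.3, EOS: 0.3},
    "a": {"cat": 0.9, EOS: 0.1},
    "dog": {EOS: 1.0},
    "cat": {EOS: 1.0},
}


def next_log_probs(tokens):
    """Return {token: log-probability} for the token that follows `tokens`."""
    return {tok: math.log(p) for tok, p in TOY_MODEL[tokens[-1]].items()}


def greedy_search(prompt, max_length):
    """Append the most likely next token until EOS or max_length is reached."""
    tokens = list(prompt)
    while len(tokens) < max_length and tokens[-1] != EOS:
        log_probs = next_log_probs(tokens)
        tokens.append(max(log_probs, key=log_probs.get))
    return tokens


def beam_search(prompt, num_beams, max_length):
    """Keep the num_beams best partial sequences by summed log-probability."""
    beams = [(0.0, list(prompt))]
    while any(len(t) < max_length and t[-1] != EOS for _, t in beams):
        candidates = []
        for score, tokens in beams:
            if tokens[-1] == EOS or len(tokens) >= max_length:
                # Finished hypotheses are carried over unchanged.
                candidates.append((score, tokens))
                continue
            for tok, log_p in next_log_probs(tokens).items():
                candidates.append((score + log_p, tokens + [tok]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:num_beams]
    return beams  # sorted from best to worst


greedy_result = greedy_search(["<s>"], max_length=10)
beam_result = beam_search(["<s>"], num_beams=2, max_length=10)
```

On this toy model greedy search commits to "the" first and ends with a sequence of probability 0.2, while beam search with two beams recovers "a cat" with probability 0.36. The same effect explains why the greedy and beam outputs can diverge for an identical prompt.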

Demo output

In the demo TestLMSearch.java, we feed in a batch of input sequences, right-padded with the space token ' ' (id = 220).

["DeepMind Company is",  
 "Memories follow me left and right. I can"]

Output of beam search (numBeam=3, maxLength=50):

'DeepMind Company is      \xa0 a company that has been around for a long time and has been around for a long time. They have been around for a long time and have been around for a long time. They have been'
'DeepMind Company is      \xa0 a company that has been around for a long time. It has been around for a long time and has been around for a long time. It has been around for a long time and has been'
'DeepMind Company is      \xa0 a company that has been around for a long time and has been around for a long time. They have been around for a long time and have been around for a long time. They are a'

"Memories follow me left and right. I can't tell you how many times I've been told that I'm not a good person. I'm not a good person. I'm not a good person. I'm not a good person. I"
"Memories follow me left and right. I can't tell you how many times I've been told that I'm not a good person. I'm not a good person. I'm not a good person. I'm not a good person.\n"
'Memories follow me left and right. I can\'t tell you how many times I\'ve been told that I\'m not a good person. I\'m not a good person. I\'m not a good person. I\'m not a good person."\n'

Output of greedy search (maxLength=50):

'DeepMind Company is      \xa0 a company that has been around for over 20 years. We have been around for over 20 years and have been around for over 20 years. We have been around for over 20 years and have been'

"Memories follow me left and right. I can't remember the last time I saw a girl in a dress. I can't remember the last time I saw a girl in a dress. I can't remember the last time I saw a girl in"

Notes about GPT2's behaviour with padding and attention mask:

These notes show that, for GPT2, right padding and left padding can behave very differently, both with and without an attention_mask. (Using the attention_mask is the most intuitive approach.) The reason is not yet fully understood, but this result will guide the upcoming batching solution.
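To make the two padding layouts concrete, here is a minimal sketch that builds the padded `input_ids` and the matching attention mask (0 on padded positions, 1 elsewhere) in plain Python. The helper `pad_batch` and its `side` parameter are hypothetical names introduced for illustration; the token ids and the pad id 220 are the ones used in these notes.

```python
PAD_ID = 220  # id of the space token ' ' used as padding in this PR


def pad_batch(sequences, pad_id=PAD_ID, side="right"):
    """Pad token-id lists to equal length; return (input_ids, attention_mask)."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        pad = [pad_id] * (max_len - len(seq))
        zeros = [0] * len(pad)  # mask value for padded positions
        ones = [1] * len(seq)   # mask value for real tokens
        if side == "right":
            input_ids.append(list(seq) + pad)
            attention_mask.append(ones + zeros)
        else:
            input_ids.append(pad + list(seq))
            attention_mask.append(zeros + ones)
    return input_ids, attention_mask


batch = [
    [29744, 28478, 5834, 318],                               # "DeepMind Company is"
    [13579, 1749, 1061, 502, 1364, 290, 826, 13, 314, 460],  # second prompt
]
right_ids, right_mask = pad_batch(batch, side="right")
left_ids, left_mask = pad_batch(batch, side="left")
```

The nested lists can be wrapped with `torch.tensor(...)` before being passed to the model. The "without any attention mask" case corresponds to replacing every mask entry with 1.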

With an attention mask set to 0 on the padded tokens and 1 everywhere else, for the input_ids that correspond to the above input (the space token ' ' has id 220):

A. Right padding:

input_ids = torch.tensor([[29744, 28478, 5834, 318, 220, 220, 220, 220, 220, 220],
                   [13579, 1749, 1061, 502, 1364, 290, 826, 13, 314, 460]])

Output:

"DeepMind Company is      \xa0 a company that has been around for a long time and has been around for a long time. They have been around for a long time and have been around for a long time. They have been"

"Memories follow me left and right. I can't tell you how many times I've been told that I'm not a good person. I'm not a good person. I'm not a good person. I'm not a good person. I"

B. Left padding:

input_ids = torch.tensor([[220, 220, 220, 220, 220, 220, 29744, 28478, 5834, 318],
                   [13579, 1749, 1061, 502, 1364, 290, 826, 13, 314, 460]])

Output:

"      DeepMind Company is                                        "

"Memories follow me left and right. I can't tell you how many times I've been told that I'm not a good person. I'm not a good person. I'm not a good person. I'm not a good person. I"

Without any attention mask (i.e. all set to 1), for the input_ids that correspond to the above input (the space token ' ' has id 220):

A. Right padding:

Output:

"DeepMind Company is                                                                                  "

"Memories follow me left and right. I can't tell you how many times I've been told that I'm not a good person. I'm not a good person. I'm not a good person. I'm not a good person. I"

B. Left padding:

Output:

'      DeepMind Company is a subsidiary of DeepMind Technologies, Inc. DeepMind Technologies, Inc. is a subsidiary of DeepMind Technologies, Inc. is a subsidiary of DeepMind Technologies, Inc. is a subsidiary of Deep'

Memories follow me left and right. I can't tell you how many times I've been told that I'm not a good person. I'm not a good person. I'm not a good person. I'm not a good person. I.

KexinFeng · 2023-06-21T04:16:45Z

Merged in #2637

POC of LLMDecoder

0f8863f

KexinFeng requested review from zachgk, frankfliu and a team as code owners April 20, 2023 04:55

KexinFeng closed this Apr 25, 2023

KexinFeng deleted the greedy_and_beam branch April 25, 2023 06:38

KexinFeng restored the greedy_and_beam branch April 25, 2023 06:39

KexinFeng reopened this Apr 25, 2023

KexinFeng mentioned this pull request May 3, 2023

Batch the sequences with ContrastiveSeqBatchScheduler #2572

Closed

KexinFeng added 2 commits May 11, 2023 11:54

constrastiveSearch

1a8b39b

greedy_and_beam

6df14ac

KexinFeng force-pushed the greedy_and_beam branch from 71588b1 to 6df14ac Compare May 11, 2023 18:55

KexinFeng mentioned this pull request Jun 14, 2023

[api] implements text-generation search algorithm #2637

Merged

KexinFeng closed this Jun 21, 2023

xyang16 deleted the greedy_and_beam branch October 4, 2023 16:34
