
Added constrained decoding (#1536) #2402

Closed · 27 commits

Conversation

@mjpost (Contributor) commented Jul 31, 2020

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
  • Did you read the contributor guideline?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

What does this PR do?

This PR implements constrained decoding (Hokamp & Liu, 2017; Post & Vilar, 2018) with vectorization for batching (Hu et al., 2019). In addition, it adds ordered constraints, where the constraints must be generated on the target side in the given order, with zero or more unconstrained tokens in between (e.g., with ordered constraints "hard" and "influence", "hard" must appear in the output before "influence"). This variant allows for optimizations that increase speed and BLEU scores (when testing with random scraps from the references).

Usage and quick start

It works with fairseq-interactive via a new command-line option, fairseq-interactive --constraints [ordered,unordered], defaulting to ordered if no value is provided. When active, each line read from STDIN is split on \t: the first field is the source sentence, and each subsequent tab-separated field is a constraint. For example (after downloading the Fairseq WMT19 German-English model):

echo -e "Die maschinelle Übersetzung ist schwer zu kontrollieren.\thard\tinfluence" \
  | [normalize.py](https://gist.github.com/mjpost/4c54446b7030d7c64b57461d27090650) \
  | [tok.py](https://gist.github.com/mjpost/ed7456f6a987c533102fc121678ed302) \
  | PYTHONPATH=$HOME/code/fairseq-constraints fairseq-interactive $modeldir \
  --bpe fastbpe \
  --bpe-codes $modeldir/bpecodes \
  --constraints \
  --constraints-both
  -s de -t en \
  --path $modeldir/model1.pt \
  --max-tokens 1000 \
  --beam 5 \

Adding the --constraints-both option causes it to batch-decode the input sentence both with and without the constraints. When run with the Fairseq WMT19 German-English model, the following results are produced (here run on a CPU, so don't be alarmed by the times!):

S-0     Die masch@@ in@@ elle Über@@ setzung ist schwer zu kontrollieren .
W-0     1.844   seconds
C-0     hard
C-0     influence
H-0     -1.5333266258239746     Mach@@ ine trans@@ lation is hard to influence .
D-0     -1.5333266258239746     Machine translation is hard to influence .
P-0     -0.5434 -0.1423 -0.1930 -0.1415 -0.2346 -1.8031 -0.1701 -11.7727 -0.1815 -0.1511
S-0     Die masch@@ in@@ elle Über@@ setzung ist schwer zu kontrollieren .
W-0     1.844   seconds
H-0     -0.3731671869754791     Mach@@ ine trans@@ lation is difficult to control .
D-0     -0.3731671869754791     Machine translation is difficult to control .
P-0     -0.5434 -0.1423 -0.1930 -0.1415 -0.2346 -1.1430 -0.1665 -0.8482 -0.1678 -0.1514
2020-07-31 12:17:55 | INFO | fairseq_cli.interactive | Total time: 12.803 seconds; translation time: 3.688

Note the new tags present in the output:

  • C-# records active constraints (after applying preprocessing) for a sentence
  • W-# reports the sentence-level translation time (a useful, if unrelated, feature I hope you'll accept)

Some unit tests are written (fairseq/test_constraints.py) but not yet integrated. Advice on where to place these is welcome. I also have not run this through lint; if someone can tell me the command to run, I'd appreciate it.

Implementation notes

This is largely self-contained, implemented in a new LexicallyConstrainedBeamSearch class in search.py. It does require a few minimal hooks from _generate() in sequence_generator.py, to ensure that constraints are updated at each timestep. (Edit: most changes in that file are documentation clarifications, corrections, and updates). Unconstrained sentences that are intermingled with constrained ones will not incur any time penalty, so long as they do not occur in the same batch.

Addresses #1536.
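For reference, the hook pattern looks roughly like this. This is a minimal sketch with illustrative bodies (flat list-based states, a simplified reordering), not the actual implementation in fairseq/search.py:

class Search:
    # Base search strategy: the constraint hooks are no-ops by default,
    # which is why unconstrained batches pay no time penalty.
    def init_constraints(self, batch_constraints, beam_size):
        pass  # nothing to set up

    def update_constraints(self, active_hypos):
        pass  # nothing to track


class LexicallyConstrainedBeamSearch(Search):
    # Maintains one constraint state per beam item.
    def init_constraints(self, batch_constraints, beam_size):
        # one copy of each sentence's constraints for every beam slot
        self.states = [[list(cons) for _ in range(beam_size)]
                       for cons in batch_constraints]

    def update_constraints(self, active_hypos):
        # reorder the per-beam states to follow the hypotheses that
        # survived this decoding step
        self.states = [[sent_states[i] for i in hypos]
                       for sent_states, hypos in zip(self.states, active_hypos)]

_generate() calls init_constraints() once before decoding and update_constraints() at each timestep; since the base-class versions are no-ops, other search strategies are unaffected.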

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@mjpost mjpost marked this pull request as draft August 4, 2020 12:37
@mjpost (Contributor, Author) commented Aug 4, 2020

I'm marking this as a draft until I can straighten out all the test cases, which are giving me some trouble, in part because I get different results when running them locally.

@mjpost (Contributor, Author) commented Aug 5, 2020

Okay, the code has been modified and improved so that all tests pass.

@mjpost mjpost marked this pull request as ready for review August 5, 2020 15:59
@jhcross (Contributor) left a comment

This looks great to me, but I'd like one of the Fairseq maintainers to take a look before merging.

@mjpost (Contributor, Author) commented Aug 7, 2020

Oh, on the hooks: yes, now I see; this makes sense. I only added the stubs throwing NotImplementedError at the end, to make the test cases pass, but now that they're there, you're right that it makes sense to just call them as no-ops instead of using an if statement. That simplifies a lot.

@alexeib (Contributor) commented Aug 7, 2020

re: "I agree the current approach is a bit cumbersome. I added all of those in order to get the test cases to pass. One counterargument is that constrained decoding has also been implemented for the Levenshtein Transformer, and there's no reason it couldn't be made to work with the multilingual decoders, too. We could add them just to these subsets, but it might be equally messy. I'll follow your suggestion here, of course."

If it's already applicable to more than one task, then that's fine; we can keep the flags in options, but maybe we can add the error-throwing somewhere else. For example, you could add a property "supports_constraints" on the base task class that returns False, and override it for tasks that do support them. Then, in a single place, check that property and throw if constraints are set but the task does not support them?

@mjpost (Contributor, Author) commented Aug 13, 2020

I did add this supports_constraints variable, defaulting to False in the base class and currently set to True only for LexicallyConstrainedBeamSearch. The check is done in generate(), since constraints are defined at the batch level.
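As a minimal sketch of that pattern (illustrative function and message; the actual check lives in the sequence generator):

class Search:
    supports_constraints = False  # default: strategy cannot honor constraints


class LexicallyConstrainedBeamSearch(Search):
    supports_constraints = True


def generate(search_strategy, constraints=None):
    # single, early check: fail fast if constraints were passed to a
    # search strategy that cannot enforce them
    if constraints is not None and not search_strategy.supports_constraints:
        raise NotImplementedError(
            "Target-side constraints were provided, but the search "
            "strategy does not support them"
        )
    # ... decoding proceeds here ...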

@mjpost (Contributor, Author) commented Aug 17, 2020

I just added an example of how to use constrained decoding under examples/, and then listed it in the top-level README. I think this is all set.

@alexeib (Contributor) left a comment

Looks good; see a few small comments inline. @myleott may also want to take a look.

@myleott (Contributor) left a comment

This looks great to me! Made a few comments below. I'm also going to "import" this to run any internal downstream unit/integration tests.

@facebook-github-bot (Contributor) left a comment
@myleott has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


@myleott merged this pull request in bd1b35d.

@PolKul commented Apr 5, 2021

Hi @mjpost, I'm trying to adapt your LexicallyConstrainedBeamSearch to a transformer model with the bytelevelbpe tokenizer (subword-level tokens), but it doesn't seem to work well with it: it copies the constraints to the beginning of the decoded sequence and then emits the end-of-sequence token, regardless of the input.

It does work well with bpe (the word-level tokenizer), though, producing rich outputs with the constraints in the correct positions.

What do you think could be the problem with bytelevelbpe?

Thanks

@PolKul commented Apr 5, 2021

My bad, I had just added eos to the constraints :) Problem solved.

@mjpost (Contributor, Author) commented Apr 6, 2021

Hmm, EOS to the constraints? Doesn't that force them to be applied at the end of the sentence? I'd be curious to understand this better, but I'm glad you have it working.

(BTW, I have found a bug in the constraint tracking. If a constraint is interrupted, instead of restarting the tracking at the beginning of that constraint, it starts over entirely. This only has an effect if you have multiple constraints. I'll have a fix for this in soon.)
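To make the bug concrete, here is a toy sketch of per-constraint progress tracking; the names and the flat-list representation are illustrative, not the PR's actual state-tracking code:

# state = (constraint_index, position_within_constraint); constraints is an
# ordered list of token sequences, e.g. [["hard"], ["to", "influence"]]
def advance(state, token, constraints):
    idx, pos = state
    if idx >= len(constraints):
        return state  # all constraints already satisfied
    if token == constraints[idx][pos]:
        pos += 1
        if pos == len(constraints[idx]):
            return (idx + 1, 0)  # constraint completed; move to the next one
        return (idx, pos)
    # Interrupted mid-constraint: the fix restarts only the current
    # constraint, i.e. (idx, 0); the bug restarted tracking entirely,
    # i.e. (0, 0), forgetting earlier completed constraints. (A full
    # implementation would also re-check whether the interrupting token
    # itself begins the current constraint.)
    return (idx, 0)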

@PolKul commented Apr 6, 2021

No, if I add eos to the constraints it just outputs the constraints, and setting min_length doesn't help. But I should say that I'm using it with my custom transformer model, so maybe something else is affecting it. Do you see different behavior with the eos?

@mjpost (Contributor, Author) commented Apr 6, 2021

I'm not quite sure what you mean by "the eos". I assumed you meant that you were appending it to the constraints. It's hard to answer without knowing exactly what your input and command invocation are.

@PolKul commented Apr 6, 2021

I mean that adding the eos token to the end of the constraints (I have just one constraint) makes the decoder output only that constraint. Changing the beam size or min_length doesn't help. I can only produce rich sentences without eos in the constraints. Let me know if you can reproduce the same...

@mjpost (Contributor, Author) commented Apr 6, 2021

If you post a minimal working example (input and command), I can take a look.

@PolKul commented Apr 6, 2021

Well, it's a bit tricky to provide a working example because, as I said, I'm using it in my custom transformer (from the ParlAI library), so I'd have to strip out a lot of code. But basically, if you initialize it like so:


# tokenize the constraint and the input
seed = parse('a feeling of attraction')
input = parse('what is love?')
# initialize constrained beam search
self.search.init_constraints([seed], beam_size)
# run the rest of the decoder code below...
# ...
# if you add [eos] to the seed, the output is just the constraint;
# without [eos] in the seed, the output is a nice long sentence

@PolKul commented Apr 12, 2021

Hi @mjpost, I have just added my developments to the ParlAI repo. Can you please check it here: https://github.com/PolKul/ParlAI

If you could run tests/run_constrained_beam_search.py with your constraints list and see how it works, I would appreciate it. I see several problems:

  1. Currently I cannot use more than one constraint in the list.
  2. When using beam_context_block_ngram > 0, it outputs garbage in the generated text for the second and all subsequent utterances.

Thanks for your help.

@PolKul commented Apr 12, 2021

I've started a new discussion about the ParlAI implementation here: facebookresearch/ParlAI#3582

@jhkd-kevin commented

Before asking:

  • search the issues.
  • search the docs.

What is your question?

I reimplemented the constrained decoding example from examples/constrained_decoding and got it working. But when I use my own model.pt instead of the WMT model, to do Vi-En constrained translation, I run into this issue:

RuntimeError: Error(s) in loading state_dict for TransformerModel:
Unexpected key(s) in state_dict: "encoder.cons_pos_embed._float_tensor", "encoder.seg_embed.weight", "decoder.ptrnet.linear.weight", "decoder.ptrnet.linear.bias".
size mismatch for encoder.embed_tokens.weight: copying a param with shape torch.Size([42296, 512]) from checkpoint, the shape in current model is torch.Size([42295, 512]).
size mismatch for decoder.embed_tokens.weight: copying a param with shape torch.Size([42296, 512]) from checkpoint, the shape in current model is torch.Size([42295, 512]).
size mismatch for decoder.output_projection.weight: copying a param with shape torch.Size([42296, 512]) from checkpoint, the shape in current model is torch.Size([42295, 512]).
Code

echo -e "Cảm ơn bạn"
| python normalize.py | python tok.py
| fairseq-interactive /public/home/zhchynnu/perl5/ourmodel/examples/constrained_decoding/data
--path /public/home/zhchynnu/perl5/ourmodel/examples/constrained_decoding/path/ourmodel.pt
--bpe fastbpe
--bpe-codes /public/home/zhchynnu/perl5/ourmodel/examples/constrained_decoding/path/ourbpecodes
--constraints
-s vi -t en
--beam 10
What have you tried?

I suspect the problem is related to the model class stored in the .pt checkpoint.
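If it helps to narrow this down, one way to inspect what the checkpoint actually contains (assuming a standard fairseq checkpoint layout; the path is illustrative):

# compare the architecture and vocabulary sizes the checkpoint was trained
# with against the current model and data directory
import torch

state = torch.load("path/ourmodel.pt", map_location="cpu")
print(state["args"])  # training-time architecture and hyperparameters
print(state["model"]["encoder.embed_tokens.weight"].shape)  # rows = source vocab size

The unexpected keys (encoder.cons_pos_embed, decoder.ptrnet.*) suggest the checkpoint was trained with a modified TransformerModel, and the off-by-one embedding sizes (42296 vs. 42295) suggest the dictionaries in the data directory do not exactly match the ones used at training time.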
What's your environment?

fairseq version (e.g., 1.0 or master): fairseq==0.10.2
PyTorch version (e.g., 1.0): torch==1.5.0+cu101
OS (e.g., Linux): Linux
How you installed fairseq (pip, source): pip
Build command you used (if compiling from source):
Python version: 3.6
CUDA/cuDNN version: 10.1
GPU models and configuration: 2060
Any other relevant information: No

@mjpost
