add blenderbot_small, blenderbot #868
Conversation
Please install pre-commit for code style formatting as follows:
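The exact commands were not captured in this thread; a typical setup is to run pip install pre-commit and then pre-commit install from the repository root, optionally followed by pre-commit run --all-files to apply the checks to existing files once.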
Please add docstrings for all classes and methods that users might call. You can refer to paddlenlp.transformers.bert.
""" | ||
Format of Blenderbot sequence: ``X </s>`` | ||
:param token_ids_0: List[int] | ||
:param token_ids_1: List[int], optional | ||
:return: List[int] | ||
""" |
We use Google-style docstrings. Please refer to the Bert model code for an example.
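For example, the reST-style docstring above could be rewritten in Google style roughly as follows (the method name build_inputs_with_special_tokens is assumed from the usual tokenizer API, not taken from this diff):

def build_inputs_with_special_tokens(self, token_ids_0, token_ids_1=None):
    """
    Build model inputs from a sequence by appending the end-of-sequence token.

    A Blenderbot sequence has the following format: ``X </s>``

    Args:
        token_ids_0 (List[int]): List of IDs to which the special tokens will be added.
        token_ids_1 (List[int], optional): Not used; Blenderbot does not take sequence pairs.

    Returns:
        List[int]: List of input IDs with the appropriate special tokens appended.
    """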
    bpe_tokens.extend(
        bpe_token for bpe_token in self.bpe(token).split(' '))
return bpe_tokens
Please also include a public tokenize method.
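A minimal sketch of what that could look like, assuming the BPE splitting above lives in a private helper (the helper name _tokenize is illustrative only):

def tokenize(self, text):
    # Public entry point that exposes the BPE splitting shown above.
    # The private helper name _tokenize is an assumption; match it to the actual code.
    return self._tokenize(text)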
…kenizer, fix BlenderbotModel use_cache bug
# Copied from .paddlenlp.transformers.bart.modeling.shift_tokens_right
Delete the leading dot in .paddlenlp.transformers.bart.modeling.shift_tokens_right.
class BlenderbotLearnedPositionalEmbedding(Embedding):
    def __init__(self, num_embeddings, embedding_dim, padding_idx):
padding_idx is not used.
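A minimal sketch of the suggested signature change (only the constructor is shown; the rest of the class stays as is):

class BlenderbotLearnedPositionalEmbedding(Embedding):
    # padding_idx is dropped from the signature since it is never used.
    def __init__(self, num_embeddings, embedding_dim):
        super().__init__(num_embeddings, embedding_dim)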
self.embed_tokens = nn.Embedding(vocab_size, d_model, pad_token_id)
self.embed_scale = math.sqrt(d_model) if scale_embedding else 1.0
self.encoder_embed_positions = BlenderbotLearnedPositionalEmbedding(
    max_position_embeddings, d_model, pad_token_id)
Remove pad_token_id here if you change the BlenderbotLearnedPositionalEmbedding class definition.
self.encoder = BlenderbotEncoder(
    self.shared, vocab_size, pad_token_id, d_model, num_encoder_layers,
    encoder_attention_heads, encoder_ffn_dim, dropout,
    activation_function, attention_dropout, activation_dropout,
    max_position_embeddings, init_std, scale_embedding, normalize_before)

self.decoder = BlenderbotDecoder(
    self.shared, vocab_size, pad_token_id, d_model, num_decoder_layers,
    decoder_attention_heads, decoder_ffn_dim, dropout,
    activation_function, attention_dropout, activation_dropout,
    max_position_embeddings, init_std, scale_embedding, normalize_before)
We need to specify the keyword for each named argument.
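For example, the encoder construction above could be written with explicit keywords; note that the name of the first parameter (embed_tokens) is an assumption about the BlenderbotEncoder signature and should be checked against the actual definition:

self.encoder = BlenderbotEncoder(
    embed_tokens=self.shared,  # parameter name assumed; match it to the real signature
    vocab_size=vocab_size,
    pad_token_id=pad_token_id,
    d_model=d_model,
    num_encoder_layers=num_encoder_layers,
    encoder_attention_heads=encoder_attention_heads,
    encoder_ffn_dim=encoder_ffn_dim,
    dropout=dropout,
    activation_function=activation_function,
    attention_dropout=attention_dropout,
    activation_dropout=activation_dropout,
    max_position_embeddings=max_position_embeddings,
    init_std=init_std,
    scale_embedding=scale_embedding,
    normalize_before=normalize_before)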
decoder_output = self.decoder(decoder_input_ids, decoder_attention_mask,
                              encoder_output, memory_mask, use_cache, cache)
We need to specify the keyword for each named argument here as well.
"attention_mask": attention_mask, | ||
"use_cache": use_cache, | ||
"cache": cache | ||
} |
Please add a class named BlenderbotForCausalLM.
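A rough skeleton of what such a class might look like; every detail below (base class, weight shapes, and especially the decoder call signature) is an assumption for illustration, not the required implementation:

class BlenderbotForCausalLM(BlenderbotPretrainedModel):
    """Decoder-only language-model head on top of the Blenderbot decoder."""

    def __init__(self, blenderbot):
        super().__init__()
        self.blenderbot = blenderbot
        self.lm_head_weight = self.create_parameter(
            shape=[blenderbot.config['vocab_size'], blenderbot.config['d_model']],
            dtype=blenderbot.shared.weight.dtype,
            is_bias=False)
        self.apply(self.init_weights)

    def forward(self, input_ids=None, attention_mask=None, use_cache=False, cache=None):
        # Run only the decoder; this call signature is assumed and must be
        # matched to the real BlenderbotDecoder.forward.
        decoder_output = self.blenderbot.decoder(input_ids, attention_mask,
                                                 use_cache=use_cache, cache=cache)
        hidden_states = decoder_output[0] if use_cache else decoder_output
        lm_logits = paddle.matmul(hidden_states, self.lm_head_weight, transpose_y=True)
        if use_cache:
            return lm_logits, decoder_output[1]
        return lm_logits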
super(BlenderbotTokenizer, self).__init__(vocab_file, merges_file, errors,
                                          max_len, special_tokens, pad_token,
                                          eos_token, eol_token)
We need to specify the keyword for each named argument.
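For instance, assuming the parent __init__ uses the same parameter names as the local variables here (an assumption to verify):

super(BlenderbotTokenizer, self).__init__(
    vocab_file=vocab_file,
    merges_file=merges_file,
    errors=errors,
    max_len=max_len,
    special_tokens=special_tokens,
    pad_token=pad_token,
    eos_token=eos_token,
    eol_token=eol_token)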
        self.init_std = init_std
        self.pad_token_id = pad_token_id
        self.bos_token_id = bos_token_id
        self.eos_token_id = eos_token_id
        self.decoder_start_token_id = decoder_start_token_id
        self.shared = nn.Embedding(vocab_size, d_model, pad_token_id)
        self.encoder = BlenderbotSmallEncoder(
            self.shared, vocab_size, pad_token_id, d_model, num_encoder_layers,
            encoder_attention_heads, encoder_ffn_dim, dropout,
            activation_function, attention_dropout, activation_dropout,
            max_position_embeddings, init_std, scale_embedding, normalize_before)

        self.decoder = BlenderbotSmallDecoder(
            self.shared, vocab_size, pad_token_id, d_model, num_decoder_layers,
            decoder_attention_heads, decoder_ffn_dim, dropout,
            activation_function, attention_dropout, activation_dropout,
            max_position_embeddings, init_std, scale_embedding, normalize_before)
        self.apply(self.init_weights)

    def forward(self,
                input_ids=None,
                attention_mask=None,
                decoder_input_ids=None,
                decoder_attention_mask=None,
                encoder_output=None,
                use_cache=False,
                cache=None):
        if decoder_input_ids is None:
            decoder_input_ids = shift_tokens_right(input_ids,
                                                   self.decoder_start_token_id)
        if encoder_output is None:
            encoder_output = self.encoder(input_ids, attention_mask)
        memory_mask = paddle.cast(
            input_ids == self.pad_token_id,
            dtype=paddle.get_default_dtype()).unsqueeze([1, 2]) * -1e9
        memory_mask.stop_gradient = True

        decoder_output = self.decoder(decoder_input_ids, decoder_attention_mask,
                                      encoder_output, memory_mask, use_cache, cache)
        # Also return the encoder output so generation can reuse it across decoding steps.
        return decoder_output, encoder_output


class BlenderbotSmallForConditionalGeneration(BlenderbotSmallPretrainedModel):
    def __init__(self, blenderbot_small):
        super().__init__()
        self.eos_token_id = blenderbot_small.eos_token_id
        self.bos_token_id = blenderbot_small.bos_token_id
        self.pad_token_id = blenderbot_small.pad_token_id
        self.blenderbot_small = blenderbot_small
        self.lm_head_weight = self.create_parameter(
            shape=[
                self.blenderbot_small.config['vocab_size'],
                self.blenderbot_small.config['d_model']
            ],
            dtype=self.blenderbot_small.shared.weight.dtype,
            is_bias=False)
        self.register_buffer(
            "final_logits_bias",
            paddle.zeros((1, self.blenderbot_small.config['vocab_size']),
                         dtype=paddle.get_default_dtype()))
        self.apply(self.init_weights)

    def forward(self,
                input_ids=None,
                attention_mask=None,
                decoder_input_ids=None,
                decoder_attention_mask=None,
                encoder_output=None,
                use_cache=False,
                cache=None):
        decoder_outputs, encoder_output = self.blenderbot_small(
            input_ids, attention_mask, decoder_input_ids,
            decoder_attention_mask, encoder_output, use_cache, cache)

        lm_logits = paddle.tensor.matmul(
            decoder_outputs[0] if use_cache else decoder_outputs,
            self.lm_head_weight,
            transpose_y=True) + self.final_logits_bias
        if use_cache:
            cache = decoder_outputs[1]
            return lm_logits, cache
        return lm_logits

    def prepare_inputs_for_generation(self,
                                      decoder_input_ids,
                                      attention_mask=None,
                                      encoder_output=None,
                                      use_cache=True,
                                      cache=None,
                                      **kwargs):
        if cache is not None:
            decoder_input_ids = decoder_input_ids[:, -1:].unsqueeze(-1)

        return {
            # During prediction encoder_output is provided, so input_ids are not needed.
            "input_ids": None,
            "decoder_input_ids": decoder_input_ids,
            "encoder_output": encoder_output,
            "attention_mask": attention_mask,
            "use_cache": use_cache,
            "cache": cache
        }
Please refer to the review comments on the Blenderbot modeling.py; the same changes apply here.
…od key, add blenderbot public tokenize
Thanks for your contributions again! We have recently merged a PR that adds generate-API support for encoder-decoder models. Please add an example for the Blenderbot models. :)
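A minimal sketch of such an example; the pretrained weight name ("blenderbot_small-90M"), the decoding arguments, and the exact generate return values are assumptions to verify against the final implementation:

import paddle
from paddlenlp.transformers import (BlenderbotSmallTokenizer,
                                    BlenderbotSmallForConditionalGeneration)

# Pretrained weight name is illustrative only.
tokenizer = BlenderbotSmallTokenizer.from_pretrained("blenderbot_small-90M")
model = BlenderbotSmallForConditionalGeneration.from_pretrained("blenderbot_small-90M")

text = "My friends are cool but they eat too many carbs."
input_ids = paddle.to_tensor([tokenizer(text)["input_ids"]])

# Decode with the newly merged generate API; decoding arguments are illustrative.
output_ids, _ = model.generate(input_ids=input_ids,
                               max_length=60,
                               decode_strategy="beam_search",
                               num_beams=2)
print(tokenizer.convert_ids_to_tokens(output_ids[0].numpy().tolist()))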
LGTM
PR types
New Features
PR changes
Models
Description
Add Blenderbot and BlenderbotSmall models in paddlenlp/transformers/.