Blank output at inference when using a custom-trained T5 model #226
Comments
Hi @gaurav21s, by "blank", do you mean all the outputs are empty/zeroed tensors? Do you have any warning or error messages? |
Hi Jonathan, by blank I mean it was giving blank space in the output.
|
I was running the inference for testing; the tokenizer was working fine, but the output after decoding was just blank space.
|
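(A hedged diagnostic sketch, not from the original thread: one way to tell a generation problem from a decoding problem is to inspect the raw token ids before decoding. If they are all pad/EOS ids, generation itself is producing nothing; `t5-base` below is a stand-in for the custom checkpoint.)

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "t5-base"  # stand-in for the custom fine-tuned checkpoint
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(model_name)

inputs = tokenizer("translate English to French: Hello world", return_tensors="pt").to("cuda")
with torch.inference_mode():
    output = model.generate(inputs["input_ids"], max_length=20)

# For T5, the pad id is 0 and the EOS id is 1; a run of those decodes to an empty string.
print(output[0])                                             # raw token ids
print(tokenizer.convert_ids_to_tokens(output[0].tolist()))   # token-level view
print(repr(tokenizer.decode(output[0], skip_special_tokens=True)))  # repr makes blanks visible
```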
Can confirm this problem also exists in the e2e T5 tutorial if one changes … For environment, I used this repo's Dockerfile to create a VSCode dev container inside Windows WSL2 with an RTX 3090 (24 GB). |
Thanks for your reports, I can confirm this bug and we're investigating it. Simple code to reproduce it:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

from kernl.model_optimization import optimize_model

model_name = "t5-base"
model = AutoModelForSeq2SeqLM.from_pretrained(pretrained_model_name_or_path=model_name).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(model_name)

input_ids = tokenizer(
    "translate English to French: The house in the woods is wonderful, can we buy it ?",
    return_tensors="pt",
    pad_to_multiple_of=8,
    padding=True,
).to("cuda")

optimize_model(model.encoder)
optimize_model(model.decoder)

with torch.inference_mode(), torch.cuda.amp.autocast(enabled=True, dtype=torch.float16, cache_enabled=True):
    output = model.generate(input_ids["input_ids"], min_length=22, max_length=22)

print(output[0])
print(tokenizer.decode(output[0], skip_special_tokens=True, clean_up_tokenization_spaces=True))
```

The decoded output displayed is blank. Disabling `replace_layer_norm_rms` fixes the issue. |
@jonathlela can you share a reproduction at the … |
Simple analysis (local machine, RTX 3090) seems to show that the input of … @gaurav21s, how did you train T5? With fp16, or bf16/fp32?
T5 base output: …
T5 large output: …
|
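(An aside with a hedged sketch: one quick way to check what precision a checkpoint was saved in is to look at the dtype recorded in its config and the dtypes of the loaded parameters; `t5-base` below is a stand-in for the custom checkpoint path.)

```python
from transformers import AutoModelForSeq2SeqLM

# "t5-base" is a stand-in for the custom fine-tuned checkpoint path.
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")
print(model.config.torch_dtype)               # dtype the weights were saved in, if recorded
print(next(model.parameters()).dtype)         # note: from_pretrained upcasts to fp32 by default
print({p.dtype for p in model.parameters()})  # all parameter dtypes actually present
```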
@jonathlela I have the same issue with optimized t5-base. You mentioned that disabling `replace_layer_norm_rms` fixes the issue. How can `replace_layer_norm_rms` be disabled? I've tried commenting out this line and re-running the code: …
|
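(A hedged sketch of one way to experiment with skipping the RMSNorm replacement. The import path is an assumption about kernl's internals and may differ between versions; if kernl imports the function by name at its call site, the patch has to target that call site instead.)

```python
# Assumed location of the replacement pass; adjust to wherever
# replace_layer_norm_rms actually lives in your kernl version.
import kernl.optimizer.layer_norm as kernl_layer_norm

# Turn the RMSNorm kernel replacement into a no-op before optimizing.
kernl_layer_norm.replace_layer_norm_rms = lambda gm: None

from kernl.model_optimization import optimize_model
optimize_model(model.encoder)  # `model` as in the repro script above
optimize_model(model.decoder)
```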
T5 weights are in BF16, and Triton 2.0 does not fully support BF16; we're waiting for the fix to be propagated. |
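(For context, a sketch of the math rather than of kernl's kernel: T5's "LayerNorm" is actually an RMSNorm, and the Hugging Face implementation accumulates the variance in fp32 even for fp16/bf16 inputs, which is exactly the kind of step a fused kernel with incomplete BF16 support can get wrong.)

```python
import torch

def t5_rms_norm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """RMSNorm as used by T5: no mean subtraction, no bias."""
    # Compute the variance in fp32 to avoid precision loss when x is
    # fp16 or bf16, then cast back to the weight's dtype for the scale.
    variance = x.to(torch.float32).pow(2).mean(dim=-1, keepdim=True)
    x_normed = x.to(torch.float32) * torch.rsqrt(variance + eps)
    return weight * x_normed.to(weight.dtype)
```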
Hello @jonathlela, now that triton-lang/triton#1306 is merged, would using the most recent OpenAI Triton with Kernl resolve this issue? |
Hello @jonathlela, would there be other large models we can try with Kernl? It seems like larger versions of the T5 model type do not work due to this issue. Could we try a GPT model? Is the above issue fixed? |
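(A hedged sketch of trying a decoder-only model with the same API as the repro above. Whether GPT-2 is fully supported by this kernl version is not confirmed in this thread, and the single `optimize_model(model)` call for a decoder-only model is an assumption.)

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from kernl.model_optimization import optimize_model

model = AutoModelForCausalLM.from_pretrained("gpt2").eval().cuda()
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Assumption: for a decoder-only model a single call suffices,
# unlike the separate encoder/decoder calls used for T5 above.
optimize_model(model)

inputs = tokenizer("The house in the woods is wonderful,", return_tensors="pt").to("cuda")
with torch.inference_mode(), torch.cuda.amp.autocast(enabled=True, dtype=torch.float16, cache_enabled=True):
    output = model.generate(inputs["input_ids"], max_length=24)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```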
Hi,
When I use the stock T5 models like 't5-small' or 't5-base', everything works fine and I get output. But when I try my custom T5-base model trained on my own data, the output is blank after the kernl optimization.
Can you please look into that, or do you have any idea what might cause it?