
[VITS] Fix nightly tests #25986

Merged 8 commits into huggingface:main from vits-tests on Sep 7, 2023
Conversation

@sanchit-gandhi (Contributor) commented Sep 5, 2023

What does this PR do?

Fixes the tokenizer integration test and multi-GPU test that failed on the nightly run: https://github.com/huggingface/transformers/actions/runs/6068405445/job/16461410319

The tokenizer fix is trivial (needed to update the commit ID)!

The multi-GPU test was failing because the output sequence length for VITS depends on the content of the model inputs, not just on the input sequence lengths.

Let's say we have 2 GPUs over which we want to run DP:

  • GPU 1 outputs a sequence length of N, which is computed based on the input in the first element of the batch x
  • GPU 2 outputs a sequence length of M, which is computed based on the input in the second element of the batch y

=> there is nothing to enforce that N = M, since the VITS output sequence length is a function of the inputs. Thus, we cannot concatenate the outputs after running the forward pass, since they have different dims.

# pseudo code for data parallelism: split the batch across the two GPUs
input_1, input_2 = torch.chunk(input, chunks=2, dim=0)

output_1 = model(input_1)  # sequence length N
output_2 = model(input_2)  # sequence length M

# breaks when N != M, since output_1 and output_2 have different sequence lengths
output = torch.cat([output_1, output_2], dim=0)

The fix for the test is to pass the same inputs to both GPUs, and disable the stochastic duration predictor. This way, we get consistent outputs across our GPUs.
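A minimal toy sketch of the idea behind the fix, in plain Python with no actual VITS model (the `toy_tts_forward` function and its `stochastic` flag are hypothetical stand-ins, not the real transformers API): with the stochastic noise disabled and identical inputs on both replicas, the per-replica output lengths agree by construction, so the gather step succeeds.

```python
import random

def toy_tts_forward(token_ids, stochastic=True):
    """Hypothetical stand-in for a VITS forward pass: the output length
    depends on the input tokens and, if enabled, on random noise from
    the stochastic duration predictor."""
    length = sum(token_ids)  # output length is a function of the inputs
    if stochastic:
        length += random.randint(0, 3)  # duration predictor noise
    return [0.0] * length  # dummy waveform of that length

# The fix: feed the SAME input to both "GPUs" and disable the
# stochastic duration predictor, so N == M is guaranteed.
sample = [2, 3, 1]
out_gpu_1 = toy_tts_forward(sample, stochastic=False)
out_gpu_2 = toy_tts_forward(sample, stochastic=False)

assert len(out_gpu_1) == len(out_gpu_2)  # lengths agree -> gather is safe
gathered = out_gpu_1 + out_gpu_2  # analogous to torch.cat along the batch dim
```

With `stochastic=True`, or with different samples per replica, the two lengths can diverge and the concatenation step would fail, which mirrors the original test failure.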

@HuggingFaceDocBuilderDev commented Sep 5, 2023

The documentation is not available anymore as the PR was closed or merged.

@ydshieh (Collaborator) left a comment

Thanks a lot. Just a few nits for the comments.

(review comments on tests/models/vits/test_modeling_vits.py — resolved)
@ArthurZucker (Collaborator) left a comment

Thanks! The tokenizers integration tests should rather use k as padding as we discussed!

(review comment on tests/models/vits/test_tokenization_vits.py — resolved)
@sanchit-gandhi sanchit-gandhi merged commit 2af87d0 into huggingface:main Sep 7, 2023
3 checks passed
@sanchit-gandhi sanchit-gandhi deleted the vits-tests branch September 7, 2023 16:49
parambharat pushed a commit to parambharat/transformers that referenced this pull request Sep 26, 2023
* fix tokenizer

* make bs even

* fix multi gpu test

* style

* model forward

* fix torch import

* revert tok pin
blbadger pushed a commit to blbadger/transformers that referenced this pull request Nov 8, 2023
EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 18, 2023