-
Notifications
You must be signed in to change notification settings - Fork 27k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tensor flow whisper model for audio classification #22109
Add tensor flow whisper model for audio classification #22109
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. |
…del in tensorflow
cd0aa86
to
246f2d2
Compare
…dd_TensorFlow_Whisper_model_for_audio_classification
I just had a few questions on how to proceed with adding the TensorFlow Whisper model, just to make sure I'm on the right track. (1) Just so that I am clear on what the task is asking for, I need to recreate what is being done in PR #21754, except in TensorFlow. So, more specifically recreate the WhisperForAudioClassification class in TensorFlow, within the modeling_tf_whisper.py file. (2) I see that there are a lot of additional lines of code within PR #21754 in various files that seem to be "registering" that the Whisper model now supports audio classification. Would I have to add any lines of code similar to this within my PR? Is there any documentation I can take a look at to learn more about this? (or anything that would help me understand more about this task in general) |
Hi @adit299 Thanks for opening this PR - excited to have this implemented in TF! Regarding your questions:
|
Super cool @adit299! Feel free to ping us if you have any more questions / queries! More than happy to help with the integration here! |
…dd_TensorFlow_Whisper_model_for_audio_classification
…dd_TensorFlow_Whisper_model_for_audio_classification
…dd_TensorFlow_Whisper_model_for_audio_classification
…dd_TensorFlow_Whisper_model_for_audio_classification
Hello, Just wanted to check in and provide an update. I have finished adding the TFWhisperForAudioClassification class within the modeling_tf_whisper.py file. One question regarding this: (1) Within the modeling_tf_auto.py file I don't see any OrderedDict named TF_MODEL_FOR_AUDIO_CLASSIFICATION_MAPPING_NAMES (or any OrderedDict that is equivalent to the MODEL_FOR_AUDIO_CLASSIFICATION_MAPPING_NAMES present within the modeling_auto.py file). I was wondering where the TFWhisperForAudioClassification class should go within the modeling_tf_auto.py file. I will continue work on developing the model tester, and will post any issues I run into here. |
@adit299 - that's great news on the update! For the auto mapping, if the tensorflow equivalent |
…_audio_classification
By propagate, do you mean just looking at that PR and using the code written for that task as help for this current task? If so, I have already been doing that. If you are referring to some other procedure please do let me know about this as I am not aware. That would certainly help! Questions I had: (1) I noticed that within the Pytorch implementation of the whisper tests, it refers to a class
(2) I was having trouble with translating the test_encoder_outputs method in TensorFlow. Mainly these lines: transformers/tests/models/whisper/test_modeling_tf_whisper.py Lines 963 to 966 in d204aea
Again, a bit confused about what Thanks again for the speedy responses! |
@adit299 By propagate, we mean apply the equivalent changes from the Wav2Vec2 PR to this PR - it won't be a direct copy-paste, but there will be large proportions in common. It's sounds like this is what you're doing, which is great :) With respect to your questions:
Yes, I don't think this class exists yet and you wouldn't have to add this class as part of this PR. Is there anything that should be added for the TF model tests @gante ? In terms of what these classes are doing, the mixin classes group together related functionality e.g. common tests that should be added to all models. For example, TFModelTesterMixin contains tests for the TensorFlow models. This way we can create other classes using a composition of mixins.
|
@amyeroberts there is no generation-specific test mixin for TF. |
…dd_TensorFlow_Whisper_model_for_audio_classification
Looks cool already @adit299! Let us know if you need a hand with the integration or when you'd like a PR review 🤗 |
…dd_TensorFlow_Whisper_model_for_audio_classification
Hey @adit299 - feel free to comment here when this PR is ready for review and we can take a look! Seems to be close to completion |
Hey @sanchit-gandhi, apologies for the delay! Yes, this PR is ready for review. I haven't had much luck in getting some tests to pass however. I appreciate any help you guys can provide by taking a look. |
@adit299 Unfortunately, diving into people's PRs to debug isn't something we can do as it's just not a scalable solution with a repo of this size. If you need help from us, then please share a detailed description of the issue, what you've tried already and ideally highlighting any relevant pieces of code. |
https://github.com/adit299/transformers into Add_TensorFlow_Whisper_model_for_audio_classification
Understandable, @amyeroberts . There are 5 tests failing right now. Here is all the information requested (to the best of my knowledge): FAILED test_modeling_tf_whisper.py::TFWhisperEncoderModelTest::test_compile_tf_model Error -
What I tried - I suspected it had something to do with: But that doesn't seem to be the case. Maybe the Whisper decoder is being mistakenly invoked? I am just not sure. FAILED test_modeling_tf_whisper.py::TFWhisperEncoderModelTest::test_hidden_states_output - AssertionError: Lists differ: [30, 16] != [60, 16] Error -
The assertion failing is:
What I tried - Not sure about this one. FAILED test_modeling_tf_whisper.py::TFWhisperEncoderModelTest::test_pt_tf_model_equivalence - AttributeError: tf_whisper_encoder_17.conv1.weight not found in PyTorch model Error -
What I tried - Not sure about this one as well FAILED test_modeling_tf_whisper.py::TFWhisperEncoderModelTest::test_resize_token_embeddings - NotImplementedError Error - What I tried - I think this one is out of my control FAILED test_modeling_tf_whisper.py::TFWhisperEncoderModelTest::test_save_load - TypeError: Exception encountered when calling layer 'tf_whisper_for_audio_classification_20' (type TFWhisperForAudioClassification What I tried - connected to the first error, solving that should solve this. Please do let me know if any other clarification is needed! Apologies for the long post! |
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored. |
Hi @adit299, thanks for giving more details about debugging the tests and apologies for the delay in my response. I suggest looking through the artefacts from the CI run, specifically failure_long.txt as they will give you a more detailed error message an trackback to help figure out the issues. test_modeling_tf_whisper.py::TFWhisperEncoderModelTest::test_compile_tf_model test_modeling_tf_whisper.py::TFWhisperEncoderModelTest::test_hidden_states_output test_modeling_tf_whisper.py::TFWhisperEncoderModelTest::test_pt_tf_model_equivalence If you create the TF whisper model with pytorch weights, do you get any warnings about weights being randomly initialized? test_modeling_tf_whisper.py::TFWhisperEncoderModelTest::test_resize_token_embeddings - NotImplementedError This is raised because the model doesn't have a test_modeling_tf_whisper.py::TFWhisperEncoderModelTest::test_save_load From the CI artefacts, it looks like this is failing because of |
…dd_TensorFlow_Whisper_model_for_audio_classification
…dd_TensorFlow_Whisper_model_for_audio_classification
Hello, Apologies for the delay. I am attempting to instantiate an instance of the TFWhisperForAudioClassification model to debug some of the issues I'm having. So, I try to run this:
I end up getting this error:
Which stems from these lines of code: transformers/src/transformers/models/auto/auto_factory.py Lines 701 to 707 in 080a971
When I run a debugger, the problematic statement is:
Just executing I have been trying to see what is causing this, but I'm at a loss. Is this why you suggest creating the model using a test config? Could you show how to do that if it is relevant to avoiding this error? I contemplated increasing the Recursion Depth on my machine (its currently at 1000), but I'm hesitant to think that would solve it. Thanks again for your patience, I realize I'm quite the n00b 😅 |
@@ -439,6 +445,10 @@ | |||
) | |||
|
|||
TF_MODEL_MAPPING = _LazyAutoMapping(CONFIG_MAPPING_NAMES, TF_MODEL_MAPPING_NAMES) | |||
TF_MODEL_FOR_AUDIO_CLASSIFICATION_MAPPING_NAMES = _LazyAutoMapping( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adit299 The recursion is happening because of this line - the variable is being assigned to itself
TF_MODEL_FOR_AUDIO_CLASSIFICATION_MAPPING_NAMES = _LazyAutoMapping( | |
TF_MODEL_FOR_AUDIO_CLASSIFICATION_MAPPING = _LazyAutoMapping( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this solves the issue!
Hello, I am currently attempting to resolve the error: Error -
Since this error is the root cause of several of the tests failing. I think the issue is that transformers/src/transformers/models/whisper/modeling_tf_whisper.py Lines 464 to 498 in 50573c6
I believe the
The number of tests failing reduces to 4. Although, obviously, this introduces new errors (I have attached the new errors at the bottom for reference). The pytorch equivalent to this method does not contain the transformers/src/transformers/models/whisper/modeling_whisper.py Lines 654 to 682 in 50573c6
My questions are: (1) Should I attempt to change the TensorFlow or (2) Is there some better way to proceed? Once this is resolved, I am very close to finishing with this pull request. Thanks again for your patience! New Errors:
|
@adit299
I'm going to be away mid-September to mid-October. If you have any other tensorflow specific questions, or questions about the differences between the TF and PT models, please ping @Rocketknight1 in my absence. |
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored. |
What does this PR do?
Adding support for audio classification within TensorFlow whisper model
Fixes #21777
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@sanchit-gandhi