[CLAP] Add CLAP to the library #21370

ArthurZucker · 2023-01-30T14:15:51Z

What does this PR do?

Adds CLAP to the HF library. cc @younesbelkada

sanchit-gandhi

Super nice PR! Thanks @ArthurZucker and @younesbelkada for your work here!

src/transformers/models/clap/configuration_clap.py

src/transformers/audio_utils.py

src/transformers/models/clap/modeling_clap.py

ArthurZucker · 2023-02-15T17:01:22Z

Found a few discrepancies: several variables in the config are not used. Will finish ASAP, which should make things clearer.

src/transformers/models/clap/feature_extraction_clap.py

ArthurZucker · 2023-02-16T16:15:23Z

Ok I broke the history

* add model like clip
* update
* text model ok
* clap text works
* some refactor
  - `CLAPVision` to `CLAPAudio`
  - refactor kwargs of audio modules
* more refactor
* more refactor
* more refactor
* correct fusion
* more refactor
* new modules
* add basic processor
* fixup
* remove whisper copioed from
* audio logits match
* add doc
* correct filters mel and add maxlength
* style
* few fixes
* forward passes
* fixup
* fixup
* some clean up
* remove mels form the dictionnary
* pad after the repeat
* update padding when dsmaller
* fix padding
* style
* use swin patch merging
* use copied from swin
* processor with any tokenizer
* more copied from
* some clean up
* more refactor
* fix mel when rand_trunc
* style
* remove unused imports
* update processing
* remove image processing tests
* add testing fiel
* fixmodeling issues
* replace with `is_longer`
* clap in serialization
* more refactor
* `make fixup`
* make fixup
* fix feature extractor
* update test feature extractor
* `make fixup`
* clean up config
* more clean up
* more cleanup
* update tests
* refactor tests and inits
* removeCLAP vision config
* remove CLAP from image procssing auto and dummy vision objects
* update inits
* style
* re order classes in modeling clap
* Use roberta tokenizer as the other weights are not open sourced
* small cleaup
* remove tokenization CLAP
* processor tokenizr is roberta
* update feature extraction doc
* remove vclap from model zero shot
* update f_min and f_max to frequency_xx
* some changes
  - fix modeling keys
  - add `is_longer` in the forward pass
  - make fixup
* make fixup
* consistent behavior ebtween rand_crop and fusion
* add numpy resize and bilinear and documentation
* move resizing to image utils
* clean feature extraction
* import resize from correct file
* resize in image transforms
* update
* style
* style
* nit
* remove unused arguments form the feature extractor
* style
* few fixes + make fixup
* oops
* fix more tests
* add zero shot audio classification pipeline
* update zeroshot classification pipeline
* fixup
* fix copies
* all CI tests pass
* make fixup + fix docs
* fix docs
* fix docs
* update tests pip;eline
* update zero shot pipeline
* update feature extraction clap
* update tokenization auto
* use nested simplify
* update pipeline tests
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* split in two lines
* fixes
* refactor
* clean up
* add integration tests
* update config docstring
* style
* update processor
* fix processor test
* fix feat extractor tests
* update docs
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix readmes
* fix tips
* Update src/transformers/models/auto/configuration_auto.py
* update doc and remove todo -> properly explained
* fix idx and typo
* typoe
* cleanup config
* cleanup tests, styles and doc
* ignore docstyle on image transform
* add conversion script
* remove the `clap` indx in favor of `CLAP`
* update __init
* nits
* Update src/transformers/pipelines/__init__.py
* fix bug
* clarifiy config
* fix copy
* fix init
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix model output
* fix comment
* make fixup
* make fixup
* rename to `Clap`
* replace to `Clap`
* replace to `Clap`
* repo consistency
* again repo-consistency
* make fixup
* Apply suggestions from code review
  Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* add config
* changes
* update conversion
* Apply suggestions from code review
  Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove unused function
* update based on code reviews
* style
* more comments
* cleanup
* clean up
* style
* apply suggestions
* Empty commit
* pipeline will be added in a different PR
* update calls to audio utils functions
* update pipeline init
* style
* style
* styling again
* use pad
* fix repo-consistency
* update utils and add doc for audio utils
* clean up resize by using torch. update inits accordingly
* style
* CLap's tokenizer is RobertA
* add audio utils to internal toctreee
* update totctree
* style
* update documentation and normalize naming accross audio utils and feature extraction clap
* style
* clean up
* update doc and typos
* fix doctest
* update modelin code, got rid of a lot of reshaping
* style on added doc audio utils
* update modeling clap
* style
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* docstringvariables with CLAP
* rename key
* update modeling CLAP
* update audio utils docstring
* update processing clap
* fix readmes
* fix toctree
* udpate configuration clap
* fix init
* make fixup
* fix
* fix
* update naming
* update
* update checkpoint path
* Apply suggestions from code review
* Major refactoring
* Update src/transformers/models/clap/configuration_clap.py
* merge

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
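For readers unfamiliar with the "pad after the repeat" and "update padding when dsmaller" entries in the commit message, here is a minimal sketch of what repeating a short clip and then zero-padding it could look like. This is an illustration under assumptions, not the code from this PR: the function name `repeat_then_pad` and its `max_length` parameter are hypothetical, and the behavior for inputs that are already long enough (plain truncation) is a simplification.

```python
import numpy as np


def repeat_then_pad(waveform: np.ndarray, max_length: int) -> np.ndarray:
    """Tile a short 1-D waveform as many whole times as fit, then zero-pad the rest.

    Hypothetical helper for illustration only; not taken from the PR.
    """
    if waveform.ndim != 1 or waveform.shape[0] == 0:
        raise ValueError("expected a non-empty 1-D waveform")
    if waveform.shape[0] >= max_length:
        # Simplification: inputs that are long enough are just cut to size.
        return waveform[:max_length]
    # Number of complete copies of the clip that fit into max_length.
    n_repeat = max_length // waveform.shape[0]
    tiled = np.tile(waveform, n_repeat)
    # Pad only on the right, with zeros, up to exactly max_length samples.
    return np.pad(tiled, (0, max_length - tiled.shape[0]), mode="constant", constant_values=0)


short_clip = np.array([1.0, 2.0, 3.0])
padded = repeat_then_pad(short_clip, max_length=8)
print(padded)
```

Padding after tiling, rather than before, keeps the zeros at the very end of the signal instead of interleaving silence between the repeated copies.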

ArthurZucker and others added 30 commits January 30, 2023 14:14

add model like clip

c6b06b5

update

7547c82

text model ok

23c56ac

Merge branch 'add-clap-model' of https://github.com/ArthurZucker/transformers into HEAD

405e653

clap text works

1771782

some refactor

9b276aa

- `CLAPVision` to `CLAPAudio` - refactor kwargs of audio modules

more refactor

45c36ba

more refactor

abee382

more refactor

a7219ec

correct fusion

c553315

more refactor

4360623

new modules

e3aff6f

add basic processor

00eb73b

fixup

45e7ce9

remove whisper copioed from

b5c483f

Merge branch 'add-clap-model' of https://github.com/ArthurZucker/transformers into add-clap-model

4084369

audio logits match

fc0d323

add doc

8a27723

correct filters mel and add maxlength

27f133f

style

5ddc2f3

Merge branch 'add-clap-model' of https://github.com/ArthurZucker/transformers into add-clap-model

fbb6124

few fixes

815c5ce

forward passes

95c400b

fixup

a41ff1a

Merge branch 'add-clap-model' of https://github.com/ArthurZucker/transformers into add-clap-model

38f7fe8

fixup

275633c

some clean up

1b3a820

remove mels form the dictionnary

8fed2d0

Merge branch 'add-clap-model' of https://github.com/ArthurZucker/transformers into add-clap-model

2684c70

pad after the repeat

6b9051c

younesbelkada and others added 10 commits February 14, 2023 10:08

Merge branch 'add-clap-model' of https://github.com/ArthurZucker/transformers into add-clap-model

c4cdd97

fix toctree

de162eb

udpate configuration clap

83d0716

Merge branch 'add-clap-model' of https://github.com/ArthurZucker/transformers into add-clap-model

0a25e75

fix init

230b516

make fixup

f379031

fix

fe1fbe3

fix

75171e3

update naming

2fba86b

Merge branch 'add-clap-model' of https://github.com/ArthurZucker/transformers into add-clap-model

d75a5f8

ArthurZucker added the New model label Feb 14, 2023

ArthurZucker mentioned this pull request Feb 14, 2023

[Pipeline] Add zero shot audio classificatoin pipeline #21600

Merged

younesbelkada requested a review from sgugger February 15, 2023 08:11

sanchit-gandhi approved these changes Feb 15, 2023

ArthurZucker commented Feb 15, 2023

src/transformers/models/clap/modeling_clap.py (outdated, resolved)

ArthurZucker commented Feb 15, 2023

src/transformers/models/clap/modeling_clap.py (outdated, resolved)

ArthurZucker commented Feb 16, 2023

ArthurZucker and others added 7 commits February 16, 2023 16:22

update

c221e1d

update checkpoint path

25610ce

Apply suggestions from code review

6856ff0

Major refactoring

1ce6363

Update src/transformers/models/clap/configuration_clap.py

a8dc9a4

merge

9b5b252

Merge branch 'main' of https://github.com/huggingface/transformers into add-clap-model

66b7dfe

ArthurZucker force-pushed the add-clap-model branch from e69b04f to 66b7dfe on February 16, 2023 16:33

ArthurZucker merged commit c236a62 into huggingface:main Feb 16, 2023

xenova mentioned this pull request Dec 1, 2023

[CLAP] Replace hard-coded batch size to enable dynamic ONNX export #27790

Merged

5 tasks
