How to calculate the image_text_similarity scores for both Chinese and English? #473

weiaicunzai · 2024-11-05T07:04:28Z

Thank you for your excellent work.

Regarding my dataset, which includes both English and Chinese samples, I am wondering how I can simultaneously calculate the similarity scores between image and text pairs for both languages.

HYLcool · 2024-11-14T08:27:42Z

Hi @weiaicunzai, thanks for your interest in Data-Juicer~

We use CLIP as the default model to compute the embeddings of image-text pairs. It works well on English text but not on Chinese text (see openai/CLIP#7). For Chinese text, models such as Chinese-CLIP might perform better.

So one possible approach is to split the dataset into English and Chinese subsets with our dedicated dataset_split_by_language tool, and then configure the image_text_similarity_filter OP with a different model for each subset.
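To make the split-then-route idea concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the CJK-ratio heuristic in `is_mostly_chinese` is a stand-in and is not how dataset_split_by_language detects language, the helper names are hypothetical, and the two Hugging Face model IDs are example choices rather than confirmed Data-Juicer defaults.

```python
from typing import Dict, List

# Example Hugging Face model IDs per language (assumed choices for illustration).
MODEL_BY_LANG: Dict[str, str] = {
    "en": "openai/clip-vit-base-patch32",
    "zh": "OFA-Sys/chinese-clip-vit-base-patch16",
}


def is_mostly_chinese(text: str, threshold: float = 0.3) -> bool:
    """Return True if at least `threshold` of the letters are CJK ideographs.

    Hypothetical heuristic; a real pipeline would use a language-ID model.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    cjk = sum(1 for ch in letters if "\u4e00" <= ch <= "\u9fff")
    return cjk / len(letters) >= threshold


def split_by_language(samples: List[dict], text_key: str = "text") -> Dict[str, List[dict]]:
    """Partition samples into 'en' and 'zh' subsets based on their text field."""
    subsets: Dict[str, List[dict]] = {"en": [], "zh": []}
    for sample in samples:
        lang = "zh" if is_mostly_chinese(sample.get(text_key, "")) else "en"
        subsets[lang].append(sample)
    return subsets


def model_for(lang: str) -> str:
    """Look up which CLIP variant should score a given language subset."""
    return MODEL_BY_LANG[lang]
```

Each subset would then be processed in its own run of the image_text_similarity_filter OP, with the model setting pointing at `model_for("en")` or `model_for("zh")`. Note that scores from two different models are not directly comparable, so the filtering thresholds may need to be tuned separately per subset.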

weiaicunzai added the question label (Further information is requested) on Nov 5, 2024

drcege assigned HYLcool Nov 11, 2024

HYLcool added the dj:multimodal (issues/PRs about multimodal data processing) and dj:op (issues/PRs about some specific OPs) labels on Nov 14, 2024
