Fine Tuning with Arabic #33

Mahmuod1 · 2022-08-21T08:49:05Z

First, I would like to thank you for this repo.
I want to work with Arabic, which is an RTL (right-to-left) language.
Could you give me a brief overview of the changes I would need to make to add Arabic to SynthDoG, so that I can create an Arabic dataset,
and of the changes needed when creating the model?

Mahmuod1 · 2022-08-23T13:05:37Z

@gwkrsrch, any help please?

gwkrsrch · 2022-08-24T09:47:17Z

Hi @Mahmuod1, there are several options you can take. You can modify the layout/textbox generation modules to produce the desired RTL layout. Several code lines would need to change, e.g., in textbox and layouts.
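To make the RTL layout idea concrete, here is a minimal sketch of placing words on a single line from right to left. It is not SynthDoG code: `layout_rtl_line`, its parameters, and the pixel widths are hypothetical and only illustrate the direction of the cursor. Arabic text also needs contextual glyph shaping and bidirectional reordering before rendering, which this sketch does not cover.

```python
from typing import List, Sequence


def layout_rtl_line(word_widths: Sequence[int], line_width: int, spacing: int = 4) -> List[int]:
    """Return the left x-coordinate of each word when words are placed right to left.

    Hypothetical helper for illustration; widths are assumed to be measured
    in pixels by whatever font renderer is in use.
    """
    xs: List[int] = []
    cursor = line_width  # start at the right edge instead of the left
    for width in word_widths:
        cursor -= width  # the word occupies [cursor, cursor + width)
        if cursor < 0:
            raise ValueError("words do not fit in the line")
        xs.append(cursor)
        cursor -= spacing  # leave a gap before the next word to the left
    return xs


# The first word in reading order ends up at the far right.
print(layout_rtl_line([30, 20, 10], line_width=100, spacing=5))  # [70, 45, 30]
```

An LTR layout would start the cursor at 0 and add widths; the only change here is that the cursor starts at the right edge and moves left.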
Another option is to generate the data with your own code based on SynthDoG. The following is the main flow of the preliminary version of SynthDoG. The first step is to draw text on a paper texture image. The following links would also be helpful to you.

Then, using a perspective transformation (or other transformations), you can embed the synthetic paper into a background. Although the idea is simple, it gives reasonable results. You may further improve the quality of the generated samples with various techniques, but that is optional. Hope this helps :) Feel free to reopen this or open another issue if you have anything new to share.
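As a rough illustration of the perspective step described above, the sketch below maps the corners of a synthetic paper through a 3x3 homography using homogeneous coordinates. The function names and the example matrix are assumptions made for illustration, not part of SynthDoG; an image library would normally perform the actual pixel warping.

```python
from typing import List, Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]


def warp_point(h: Matrix3, x: float, y: float) -> Tuple[float, float]:
    """Map (x, y) through the 3x3 homography h (hypothetical helper)."""
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity")
    # Divide by the homogeneous coordinate to get back to image coordinates.
    return xh / w, yh / w


def warp_corners(h: Matrix3, width: float, height: float) -> List[Tuple[float, float]]:
    """Return where the four corners of a width x height paper land in the background."""
    corners = [(0.0, 0.0), (width, 0.0), (width, height), (0.0, height)]
    return [warp_point(h, x, y) for x, y in corners]


# Example matrix (an assumption): a translation plus a mild perspective term.
H = [
    [1.0, 0.0, 10.0],
    [0.0, 1.0, 20.0],
    [0.001, 0.0, 1.0],
]
print(warp_corners(H, 100.0, 50.0))
```

The warped corners describe the quadrilateral in the background image that the paper should fill.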

Mahmuod1 · 2022-08-24T09:57:34Z

Thanks, @gwkrsrch, for your detailed instructions.
Could you please tell me which parts of the Donut model configuration need to change because they are language-specific?
I will be training for document parsing, so could you also tell me what I should pay attention to in the training configuration?

gwkrsrch · 2022-08-25T02:00:19Z

As a general tip, to train a model for a new language, you need to take care of the token vocabulary/tokenizer. #11 would be useful to you :)
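One simple first check on the vocabulary is whether every character in the target-language text appears in at least one token. The sketch below assumes the vocabulary is available as a plain list of token strings; `uncovered_characters` is a hypothetical helper, not a Donut or tokenizer API, and character coverage is only a coarse diagnostic, not a guarantee of good tokenization.

```python
from typing import Iterable, Set


def uncovered_characters(corpus: Iterable[str], vocab: Iterable[str]) -> Set[str]:
    """Return non-whitespace characters of the corpus that occur in no vocabulary token."""
    covered: Set[str] = set()
    for token in vocab:
        covered.update(token)
    needed: Set[str] = set()
    for line in corpus:
        needed.update(ch for ch in line if not ch.isspace())
    return needed - covered


# "\u0645\u0631\u062d\u0628\u0627" is the Arabic word "marhaban" (hello).
corpus = ["\u0645\u0631\u062d\u0628\u0627"]
# Toy vocabulary (an assumption): one Latin token and one two-letter Arabic token.
vocab = ["hello", "\u0645\u0631"]
missing = uncovered_characters(corpus, vocab)
print(sorted(f"U+{ord(ch):04X}" for ch in missing))
```

If the result is non-empty, those characters would likely fall back to an unknown token, which suggests the tokenizer needs to be extended or replaced for the new language.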

akashlp27 · 2023-10-05T06:25:06Z

Hi @Mahmuod1, @gwkrsrch, were you able to generate images using SynthDoG for RTL languages such as Arabic? Any suggestions would help a lot.

Mahmuod1 changed the title ~~Fine Tuning with arabic Arabic~~ Fine Tuning with Arabic Aug 21, 2022

gwkrsrch closed this as completed Aug 24, 2022
