
CalliarGen

Word-To-Image: Morphing Arabic Text to a Visual Representation

Hugging Face Space · Colab Notebook · GitHub

Multilingual OpenCLIP

Dataset

Preprocessing:

  • Extract the caption from each image's file name.
  • Remove the numbers from the caption using the Maha library.
  • Write the "file_name" and "text" fields to a JSONL file, as recommended by HF.
  • Upload the dataset to the HF dataset hub.
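The preprocessing steps above can be sketched roughly as follows. The example file names are hypothetical, and a plain regex is used as a dependency-free stand-in for Maha's number remover; the JSONL layout (`file_name`/`text` keys) is the one Hugging Face's `imagefolder` loader expects.

```python
import json
import re
from pathlib import Path

def caption_from_filename(path: Path) -> str:
    """Derive a caption from an image file name, stripping any digits.

    The README removes numbers with the Maha library; the regex below is a
    stand-in covering Western and Arabic-Indic digits.
    """
    caption = re.sub(r"[0-9\u0660-\u0669]+", "", path.stem)
    # Collapse the whitespace left behind by the removed digits
    return " ".join(caption.split())

def write_metadata(image_paths, out_path="metadata.jsonl"):
    """Write file_name/text pairs, one JSON object per line (JSONL)."""
    with open(out_path, "w", encoding="utf-8") as f:
        for p in image_paths:
            record = {"file_name": p.name, "text": caption_from_filename(p)}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Hypothetical image files whose names encode the caption plus an index
files = [Path("بسم الله 03.png"), Path("الحمد لله 12.png")]
write_metadata(files)
```

Placing the resulting `metadata.jsonl` next to the images lets the dataset be loaded and pushed to the hub with the standard `datasets` image-folder workflow.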

References:

  • Guide to uploading a dataset to the HF hub, here.
  • The dataset on the HF hub, here.

Model training:

export MODEL_NAME="CompVis/stable-diffusion-v1-4"
# Replace with the Calliar dataset ID on the HF hub (see Dataset references above)
export dataset_name="lambdalabs/pokemon-blip-captions"

accelerate launch --mixed_precision="fp16"  train_text_to_image.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$dataset_name \
  --use_ema \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --gradient_checkpointing \
  --max_train_steps=15000 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 \
  --lr_scheduler="constant" --lr_warmup_steps=0 \
  --output_dir="calliar_1"

References:

  • The model and the latest checkpoint, here.
