ABCNet is an efficient end-to-end scene text spotting framework that is over 10x faster than the previous state of the art. It was published as an oral paper at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020.
The ABCNet Chinese demo and its pretrained model can be found here. ABCNet v2 will be released soon.
Experimental results on CTW1500:
Name | inf. time | e2e-hmean | det-hmean | download |
---|---|---|---|---|
paper reported | | 45.2 | | |
attn_R_50 | 2080ti 8.7 FPS | 53.2 | 84.4 | model |
Experimental results on Total-Text:
Name | inf. time | e2e-hmean | det-hmean | download |
---|---|---|---|---|
paper reported | V100 17.9 FPS | 64.2 | | |
tt_attn_R_50 | 2080ti 11.3 FPS | 67.1 | 86.0 | model |
pretrain_attn_R_50 | 2080ti 11.3 FPS | 58.1 | 80.0 | model |
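The hmean columns report the harmonic mean of precision and recall, the F-measure commonly used for text detection and spotting. A minimal sketch of that computation follows; the function name and the precision/recall values are illustrative assumptions, not numbers from the paper or from the tables above.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F-measure)."""
    if precision + recall == 0:
        # Avoid division by zero when nothing was detected or matched.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative values only (hypothetical, not taken from any benchmark).
print(round(hmean(0.8, 0.6), 4))
```

Because the harmonic mean is dominated by the smaller of the two values, a model cannot reach a high hmean by trading away recall for precision or the reverse.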
- Select the model and config file above, for example, configs/BAText/CTW1500/attn_R_50.yaml.
- Run the demo with
wget -O ctw1500_attn_R_50.pth https://universityofadelaide.box.com/shared/static/okeo5pvul5v5rxqh4yg8pcf805tzj2no.pth
python demo/demo.py \
--config-file configs/BAText/CTW1500/attn_R_50.yaml \
--input datasets/CTW1500/ctwtest_text_image/ \
--opts MODEL.WEIGHTS ctw1500_attn_R_50.pth
or
wget -O tt_attn_R_50.pth https://cloudstor.aarnet.edu.au/plus/s/tYsnegjTs13MwwK/download
python demo/demo.py \
--config-file configs/BAText/TotalText/attn_R_50.yaml \
--input datasets/totaltext/test_images/ \
--opts MODEL.WEIGHTS tt_attn_R_50.pth
To train a model with tools/train_net.py, first set up the corresponding datasets by following datasets/README.md or by running the following script:
cd datasets/
wget https://universityofadelaide.box.com/shared/static/32p6xsdtu0keu2o6pb5aqhyjotnljxep.zip -O tot.zip
unzip tot.zip
rm tot.zip
wget https://universityofadelaide.box.com/shared/static/6ui89vca7cbp15ysnxqg5r494ix7l6cu.zip -O ctw1500.zip
mkdir CTW1500/ && unzip ctw1500.zip -d CTW1500/
rm ctw1500.zip
mkdir evaluation
cd evaluation
wget -O gt_ctw1500.zip https://cloudstor.aarnet.edu.au/plus/s/xU3yeM3GnidiSTr/download
wget -O gt_totaltext.zip https://cloudstor.aarnet.edu.au/plus/s/SFHvin8BLUM4cNd/download
You can also prepare your custom dataset following the example scripts.
Pretraining with synthetic data:
OMP_NUM_THREADS=1 python tools/train_net.py \
--config-file configs/BAText/Pretrain/attn_R_50.yaml \
--num-gpus 4 \
OUTPUT_DIR text_pretraining/attn_R_50
Finetuning on Total-Text:
OMP_NUM_THREADS=1 python tools/train_net.py \
--config-file configs/BAText/TotalText/attn_R_50.yaml \
--num-gpus 4 \
MODEL.WEIGHTS text_pretraining/attn_R_50/model_final.pth
Finetuning on CTW1500:
OMP_NUM_THREADS=1 python tools/train_net.py \
--config-file configs/BAText/CTW1500/attn_R_50.yaml \
--num-gpus 4 \
MODEL.WEIGHTS text_pretraining/attn_R_50/model_final.pth
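The three commands above form a chain: pretraining writes its checkpoint to OUTPUT_DIR, and both finetuning runs load model_final.pth from that directory. The sketch below only assembles and prints those same command lines so the dependency is explicit; `train_cmd` and `run_all` are hypothetical helper names, not part of the repository, and nothing is launched unless `dry_run` is set to False.

```python
import os
import subprocess


def train_cmd(config, num_gpus=4, opts=()):
    """Build the tools/train_net.py command line used in this README."""
    return ["python", "tools/train_net.py",
            "--config-file", config,
            "--num-gpus", str(num_gpus), *opts]


# Pretraining writes its checkpoints to this directory.
pretrain_dir = "text_pretraining/attn_R_50"
pretrain = train_cmd("configs/BAText/Pretrain/attn_R_50.yaml",
                     opts=("OUTPUT_DIR", pretrain_dir))

# Both finetuning runs start from the final pretraining checkpoint.
weights = f"{pretrain_dir}/model_final.pth"
finetune = [
    train_cmd(f"configs/BAText/{name}/attn_R_50.yaml",
              opts=("MODEL.WEIGHTS", weights))
    for name in ("TotalText", "CTW1500")
]


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    env = dict(os.environ, OMP_NUM_THREADS="1")
    for cmd in [pretrain, *finetune]:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True, env=env)


run_all(dry_run=True)
```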
Download test GT here so that the directory has the following structure:
datasets
|_ evaluation
| |_ gt_totaltext.zip
| |_ gt_ctw1500.zip
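Before running the evaluation, it can help to confirm that the ground-truth archives are where the listing above expects them. The following is a small sketch of such a check; `missing_gt_files` is a hypothetical helper, not a script shipped with the repository, and it assumes it is run from the repository root.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_GT = ("gt_totaltext.zip", "gt_ctw1500.zip")


def missing_gt_files(datasets_root="datasets"):
    """Return the expected ground-truth archives that are not present."""
    eval_dir = Path(datasets_root) / "evaluation"
    return [name for name in EXPECTED_GT if not (eval_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_gt_files()
    if missing:
        print("Missing ground-truth archives:", ", ".join(missing))
    else:
        print("Evaluation ground truth is in place.")
```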
Producing both e2e and detection results on CTW1500:
wget -O ctw1500_attn_R_50.pth https://universityofadelaide.box.com/shared/static/okeo5pvul5v5rxqh4yg8pcf805tzj2no.pth
python tools/train_net.py \
--config-file configs/BAText/CTW1500/attn_R_50.yaml \
--eval-only \
MODEL.WEIGHTS ctw1500_attn_R_50.pth
or on Total-Text:
wget -O tt_attn_R_50.pth https://cloudstor.aarnet.edu.au/plus/s/tYsnegjTs13MwwK/download
python tools/train_net.py \
--config-file configs/BAText/TotalText/attn_R_50.yaml \
--eval-only \
MODEL.WEIGHTS tt_attn_R_50.pth
You can also evaluate the JSON result file offline by following the evaluation_example_scripts, which include an example of how to evaluate on a custom dataset. To measure the inference time, set --num-gpus to 1.
If you are interested in warping a curved instance into a rectangular format independently, please refer to the example script here.
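To illustrate the idea behind such a warp, here is a minimal NumPy sketch. It is not the repository's example script and not the BezierAlign layer itself. It assumes the curved instance is described by two cubic Bezier curves (top and bottom boundary, four control points each, both ordered left to right, given as (x, y) pixel coordinates) and uses nearest-neighbor sampling for brevity. The function names and the default output size are assumptions made for this example.

```python
import numpy as np


def cubic_bezier(ctrl, t):
    """Evaluate a cubic Bezier curve.

    ctrl: four (x, y) control points; t: 1-D array of parameters in [0, 1].
    Returns an array of shape (len(t), 2).
    """
    ctrl = np.asarray(ctrl, dtype=float)
    t = np.asarray(t, dtype=float)[:, None]
    # Bernstein basis polynomials of degree 3.
    basis = [(1 - t) ** 3, 3 * t * (1 - t) ** 2, 3 * t ** 2 * (1 - t), t ** 3]
    return sum(w * p for w, p in zip(basis, ctrl))


def warp_to_rectangle(image, top_ctrl, bottom_ctrl, out_h=32, out_w=128):
    """Sample the region between two Bezier curves into an out_h x out_w grid."""
    t = np.linspace(0.0, 1.0, out_w)
    top = cubic_bezier(top_ctrl, t)        # (out_w, 2)
    bottom = cubic_bezier(bottom_ctrl, t)  # (out_w, 2)
    # Linearly interpolate between the top and bottom boundary for each row.
    v = np.linspace(0.0, 1.0, out_h)[:, None, None]
    grid = (1 - v) * top[None] + v * bottom[None]  # (out_h, out_w, 2)
    # Nearest-neighbor lookup, clipped to the image bounds.
    xs = np.clip(np.rint(grid[..., 0]).astype(int), 0, image.shape[1] - 1)
    ys = np.clip(np.rint(grid[..., 1]).astype(int), 0, image.shape[0] - 1)
    return image[ys, xs]


# Synthetic example: each pixel stores its own x coordinate.
image = np.tile(np.arange(100), (50, 1))
top = [(10, 5), (30, 5), (50, 5), (70, 5)]
bottom = [(10, 25), (30, 25), (50, 25), (70, 25)]
patch = warp_to_rectangle(image, top, bottom, out_h=3, out_w=7)
print(patch.shape)
```

Replacing the nearest-neighbor lookup with bilinear interpolation gives smoother results on real images. If your annotations store the bottom curve from right to left, reverse its control points before calling `warp_to_rectangle`.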
@inproceedings{liu2020abcnet,
title = {{ABCNet}: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network},
author = {Liu, Yuliang and Chen, Hao and Shen, Chunhua and He, Tong and Jin, Lianwen and Wang, Liangwei},
booktitle = {Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)},
year = {2020}
}