News | Usage | Citation | Acknowledgement
This is the official repo for the papers:
DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting
2024.04.25
Update DeepSolo models finetuned on BOVText and DSText video datasets.
2023.06.02
Update the pre-trained and fine-tuned Chinese scene text spotting model (78.3% 1-NED on ICDAR 2019 ReCTS).
2023.05.31
The extension paper (DeepSolo++) has been submitted to arXiv. The code and models will be released soon.
2023.02.28
DeepSolo is accepted by CVPR 2023. 🎉🎉
Relevant Projects:
✨ Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation | Code
✨ GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching | Code
DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer | Code
Other applications of ViTAE include: ViTPose | Remote Sensing | Matting | VSA | Video Object Segmentation
See the README for DeepSolo and DeepSolo++.
If you find DeepSolo helpful, please consider giving this repo a star ⭐ and citing:
@inproceedings{ye2023deepsolo,
title={DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting},
author={Ye, Maoyuan and Zhang, Jing and Zhao, Shanshan and Liu, Juhua and Liu, Tongliang and Du, Bo and Tao, Dacheng},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={19348--19357},
year={2023}
}
@article{ye2023deepsolo++,
title={DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting},
author={Ye, Maoyuan and Zhang, Jing and Zhao, Shanshan and Liu, Juhua and Liu, Tongliang and Du, Bo and Tao, Dacheng},
journal={arXiv preprint arXiv:2305.19957},
year={2023}
}
This project is based on AdelaiDet. For academic use, this project is licensed under the 2-Clause BSD License.