# OpenOCR

- **Primary Language**: Python
- **License**: Apache-2.0
______________________________________________________________________

OpenOCR is an open-source toolkit developed by the OCR team from [FVL Lab](https://fvl.fudan.edu.cn), Fudan University, under the guidance of Prof. [Yu-Gang Jiang](https://scholar.google.com/citations?user=f3_FP8AAAAAJ) and Prof. [Zhineng Chen](https://zhinchenfd.github.io). It focuses on 「General-OCR」 tasks, including **Text Detection and Recognition, Formula and Table Recognition**, as well as **Document Parsing and Understanding**. The toolkit integrates a unified training and evaluation benchmark, commercial-grade OCR and Document Parsing systems, and faithful reproductions of the core implementations from a wide range of academic papers.

OpenOCR aims to build a comprehensive open-source ecosystem for General-OCR, bridging academic research and real-world applications, and fostering the collaborative development and widespread deployment of OCR technologies across both research frontiers and industrial scenarios. We welcome researchers, developers, and industry partners to explore the toolkit and share feedback.

## Features

- 🔥**OpenDoc-0.1B: Ultra-Lightweight Document Parsing System with 0.1B Parameters**
  - ⚡\[[Quick Start](./docs/opendoc.md)\] \[[Hugging Face Demo](https://huggingface.co/spaces/topdu/OpenDoc-0.1B-Demo)\] \[[ModelScope Demo](https://modelscope.cn/studios/topdktu/OpenDoc-0.1B-Demo)\] \[[Local Demo](./docs/opendoc.md#local-demo)\]
  - An ultra-lightweight document parsing system with only 0.1B parameters.
  - Two-stage pipeline:
    1. Layout analysis via **[PP-DocLayoutV2](https://www.paddleocr.ai/latest/version3.x/module_usage/layout_analysis.html)**.
    2. Unified recognition of text, formulas, and tables using the in-house model **[UniRec-0.1B](./docs/unirec.md)**.
       - In the original version of **UniRec-0.1B**, only **text and formula recognition** were supported. In **OpenDoc-0.1B**, we **rebuilt UniRec-0.1B** to enable **unified recognition of text, formulas, and tables**.
  - Supports document parsing for **Chinese and English**.
  - Achieves **90.57% on [OmniDocBench (v1.5)](https://github.com/opendatalab/OmniDocBench/tree/main?tab=readme-ov-file#end-to-end-evaluation)**, outperforming many document parsing models based on multimodal large language models.
- 🔥**UniRec-0.1B: Unified Text and Formula Recognition with 0.1B Parameters**
  - \[[Doc](./docs/unirec.md)\] \[[Paper](https://arxiv.org/pdf/2512.21095)\] \[[Hugging Face Demo](https://huggingface.co/spaces/topdu/OpenOCR-UniRec-Demo)\] \[[ModelScope Demo](https://modelscope.cn/studios/topdktu/OpenOCR-UniRec-Demo)\] \[[Local Demo](./docs/unirec.md#local-demo)\] \[[Hugging Face Model](https://huggingface.co/topdu/unirec-0.1b)\] \[[ModelScope Model](https://www.modelscope.cn/models/topdktu/unirec-0.1b)\] \[[UniRec40M Dataset](https://huggingface.co/datasets/topdu/UniRec40M)\]
  - Recognizes plain text (words, lines, paragraphs), formulas (single-line, multi-line), and mixed text-and-formula content.
  - 0.1B parameters.
  - Trained from scratch on [UniRec40M](https://huggingface.co/datasets/topdu/UniRec40M) data without pre-training.
  - Supports both Chinese and English text and formula recognition.
- 🔥**OpenOCR: A general OCR system with accuracy and efficiency**
  - ⚡\[[Quick Start](./docs/openocr.md#quick-start)\] \[[Hugging Face Demo](https://huggingface.co/spaces/topdu/OpenOCR-Demo)\] \[[ModelScope Demo](https://modelscope.cn/studios/topdktu/OpenOCR-Demo)\] \[[Local Demo](./docs/openocr.md#local-demo)\] \[[Model](https://github.com/Topdu/OpenOCR/releases/tag/develop0.0.1)\] \[[PaddleOCR Implementation](https://paddlepaddle.github.io/PaddleOCR/latest/algorithm/text_recognition/algorithm_rec_svtrv2.html)\]
  - [Introduction](./docs/openocr.md)
    - A practical OCR system building on SVTRv2.
    - Outperforms the [PP-OCRv4](https://paddlepaddle.github.io/PaddleOCR/latest/ppocr/model_list.html) baseline by 4.5% in accuracy on the [OCR competition leaderboard](https://aistudio.baidu.com/competition/detail/1131/0/leaderboard), while maintaining similar inference speed.
    - [x] Supports Chinese and English text detection and recognition.
    - [x] Provides a server model and a mobile model.
    - [x] Supports fine-tuning OpenOCR on a custom dataset: [Fine-tuning Det](./docs/finetune_det.md), [Fine-tuning Rec](./docs/finetune_rec.md).
    - [x] [ONNX model export for wider compatibility](./docs/openocr.md#export-onnx-model).
- 🔥**SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition (ICCV 2025)**
  - \[[Doc](./configs/rec/svtrv2/)\] \[[Paper](https://arxiv.org/abs/2411.15858)\] \[[Model](./configs/rec/svtrv2/readme.md#11-models-and-results)\] \[[Datasets](./docs/svtrv2.md#downloading-datasets)\] \[[Config, Training and Inference](./configs/rec/svtrv2/readme.md#3-model-training--evaluation)\] \[[Benchmark](./docs/svtrv2.md#results-benchmark--configs--checkpoints)\]
  - [Introduction](./docs/svtrv2.md)
    - A unified training and evaluation benchmark (on top of [Union14M](https://github.com/Mountchicken/Union14M?tab=readme-ov-file#3-union14m-dataset)) for Scene Text Recognition.
    - Supports 24 Scene Text Recognition methods trained from scratch on the large-scale real dataset [Union14M-L-Filter](./docs/svtrv2.md#dataset-details); the latest methods will continue to be added.
    - Improves accuracy by 20-30% compared to models trained on synthetic datasets.
    - Towards Arbitrary-Shaped Text Recognition and Language modeling with a Single Visual Model.
    - Surpasses Attention-based Encoder-Decoder Methods across challenging scenarios in terms of accuracy and speed.
  - [Get Started](./docs/svtrv2.md#get-started-with-training-a-sota-scene-text-recognition-model-from-scratch) with training a SOTA Scene Text Recognition model from scratch.

## Our OCR algorithms

- [**UniRec-0.1B**](./docs/unirec.md) (*Yongkun Du, Zhineng Chen, Yazhen Xie, Weikang Bai, Hao Feng, Wei Shi, Yuchen Su, Can Huang, Yu-Gang Jiang. UniRec-0.1B: Unified Text and Formula Recognition with 0.1B Parameters,* Preprint. [Doc](./configs/rec/unirec/), [Paper](https://arxiv.org/pdf/2512.21095))
- [**MDiff4STR**](./configs/rec/mdiff4str/) (*Yongkun Du, Miaomiao Zhao, Songlin Fan, Zhineng Chen\*, Caiyan Jia, Yu-Gang Jiang. MDiff4STR: Mask Diffusion Model for Scene Text Recognition,* AAAI 2026 Oral. [Doc](./configs/rec/mdiff4str/), [Paper](https://arxiv.org/abs/2512.01422))
- [**CMER**](./configs/rec/cmer/) (*Weikang Bai, Yongkun Du, Yuchen Su, Yazhen Xie, Zhineng Chen\*. Complex Mathematical Expression Recognition: Benchmark, Large-Scale Dataset and Strong Baseline,* AAAI 2026. [Doc](./configs/rec/cmer/), [Paper](https://arxiv.org/abs/2512.13731))
- **TextSSR** (*Xingsong Ye, Yongkun Du, Yunbo Tao, Zhineng Chen\*. TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition,* ICCV 2025. [Paper](https://openaccess.thecvf.com/content/ICCV2025/papers/Ye_TextSSR_Diffusion-based_Data_Synthesis_for_Scene_Text_Recognition_ICCV_2025_paper.pdf), [Code](https://github.com/YesianRohn/TextSSR))
- [**SVTRv2**](./configs/rec/svtrv2) (*Yongkun Du, Zhineng Chen\*, Hongtao Xie, Caiyan Jia, Yu-Gang Jiang. SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition,* ICCV 2025. [Doc](./configs/rec/svtrv2/), [Paper](https://openaccess.thecvf.com/content/ICCV2025/html/Du_SVTRv2_CTC_Beats_Encoder-Decoder_Models_in_Scene_Text_Recognition_ICCV_2025_paper.html))
- [**IGTR**](./configs/rec/igtr/) (*Yongkun Du, Zhineng Chen\*, Yuchen Su, Caiyan Jia, Yu-Gang Jiang. Instruction-Guided Scene Text Recognition,* TPAMI 2025. [Doc](./configs/rec/igtr), [Paper](https://ieeexplore.ieee.org/document/10820836))
- [**CPPD**](./configs/rec/cppd/) (*Yongkun Du, Zhineng Chen\*, Caiyan Jia, Xiaoting Yin, Chenxia Li, Yuning Du, Yu-Gang Jiang. Context Perception Parallel Decoder for Scene Text Recognition,* TPAMI 2025. [PaddleOCR Doc](https://github.com/PaddlePaddle/PaddleOCR/blob/main/docs/algorithm/text_recognition/algorithm_rec_cppd.en.md), [Paper](https://ieeexplore.ieee.org/document/10902187))
- [**SMTR&FocalSVTR**](./configs/rec/smtr/) (*Yongkun Du, Zhineng Chen\*, Caiyan Jia, Xieping Gao, Yu-Gang Jiang. Out of Length Text Recognition with Sub-String Matching,* AAAI 2025. [Doc](./configs/rec/smtr/), [Paper](https://ojs.aaai.org/index.php/AAAI/article/view/32285))
- [**DPTR**](./configs/rec/dptr/) (*Shuai Zhao, Yongkun Du, Zhineng Chen\*, Yu-Gang Jiang. Decoder Pre-Training with only Text for Scene Text Recognition,* ACM MM 2024. [Paper](https://dl.acm.org/doi/10.1145/3664647.3681390))
- [**CDistNet**](./configs/rec/cdistnet/) (*Tianlun Zheng, Zhineng Chen\*, Shancheng Fang, Hongtao Xie, Yu-Gang Jiang. CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition,* IJCV 2024. [Paper](https://link.springer.com/article/10.1007/s11263-023-01880-0))
- **MRN** (*Tianlun Zheng, Zhineng Chen\*, Bingchen Huang, Wei Zhang, Yu-Gang Jiang. MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition,* ICCV 2023. [Paper](https://openaccess.thecvf.com/content/ICCV2023/html/Zheng_MRN_Multiplexed_Routing_Network_for_Incremental_Multilingual_Text_Recognition_ICCV_2023_paper.html), [Code](https://github.com/simplify23/MRN))
- **TPS++** (*Tianlun Zheng, Zhineng Chen\*, Jinfeng Bai, Hongtao Xie, Yu-Gang Jiang. TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition,* IJCAI 2023. [Paper](https://arxiv.org/abs/2305.05322), [Code](https://github.com/simplify23/TPS_PP))
- [**SVTR**](./configs/rec/svtr/) (*Yongkun Du, Zhineng Chen\*, Caiyan Jia, Xiaoting Yin, Tianlun Zheng, Chenxia Li, Yuning Du, Yu-Gang Jiang. SVTR: Scene Text Recognition with a Single Visual Model,* IJCAI 2022 (Long). [PaddleOCR Doc](https://github.com/Topdu/PaddleOCR/blob/main/doc/doc_ch/algorithm_rec_svtr.md), [Paper](https://www.ijcai.org/proceedings/2022/124))
- [**NRTR**](./configs/rec/nrtr/) (*Fenfen Sheng, Zhineng Chen, Bo Xu. NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition,* ICDAR 2019. [Paper](https://arxiv.org/abs/1806.00926))

## Recent Updates

- **2026.01.13**: 🔥 Releasing [CMER](./configs/rec/cmer/) code and [MER-17M](https://huggingface.co/datasets/topdu/MER-17M) dataset.
- **2026.01.07**: 🔥 Releasing [UniRec40M](https://huggingface.co/datasets/topdu/UniRec40M) dataset, which includes 40 million instances of recognition data comprising text, formulas, and text-formula mixed content.
- **2025.12.25**: 🔥 Releasing [OpenDoc-0.1B](./docs/opendoc.md): Ultra-Lightweight Document Parsing System with 0.1B Parameters.
- **2025.11.08**: Our paper [MDiff4STR](https://arxiv.org/abs/2512.01422) is accepted by AAAI 2026 (Oral). Accessible in [Doc](./configs/rec/mdiff4str/).
- **2025.11.08**: Our paper [CMER](https://arxiv.org/abs/2512.13731) is accepted by AAAI 2026. Accessible in [Doc](./configs/rec/cmer/).
- **2025.08.20**: 🔥 Releasing [UniRec-0.1B](https://arxiv.org/pdf/2512.21095): Unified Text and Formula Recognition with 0.1B Parameters.
- **2025.07.10**: Our paper [SVTRv2](https://openaccess.thecvf.com/content/ICCV2025/html/Du_SVTRv2_CTC_Beats_Encoder-Decoder_Models_in_Scene_Text_Recognition_ICCV_2025_paper.html) is accepted by ICCV 2025. Accessible in [Doc](./configs/rec/svtrv2/).
- **2025.07.10**: Our paper [TextSSR](https://openaccess.thecvf.com/content/ICCV2025/papers/Ye_TextSSR_Diffusion-based_Data_Synthesis_for_Scene_Text_Recognition_ICCV_2025_paper.pdf) is accepted by ICCV 2025. Accessible in [Code](https://github.com/YesianRohn/TextSSR).
- **2025.03.24**: 🔥 Releasing the feature of fine-tuning OpenOCR on a custom dataset: [Fine-tuning Det](./docs/finetune_det.md), [Fine-tuning Rec](./docs/finetune_rec.md).
- **2025.03.23**: 🔥 Releasing the feature of [ONNX model export for wider compatibility](#export-onnx-model).
- **2025.02.22**: Our paper [CPPD](https://ieeexplore.ieee.org/document/10902187) is accepted by TPAMI. Accessible in [Doc](./configs/rec/cppd/) and [PaddleOCR Doc](https://github.com/PaddlePaddle/PaddleOCR/blob/main/docs/algorithm/text_recognition/algorithm_rec_cppd.en.md).
- **2024.12.31**: Our paper [IGTR](https://ieeexplore.ieee.org/document/10820836) is accepted by TPAMI. Accessible in [Doc](./configs/rec/igtr/).
- **2024.12.16**: Our paper [SMTR](https://ojs.aaai.org/index.php/AAAI/article/view/32285) is accepted by AAAI 2025. Accessible in [Doc](./configs/rec/smtr/).
- **2024.12.03**: The pre-training code for [DPTR](https://dl.acm.org/doi/10.1145/3664647.3681390) is merged.
- **🔥 2024.11.23 release notes**:
  - **OpenOCR: A general OCR system with accuracy and efficiency**
    - ⚡\[[Quick Start](./docs/openocr.md#quick-start)\] \[[Model](https://github.com/Topdu/OpenOCR/releases/tag/develop0.0.1)\] \[[ModelScope Demo](https://modelscope.cn/studios/topdktu/OpenOCR-Demo)\] \[[Hugging Face Demo](https://huggingface.co/spaces/topdu/OpenOCR-Demo)\] \[[Local Demo](./docs/openocr.md#local-demo)\] \[[PaddleOCR Implementation](https://paddlepaddle.github.io/PaddleOCR/latest/algorithm/text_recognition/algorithm_rec_svtrv2.html)\]
    - [Introduction](./docs/openocr.md)
  - **SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition**
    - \[[Paper](https://openaccess.thecvf.com/content/ICCV2025/html/Du_SVTRv2_CTC_Beats_Encoder-Decoder_Models_in_Scene_Text_Recognition_ICCV_2025_paper.html)\] \[[Doc](./configs/rec/svtrv2/)\] \[[Model](./configs/rec/svtrv2/readme.md#11-models-and-results)\] \[[Datasets](./docs/svtrv2.md#downloading-datasets)\] \[[Config, Training and Inference](./configs/rec/svtrv2/readme.md#3-model-training--evaluation)\] \[[Benchmark](./docs/svtrv2.md#results--configs--checkpoints)\]
    - [Introduction](./docs/svtrv2.md)
    - [Get Started](./docs/svtrv2.md#get-started-with-training-a-sota-scene-text-recognition-model-from-scratch) with training a SOTA Scene Text Recognition model from scratch.

## Reproduction schedule

### Scene Text Recognition

| Method | Venue | Training | Evaluation | Contributor |
| --- | --- | --- | --- | --- |
| [CRNN](./configs/rec/svtrs/) | [TPAMI 2016](https://arxiv.org/abs/1507.05717) | ✅ | ✅ | |
| [ASTER](./configs/rec/aster/) | [TPAMI 2019](https://ieeexplore.ieee.org/document/8395027) | ✅ | ✅ | [pretto0](https://github.com/pretto0) |
| [NRTR](./configs/rec/nrtr/) | [ICDAR 2019](https://arxiv.org/abs/1806.00926) | ✅ | ✅ | |
| [SAR](./configs/rec/sar/) | [AAAI 2019](https://aaai.org/papers/08610-show-attend-and-read-a-simple-and-strong-baseline-for-irregular-text-recognition/) | ✅ | ✅ | [pretto0](https://github.com/pretto0) |
| [MORAN](./configs/rec/moran/) | [PR 2019](https://www.sciencedirect.com/science/article/abs/pii/S0031320319300263) | ✅ | ✅ | |
| [DAN](./configs/rec/dan/) | [AAAI 2020](https://arxiv.org/pdf/1912.10205) | ✅ | ✅ | |
| [RobustScanner](./configs/rec/robustscanner/) | [ECCV 2020](https://www.ecva.net/papers/eccv_2020/papers_ECCV/html/3160_ECCV_2020_paper.php) | ✅ | ✅ | [pretto0](https://github.com/pretto0) |
| [AutoSTR](./configs/rec/autostr/) | [ECCV 2020](https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123690732.pdf) | ✅ | ✅ | |
| [SRN](./configs/rec/srn/) | [CVPR 2020](https://openaccess.thecvf.com/content_CVPR_2020/html/Yu_Towards_Accurate_Scene_Text_Recognition_With_Semantic_Reasoning_Networks_CVPR_2020_paper.html) | ✅ | ✅ | [pretto0](https://github.com/pretto0) |
| [SEED](./configs/rec/seed/) | [CVPR 2020](https://openaccess.thecvf.com/content_CVPR_2020/html/Qiao_SEED_Semantics_Enhanced_Encoder-Decoder_Framework_for_Scene_Text_Recognition_CVPR_2020_paper.html) | ✅ | ✅ | |
| [ABINet](./configs/rec/abinet/) | [CVPR 2021](https://openaccess.thecvf.com//content/CVPR2021/html/Fang_Read_Like_Humans_Autonomous_Bidirectional_and_Iterative_Language_Modeling_for_CVPR_2021_paper.html) | ✅ | ✅ | [YesianRohn](https://github.com/YesianRohn) |
| [VisionLAN](./configs/rec/visionlan/) | [ICCV 2021](https://openaccess.thecvf.com/content/ICCV2021/html/Wang_From_Two_to_One_A_New_Scene_Text_Recognizer_With_ICCV_2021_paper.html) | ✅ | ✅ | [YesianRohn](https://github.com/YesianRohn) |
| PIMNet | [ACM MM 2021](https://dl.acm.org/doi/10.1145/3474085.3475238) | | | TODO |
| [SVTR](./configs/rec/svtrs/) | [IJCAI 2022](https://www.ijcai.org/proceedings/2022/124) | ✅ | ✅ | |
| [PARSeq](./configs/rec/parseq/) | [ECCV 2022](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136880177.pdf) | ✅ | ✅ | |
| [MATRN](./configs/rec/matrn/) | [ECCV 2022](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136880442.pdf) | ✅ | ✅ | |
| [MGP-STR](./configs/rec/mgpstr/) | [ECCV 2022](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136880336.pdf) | ✅ | ✅ | |
| [LPV](./configs/rec/lpv/) | [IJCAI 2023](https://www.ijcai.org/proceedings/2023/0189.pdf) | ✅ | ✅ | |
| [MAERec](./configs/rec/maerec/)(Union14M) | [ICCV 2023](https://openaccess.thecvf.com/content/ICCV2023/papers/Jiang_Revisiting_Scene_Text_Recognition_A_Data_Perspective_ICCV_2023_paper.pdf) | ✅ | ✅ | |
| [LISTER](./configs/rec/lister/) | [ICCV 2023](https://openaccess.thecvf.com/content/ICCV2023/papers/Cheng_LISTER_Neighbor_Decoding_for_Length-Insensitive_Scene_Text_Recognition_ICCV_2023_paper.pdf) | ✅ | ✅ | |
| [CDistNet](./configs/rec/cdistnet/) | [IJCV 2024](https://link.springer.com/article/10.1007/s11263-023-01880-0) | ✅ | ✅ | [YesianRohn](https://github.com/YesianRohn) |
| [BUSNet](./configs/rec/busnet/) | [AAAI 2024](https://ojs.aaai.org/index.php/AAAI/article/view/28402) | ✅ | ✅ | |
| DCTC | [AAAI 2024](https://ojs.aaai.org/index.php/AAAI/article/view/28575) | | | TODO |
| [CAM](./configs/rec/cam/) | [PR 2024](https://arxiv.org/abs/2402.13643) | ✅ | ✅ | |
| [OTE](./configs/rec/ote/) | [CVPR 2024](https://openaccess.thecvf.com/content/CVPR2024/html/Xu_OTE_Exploring_Accurate_Scene_Text_Recognition_Using_One_Token_CVPR_2024_paper.html) | ✅ | ✅ | |
| CFF | [IJCAI 2024](https://arxiv.org/abs/2407.05562) | | | TODO |
| [DPTR](./configs/rec/dptr/) | [ACM MM 2024](https://dl.acm.org/doi/10.1145/3664647.3681390) | | | [fd-zs](https://github.com/fd-zs) |
| VIPTR | [ACM CIKM 2024](https://arxiv.org/abs/2401.10110) | | | TODO |
| [IGTR](./configs/rec/igtr/) | [TPAMI 2025](https://ieeexplore.ieee.org/document/10820836) | ✅ | ✅ | |
| [SMTR](./configs/rec/smtr/) | [AAAI 2025](https://ojs.aaai.org/index.php/AAAI/article/view/32285) | ✅ | ✅ | |
| [CPPD](./configs/rec/cppd/) | [TPAMI 2025](https://ieeexplore.ieee.org/document/10902187) | ✅ | ✅ | |
| [FocalSVTR-CTC](./configs/rec/svtrs/) | [AAAI 2025](https://ojs.aaai.org/index.php/AAAI/article/view/32285) | ✅ | ✅ | |
| [SVTRv2](./configs/rec/svtrv2/) | [ICCV 2025](https://openaccess.thecvf.com/content/ICCV2025/html/Du_SVTRv2_CTC_Beats_Encoder-Decoder_Models_in_Scene_Text_Recognition_ICCV_2025_paper.html) | ✅ | ✅ | |
| [ResNet+Trans-CTC](./configs/rec/svtrs/) | | ✅ | ✅ | |
| [ViT-CTC](./configs/rec/svtrs/) | | ✅ | ✅ | |
| [MDiff4STR](./configs/rec/mdiff4str/) | [AAAI 2026 Oral](https://arxiv.org/abs/2512.01422) | ✅ | ✅ | |

### Scene Text Detection (STD)

TODO

### Text Spotting

TODO

______________________________________________________________________

## Citation

If you find our method useful for your research, please cite:

```bibtex
@inproceedings{Du2025SVTRv2,
  title={SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition},
  author={Yongkun Du and Zhineng Chen and Hongtao Xie and Caiyan Jia and Yu-Gang Jiang},
  booktitle={ICCV},
  year={2025},
  pages={20147-20156}
}

@article{du2025unirec,
  title={UniRec-0.1B: Unified Text and Formula Recognition with 0.1B Parameters},
  author={Yongkun Du and Zhineng Chen and Yazhen Xie and Weikang Bai and Hao Feng and Wei Shi and Yuchen Su and Can Huang and Yu-Gang Jiang},
  journal={arXiv preprint arXiv:2512.21095},
  year={2025}
}
```

## Acknowledgement

This codebase is built on [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR), [PytorchOCR](https://github.com/WenmuZhou/PytorchOCR), and [MMOCR](https://github.com/open-mmlab/mmocr). Thanks for their awesome work!