RapidDoc - High-Speed Document Parsing System

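😺 Introduction

RapidDoc is a lightweight open-source framework focused on document parsing. It supports OCR, layout analysis, formula recognition, table recognition, and reading order recovery, and converts complex PDF documents to Markdown, JSON, WORD, and HTML.

It also parses docx/doc, pptx/ppt, and xlsx/xls files natively, without using any models.

The framework is a secondary development based on MinerU. It removes the VLM and focuses on efficient document parsing in Pipeline mode, keeping good parsing speed even on CPU.

The default models come mainly from PaddleOCR's PP-StructureV3 series (OCR, layout analysis, formula recognition, reading order recovery, and some of the table recognition models). All of them have been converted to ONNX format for efficient inference on CPU/GPU.

Custom OCR, formula, and table models are also supported: implement the batch_predict method of CustomBaseModel. Integration with the PaddleOCRVL model series is currently built in.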

RapidDoc has become a member of the RapidAI open-source family.

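✨ If this project helps you, your star is what keeps me improving it!

Source code is available on GitHub (https://github.com/RapidAI/RapidDoc) and on Gitee.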

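👏 Features

OCR
- Uses RapidOCR, which supports multiple inference engines
- On CPU the default engine is OpenVINO (fast, but with higher memory usage); on GPU the default is torch
Layout analysis
- Uses PP-DocLayout series ONNX models (V3, V2, plus-L, L, M, S)
  - PP-DocLayoutV3: built-in reading order, supports irregular (non-rectangular) boxes; the default
  - PP-DocLayoutV2: built-in reading order
  - PP-DocLayout_plus-L: good accuracy, runs stably
  - PP-DocLayout-L: fast, with good accuracy
  - PP-DocLayout-S: extremely fast, but misses some detections
Formula recognition
- Uses PP-FormulaNet_plus series ONNX models (L, M, S)
  - PP-FormulaNet_plus-L: slow; supports onnx
  - PP-FormulaNet_plus-M: the default; supports onnx and torch
  - PP-FormulaNet_plus-S: fastest; supports onnx; not accurate enough on complex formulas
- Can be configured to recognize display (block) formulas only
- In a CUDA environment, torch is used for inference by default, because ONNX GPU inference of the formula models raises errors. This is currently unresolved; see PaddleOCR/issues/15125, PaddleX/issues/4238, Paddle2ONNX/issues/1593
Table recognition
- Built on rapid_table_self and extended into a multi-model cascade:
  - Table classification (distinguishes wired from wireless tables)
  - UNET for wired tables, plus SLANET_plus/UNITABLE for wireless tables
Reading order recovery
- PP-DocLayoutV2 and PP-DocLayoutV3 use the reading order built into the layout model
- Other layout models use the PP-StructureV3 reading order recovery algorithm, which is based on the xycut algorithm and the layout results
Inference
- All models support ONNXRuntime inference; OCR can be configured to use other inference engines
- With OpenVINO, every model other than OCR and PP-DocLayout-M/S raises inference errors, which is hard to fix for now. See PaddleOCR/issues/16277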
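
The reading order item above mentions the xycut algorithm. As a rough illustration of the idea (not the PP-StructureV3 implementation, which this README does not show), the sketch below orders bounding boxes by recursively cutting the page along whitespace gaps. The names `xy_cut_order` and `_split_by_gaps`, and the choice to try horizontal cuts before vertical cuts, are assumptions made for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), y grows downward


def _split_by_gaps(boxes: List[Box], indices: List[int], axis: int) -> List[List[int]]:
    """Group indices into bands separated by empty gaps along one axis.

    axis=0 projects onto x (vertical cuts); axis=1 projects onto y (horizontal cuts).
    """
    ordered = sorted(indices, key=lambda i: boxes[i][axis])
    groups = [[ordered[0]]]
    reach = boxes[ordered[0]][axis + 2]  # far edge of the current band
    for i in ordered[1:]:
        if boxes[i][axis] >= reach:  # a gap: start a new band
            groups.append([i])
        else:
            groups[-1].append(i)
        reach = max(reach, boxes[i][axis + 2])
    return groups


def xy_cut_order(boxes: List[Box]) -> List[int]:
    """Return box indices in an estimated reading order."""

    def recurse(indices: List[int]) -> List[int]:
        if len(indices) <= 1:
            return list(indices)
        for axis in (1, 0):  # try horizontal cuts first, then vertical cuts
            groups = _split_by_gaps(boxes, indices, axis)
            if len(groups) > 1:
                return [i for group in groups for i in recurse(group)]
        # No clean cut exists: fall back to top-to-bottom, left-to-right.
        return sorted(indices, key=lambda i: (boxes[i][1], boxes[i][0]))

    return recurse(list(range(len(boxes))))


# A full-width title followed by a two-column body.
page = [
    (0, 0, 100, 10),    # 0: title
    (55, 20, 100, 50),  # 1: right column, top
    (0, 20, 45, 40),    # 2: left column, top
    (0, 45, 45, 80),    # 3: left column, bottom
    (55, 55, 100, 80),  # 4: right column, bottom
]
print(xy_cut_order(page))  # [0, 2, 3, 1, 4]
```

The title is separated first by a horizontal cut, then the body is split into columns by a vertical cut, so the left column is read in full before the right one. Real layouts with overlapping or skewed boxes need extra handling that this sketch leaves out.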

Benchmark results

1. OmniDocBench

The following are RapidDoc's evaluation results on OmniDocBench v1.6.

The pipeline uses PP-DocLayoutV3, PP-OCRv6-small, PP-FormulaNet_plus-M, and UNET_SLANET_PLUS.

Comprehensive evaluation of document parsing on OmniDocBench (v1.6_full)

| Methods | Model Type | Size | Overall↑ | Text Edit↓ | Formula CDM↑ | Table TEDS↑ | Table TEDS-S↑ | Read Order Edit↓ |
|---|---|---|---|---|---|---|---|---|
| MinerU2.5-Pro | Specialized VLMs | 1.2B | 95.75 | 0.036 | 97.45 | 93.42 | 95.92 | 0.120 |
| GLM-OCR | Specialized VLMs | 0.9B | 95.22 | 0.044 | 97.18 | 92.83 | 95.39 | 0.133 |
| PaddleOCR-VL-1.5 | Specialized VLMs | 0.9B | 94.93 | 0.038 | 96.89 | 91.67 | 94.37 | 0.130 |
| PaddleOCR-VL | Specialized VLMs | 0.9B | 94.18 | 0.040 | 95.91 | 90.65 | 93.74 | 0.135 |
| Youtu-Parsing | Specialized VLMs | 2.5B | 93.74 | 0.044 | 93.63 | 92.02 | 95.00 | 0.116 |
| Qianfan-OCR | Specialized VLMs | 4B | 93.90 | 0.04 | 95.08 | 90.53 | 93.31 | 0.13 |
| Ovis2.6-30B-A3B | General VLMs | 30B | 93.70 | 0.035 | 95.17 | 89.44 | 92.40 | 0.135 |
| Logics-Parsing-v2 | Specialized VLMs | 4B | 93.33 | 0.041 | 95.65 | 88.42 | 91.98 | 0.137 |
| ABot-OCR | Specialized VLMs | 2B | 93.30 | 0.037 | 94.86 | 88.69 | 91.87 | 0.137 |
| FireRed-OCR | Specialized VLMs | 2B | 93.26 | 0.037 | 95.44 | 88.04 | 91.06 | 0.131 |
| MinerU-2.5 | Specialized VLMs | 1.2B | 93.04 | 0.045 | 95.77 | 87.88 | 91.47 | 0.130 |
| Gemini 3 Pro | General VLMs | - | 92.91 | 0.064 | 95.99 | 89.15 | 92.96 | 0.165 |
| Gemini 3 Flash | General VLMs | - | 92.62 | 0.066 | 95.16 | 89.29 | 93.51 | 0.172 |
| dots.ocr | Specialized VLMs | 3B | 90.77 | 0.048 | 89.95 | 87.18 | 90.58 | 0.138 |
| OpenDoc-0.1B | Specialized VLMs | 0.1B | 90.67 | 0.049 | 93.02 | 83.88 | 87.45 | 0.140 |
| DeepSeek-OCR 2 | Specialized VLMs | 3B | 90.25 | 0.050 | 91.84 | 83.89 | 87.75 | 0.144 |
| RapidDoc | Pipeline Tools | - | 90.157 | 0.047 | 93.777 | 81.394 | 88.402 | 0.136 |
| HunyuanOCR | Specialized VLMs | 1B | 89.95 | 0.088 | 87.68 | 91.01 | 93.23 | 0.171 |
| Qwen3-VL-235B | General VLMs | 235B | 89.78 | 0.063 | 92.55 | 83.07 | 86.75 | 0.166 |
| Dolphin-v2 | Specialized VLMs | 3B | 89.50 | 0.069 | 91.01 | 84.40 | 87.44 | 0.150 |
| OCRVerse | Specialized VLMs | 4B | 88.60 | 0.063 | 89.61 | 82.44 | 86.27 | 0.163 |
| MonkeyOCR-pro-3B | Specialized VLMs | 3B | 88.57 | 0.074 | 88.74 | 84.35 | 88.62 | 0.189 |
| GPT-5.2 | General VLMs | - | 86.59 | 0.114 | 88.21 | 82.95 | 87.93 | 0.193 |
| Dolphin-1.5 | Specialized VLMs | 0.3B | 86.52 | 0.094 | 87.49 | 81.43 | 84.82 | 0.167 |
| MinerU-Pipeline | Pipeline Tools | - | 86.47 | 0.055 | 83.07 | 81.88 | 88.68 | 0.153 |
| olmOCR | Specialized VLMs | 7B | 85.74 | 0.139 | 88.10 | 83.00 | 87.17 | 0.216 |
| Mistral OCR | Specialized VLMs | - | 85.66 | 0.097 | 89.91 | 76.78 | 80.93 | 0.171 |
| Kimi K2.5 | General VLMs | 1T | 84.53 | 0.107 | 83.50 | 80.76 | 84.00 | 0.211 |
| InternVL3.5-241B | General VLMs | 241B | 83.76 | 0.130 | 89.95 | 74.35 | 79.78 | 0.215 |
| Nanonets-OCR-s | Specialized VLMs | 3B | 83.61 | 0.108 | 81.46 | 80.18 | 84.51 | 0.213 |
| POINTS-Reader | Specialized VLMs | 3B | 83.37 | 0.096 | 85.72 | 73.98 | 77.40 | 0.198 |
| Marker | Pipeline Tools | - | 78.44 | 0.157 | 85.24 | 65.77 | 73.24 | 0.243 |
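
The README does not define how the Overall column is computed. Assuming it follows the definition published with OmniDocBench, Overall = ((1 - text edit distance) × 100 + formula CDM + table TEDS) / 3, the RapidDoc row can be checked with a few lines of Python. The function name `overall_score` is made up for this sketch.

```python
def overall_score(text_edit: float, formula_cdm: float, table_teds: float) -> float:
    """Average of the text, formula, and table scores on a 0-100 scale.

    Assumes an OmniDocBench-style definition: the text edit distance
    (0-1, lower is better) is converted to a 0-100 score before averaging.
    """
    text_score = (1.0 - text_edit) * 100.0
    return (text_score + formula_cdm + table_teds) / 3.0


# Values taken from the RapidDoc row of the table above.
rapiddoc = overall_score(text_edit=0.047, formula_cdm=93.777, table_teds=81.394)
print(round(rapiddoc, 3))  # 90.157
```

Under that assumption the result matches the 90.157 reported for RapidDoc. Rows for other tools can differ in the last digit, since their text edit distances are listed with only two or three decimals.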

🛠️ Installing RapidDoc

Install with pip

pip install rapid-doc[cpu] -i https://mirrors.aliyun.com/pypi/simple
# or
pip install rapid-doc[gpu] -i https://mirrors.aliyun.com/pypi/simple

Install from source

# Clone the repository
git clone https://github.com/RapidAI/RapidDoc.git
cd RapidDoc

# Install dependencies
pip install -e .[cpu] -i https://mirrors.aliyun.com/pypi/simple
# or
pip install -e .[gpu] -i https://mirrors.aliyun.com/pypi/simple

GPU inference

# rapid-doc[gpu] installs the latest onnxruntime-gpu by default
# Make sure the onnxruntime-gpu version matches your GPU and CUDA setup; see https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements

# Select the GPU in Python (must be set before importing rapid_doc)
import os
# Use the default GPU (cuda:0)
os.environ['MINERU_DEVICE_MODE'] = "cuda"
# Or pick a GPU by index instead, e.g. the second GPU (cuda:1)
# os.environ['MINERU_DEVICE_MODE'] = "cuda:1"
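
Because `MINERU_DEVICE_MODE` has to be set before `rapid_doc` is imported, it can help to wrap the choice in a small helper that runs at the very top of a script. This is only a sketch: `set_device_mode` is a hypothetical helper, not part of RapidDoc, and it relies solely on the `cuda` and `cuda:N` values shown above.

```python
import os


def set_device_mode(gpu_index: int = 0) -> str:
    """Set MINERU_DEVICE_MODE for the given GPU index and return the value.

    Hypothetical helper: call it before importing rapid_doc.
    """
    if gpu_index < 0:
        raise ValueError("gpu_index must be non-negative")
    # Index 0 maps to plain "cuda", matching the default-GPU example above.
    mode = "cuda" if gpu_index == 0 else f"cuda:{gpu_index}"
    os.environ["MINERU_DEVICE_MODE"] = mode
    return mode


set_device_mode(1)  # select the second GPU
# from rapid_doc import RapidDoc  # import only after the variable is set
```

Keeping the assignment in one place makes it harder to import `rapid_doc` by accident before the device has been chosen.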

Inference with the PaddleOCRVL series

To deploy the VL model, refer to the official documentation.

import os
os.environ['PADDLEOCRVL_VERSION'] = "v1.6"
os.environ['PADDLEOCRVL_VL_REC_BACKEND'] = "vllm-server"
os.environ['PADDLEOCRVL_VL_REC_SERVER_URL'] = "http://localhost:8118/v1"

from rapid_doc.model.layout.rapid_layout_self import ModelType as LayoutModelType
from rapid_doc.model.custom.paddleocr_vl.paddleocr_vl import PaddleOCRVLTableModel, PaddleOCRVLOCRModel, PaddleOCRVLFormulaModel
layout_config = {
    "model_type": LayoutModelType.PP_DOCLAYOUTV3,
}
ocr_config = {
    "custom_model": PaddleOCRVLOCRModel()
}
formula_config = {
    "custom_model": PaddleOCRVLFormulaModel()
}
table_config = {
    "custom_model": PaddleOCRVLTableModel()
}

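Deploying RapidDoc with Docker

RapidDoc provides a convenient Docker deployment, which helps set up the environment quickly and avoids some tricky compatibility issues.

Docker deployment instructions are available in the documentation, and the image has been pushed to Docker Hub.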

📋 Usage

import os
from pathlib import Path
from rapid_doc import RapidDoc

# Parent of the directory that contains this script
__dir__ = Path(__file__).resolve().parent.parent
output_dir = os.path.join(__dir__, "output")

# PDF and Office documents can be passed in the same call
doc_path_list = [
    __dir__ / "demo/pdfs/示例1-论文模板.pdf",
    __dir__ / "demo/docx/test.docx",
]
engine = RapidDoc()
outputs = engine(doc_path_list, output_dir=output_dir)
for output in outputs:
    print(output.markdown)  # Markdown text of each parsed document
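
The loop above only prints each result. As a sketch of one way to keep the results, the helper below writes every output's Markdown to its own file. It assumes nothing about the output objects beyond the `.markdown` attribute used above; `save_markdown` and the `doc_<i>.md` naming are hypothetical choices for this example, not RapidDoc API.

```python
from pathlib import Path
from typing import Iterable, List


def save_markdown(outputs: Iterable, target_dir: str) -> List[Path]:
    """Write each output's `.markdown` text to target_dir/doc_<i>.md."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for i, output in enumerate(outputs):
        path = target / f"doc_{i}.md"
        path.write_text(output.markdown, encoding="utf-8")
        written.append(path)
    return written
```

With the variables from the usage example, this would be called as `save_markdown(outputs, output_dir)`.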

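Online demo

Gradio-based online demo

A web UI built with Gradio. The interface is minimal, includes only the core parsing features, and requires no login.

📋 Usage examples

Model download

If no model path is specified, the models are downloaded automatically on the first run.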

📌 TODO

🙏 Acknowledgements

Star History

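⚖️ License

This project is adapted from MinerU. The YOLO models in the original project have been removed and replaced with PP-StructureV3 series ONNX models.
Because the AGPL-licensed YOLO model components have been removed, the project as a whole is no longer bound by the AGPL.

The project is released under the Apache 2.0 license.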
