1 change: 1 addition & 0 deletions .gitignore
@@ -11,6 +11,7 @@ __pycache__/

# Local runtime config (use template file for repo)
crawler_config.json
crawler_verse.json

# Local temporary configs
tmp_*.json
10 changes: 7 additions & 3 deletions README.md
@@ -7,12 +7,14 @@

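## Features

- Supports crawling common documentation sites (VitePress / Docusaurus layouts)
- Supports crawling common documentation sites (VitePress / Docusaurus / MkDocs layouts)
- Automatically detects multiple documentation root paths (such as `/guide` and `/components`)
- Supports sites where the root path returns 404 but leaf pages are reachable
- Keeps the navigation structure as close as possible to the source site's hierarchy (first and second level)
- Automatically translates sidebar titles into Chinese (covers common technical terms)
- Detailed logging and keep-alive output (current stage, URL, progress, rate, ETA)
- One-click preview of the local HTML documentation site
- One-click preview of the local HTML documentation site (supports dark/light mode)
- Supports exporting the crawl results as a single self-contained HTML file
- Outputs a standard Markdown directory for easy downstream processing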

---
@@ -74,8 +76,10 @@ run_crawler.bat
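Menu:

- `1` Crawl and build the site
- `2` Preview the local site
- `2` Preview the local site (includes dark mode toggle)
- `3` View status
- `4` Export single-file HTML
- `5` Switch config file (`crawler*.json`)
- `0` Exit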

---
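
Menu option `5` selects among files matching `crawler*.json`, which is also why `crawler_verse.json` sits next to `crawler_config.json` in `.gitignore`. The diff does not show how the script implements this, so the following is only a minimal sketch of one way such a lookup could work. The function names `list_crawler_configs` and `pick_config` are hypothetical and do not come from the repository.

```python
from pathlib import Path


def list_crawler_configs(directory: str = ".") -> list[str]:
    """Return the names of files matching crawler*.json, sorted alphabetically.

    Hypothetical helper: the real script may discover configs differently.
    """
    root = Path(directory)
    return sorted(p.name for p in root.glob("crawler*.json") if p.is_file())


def pick_config(names: list[str], choice: int) -> str:
    """Map a 1-based menu choice to a config file name.

    Raises ValueError when the choice is outside the listed range.
    """
    if not 1 <= choice <= len(names):
        raise ValueError(f"choice must be between 1 and {len(names)}")
    return names[choice - 1]
```

With this pattern, temporary files such as `tmp_*.json` are not offered as choices, because they do not start with `crawler`.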