Agent: add model parameter estimation and dynamic size filtering by luotao1 · Pull Request #719 · PaddlePaddle/GraphNet

luotao1 · 2026-05-20T10:51:57Z

PR Category

Feature Enhancement

Description

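Problem

Very large models (e.g. 26B+ parameters) time out during extraction because loading random weights consumes a large amount of memory. A typical case:

01-ai/Yi-34B (estimated at 39.7B parameters): loading random fp32 weights requires ~159GB of RAM; extraction on CPU blocks for hours or hangs indefinitely
Previous flow: download the full model files → attempt to load → fail from insufficient memory or a timeout; the whole process could take anywhere from tens of minutes to several hours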
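
The ~159GB figure follows from simple arithmetic: fp32 stores each parameter in 4 bytes, so every billion parameters costs roughly 4GB (decimal) for the weights alone. A minimal sketch of that calculation; the helper name `fp32_weight_gb` is made up for illustration and is not part of GraphNet:

```python
BYTES_PER_FP32_PARAM = 4


def fp32_weight_gb(param_count_billion: float) -> float:
    """Approximate RAM in decimal GB needed to hold fp32 weights only."""
    # 1e9 parameters * 4 bytes = 4e9 bytes = 4 GB per billion parameters.
    return param_count_billion * BYTES_PER_FP32_PARAM


# 39.7B is the estimate this PR reports for 01-ai/Yi-34B.
print(f"{fp32_weight_gb(39.7):.0f} GB")  # prints "159 GB"
```

This counts weights only; activations and framework overhead would come on top.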

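Changes

1. Model parameter count estimation (`config_metadata_analyzer.py`)

_estimate_param_count_billion(config): a rough estimation formula based on the Transformer architecture
- Covers attention (4×hidden²), FFN (3×hidden×intermediate, SwiGLU style), and embedding (2×vocab×hidden)
- MoE models: all expert weights count toward the total parameter count (only some experts are activated per token, but all weights must still be loaded into memory)
_estimate_oom_risk(config): classifies OOM risk based on parameter count
- >7B → high
- >3B → medium
- Also considers context length (>65536 → high, >16384 → medium) and an activation estimate
analyze(max_param_b): checks the parameter count during the analysis stage (after config.json is parsed)
- Over the limit: raises MetadataAnalysisError immediately, and the status is recorded as EXTRACT_FAILED
- The model is rejected without being loaded, saving a great deal of time and memory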
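
The estimation terms and risk tiers above can be sketched as follows. This is an illustrative reconstruction, not the PR's actual code. It assumes the attention and FFN terms apply per layer, that the config uses Hugging Face style keys (`hidden_size`, `intermediate_size`, `num_hidden_layers`, `vocab_size`, `max_position_embeddings`), and it invents a `num_experts` key and a `"low"` fallback tier for the example. The activation estimate is left out.

```python
from typing import Any, Mapping


def estimate_param_count_billion(config: Mapping[str, Any]) -> float:
    """Rough parameter count in billions, using the terms listed in the PR."""
    hidden = config["hidden_size"]
    layers = config["num_hidden_layers"]
    vocab = config["vocab_size"]
    # Assumption: fall back to a 4x expansion when the FFN width is missing.
    intermediate = config.get("intermediate_size", 4 * hidden)
    # Hypothetical key: real configs name the expert count differently per family.
    experts = config.get("num_experts", 1)

    attention = 4 * hidden * hidden            # Q, K, V and output projections
    ffn = 3 * hidden * intermediate * experts  # SwiGLU: gate, up, down; all experts counted
    embedding = 2 * vocab * hidden             # input embedding plus output head

    return (layers * (attention + ffn) + embedding) / 1e9


def estimate_oom_risk(config: Mapping[str, Any]) -> str:
    """Tier OOM risk by parameter count and context length."""
    params_b = estimate_param_count_billion(config)
    context = config.get("max_position_embeddings", 0)
    if params_b > 7 or context > 65536:
        return "high"
    if params_b > 3 or context > 16384:
        return "medium"
    return "low"  # assumed fallback; the PR only names "high" and "medium"


# Config values assumed for illustration, approximating 01-ai/Yi-34B.
yi_34b_like = {
    "hidden_size": 7168,
    "intermediate_size": 20480,
    "num_hidden_layers": 60,
    "vocab_size": 64000,
    "max_position_embeddings": 4096,
}
print(round(estimate_param_count_billion(yi_34b_like), 1))  # 39.7
print(estimate_oom_risk(yi_34b_like))                       # high
```

With these assumed values the sketch lands on the same 39.7B figure the PR reports for Yi-34B, which suggests the per-layer reading is at least plausible.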

2. Agent parameter pass-through (`graph_net_agent.py`)

The constructor gains a new parameter max_model_size_b: float = 20.0
_analyze_model() passes max_param_b through to the analyzer
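
The pass-through can be pictured with stand-in classes. Everything here is hypothetical scaffolding: `StubAnalyzer`, `StubAgent`, the `extract` method and the `"ANALYZED"` status are invented for the example, and the analyzer takes a precomputed estimate instead of parsing config.json. Only the `max_model_size_b` default, the `max_param_b` argument, the `MetadataAnalysisError` name and the `EXTRACT_FAILED` status come from the PR description.

```python
class MetadataAnalysisError(Exception):
    """Stand-in for the analyzer error type named in the PR."""


class StubAnalyzer:
    """Hypothetical analyzer that rejects models over the size limit."""

    def analyze(self, estimated_param_b: float, max_param_b: float) -> dict:
        if estimated_param_b > max_param_b:
            raise MetadataAnalysisError(
                f"Model too large to extract: estimated {estimated_param_b:.1f}B "
                f"parameters (limit {max_param_b:.1f}B)."
            )
        return {"estimated_param_b": estimated_param_b}


class StubAgent:
    """Hypothetical agent that forwards its size limit to the analyzer."""

    def __init__(self, max_model_size_b: float = 20.0) -> None:
        self.max_model_size_b = max_model_size_b
        self.analyzer = StubAnalyzer()

    def extract(self, model_id: str, estimated_param_b: float) -> str:
        try:
            self.analyzer.analyze(estimated_param_b, max_param_b=self.max_model_size_b)
        except MetadataAnalysisError:
            return "EXTRACT_FAILED"  # rejected before any weights are loaded
        return "ANALYZED"  # placeholder; real extraction would continue from here


agent = StubAgent(max_model_size_b=20.0)
print(agent.extract("01-ai/Yi-34B", 39.7))        # EXTRACT_FAILED
print(agent.extract("sshleifer/tiny-gpt2", 0.2))  # ANALYZED
```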

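3. CLI argument (`parallel_extract.py`)

Adds a --max-model-size-b argument (default "auto")
- "auto": computed automatically as total_RAM(GB) × 0.7 / num_workers / 4
- Can also be set manually, e.g. --max-model-size-b 10 sets the limit to 10B
The computed limit is printed at startup for easy confirmation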
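
A small sketch of how the "auto" value could be resolved. The function name `resolve_max_model_size_b` is hypothetical, and total RAM is passed in rather than detected so the example stays standard-library only. Reading the 0.7 as a headroom factor and the 4 as bytes per fp32 parameter is an interpretation; the PR states only the formula.

```python
def resolve_max_model_size_b(value: str, total_ram_gb: float, num_workers: int) -> float:
    """Resolve a --max-model-size-b style value to a limit in billions of parameters."""
    if value != "auto":
        return float(value)  # manual override, e.g. "10" -> 10.0
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    usable_gb = total_ram_gb * 0.7           # assumed: keep 30% of RAM as headroom
    per_worker_gb = usable_gb / num_workers  # workers share the memory budget
    return per_worker_gb / 4                 # assumed: 4 bytes per fp32 parameter


print(resolve_max_model_size_b("auto", total_ram_gb=256, num_workers=4))  # 11.2
print(resolve_max_model_size_b("10", total_ram_gb=256, num_workers=4))    # 10.0
```

Under this formula a 256GB machine running 4 workers would accept models up to about 11.2B parameters each, so adding workers lowers the per-model limit.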

Validation

Parameter estimate accuracy

from graph_net.agent.metadata_analyzer.config_metadata_analyzer import ConfigMetadataAnalyzer

ConfigMetadataAnalyzer._estimate_param_count_billion(config)

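| Model | Estimated parameters | Official parameters |
| --- | --- | --- |
| 01-ai/Yi-34B | 39.7B | 34B |
| sshleifer/tiny-gpt2 | 0.2B | ~0.02B |

(The rough formula errs on the conservative side, i.e. it overestimates, which is the safe direction for very large models.)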

End-to-end filtering test

python graph_net/agent/parallel_extract.py \
    --model-list /tmp/test_big_model.txt \
    --workspace /tmp/graphnet_test_big \
    --cpu-workers 1 \
    --max-model-size-b 20

01-ai/Yi-34B (39.7B parameters, limit=20B):

# Output
[INFO] max_model_size_b=20.0B (manually set)
[Worker-0 CPU] Extracting: 01-ai/Yi-34B
...
2026-05-20 18:46:47,234 - GraphNetAgent - ERROR - Extraction failed for 01-ai/Yi-34B: 
  Model too large to extract: estimated 39.7B parameters (limit 20.0B). 
  Loading random fp32 weights would require ~159GB RAM.
[Worker-0 CPU] EXTRACT FAILED 01-ai/Yi-34B (8.6s)

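Total time was only 12 seconds (8.6s for analysis + rejection), avoiding the download of 159GB of weights and the hours of extraction that would follow
The status is correctly recorded as EXTRACT_FAILED, and the model is filed under the failed directory
sshleifer/tiny-gpt2 (0.2B parameters, limit=20B): passes analysis normally and is extracted successfully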

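Impact

Very large models are rejected at the analysis stage with no need to load weights, turning "hours-long timeouts" into "failures within seconds"
Worker resources are no longer blocked by very large models, improving overall pipeline throughput
--max-model-size-b=auto adapts to the machine configuration, avoiding manual tuning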

- ConfigMetadataAnalyzer: add _estimate_param_count_billion() for rough param count estimation; analyze() now takes max_param_b to reject oversized models before extraction (prevents timeouts on 26B+ models)
- ConfigMetadataAnalyzer: add _estimate_oom_risk() based on param count (>7B→high, >3B→medium)
- GraphNetAgent: add max_model_size_b parameter and pass to analyzer
- parallel_extract: add --max-model-size-b CLI arg (default 'auto'), auto-calculates from total_RAM × 0.7 / workers / 4

paddle-bot · 2026-05-20T10:52:08Z

Thanks for your contribution!

Xreki approved these changes May 21, 2026

luotao1 merged commit 962ca5f into PaddlePaddle:develop May 21, 2026
3 checks passed

luotao1 deleted the param-estimation branch May 21, 2026 02:12

luotao1 merged 1 commit into PaddlePaddle:develop from luotao1:param-estimation
