Agent: add model parameter estimation and dynamic size filtering #719

Merged
luotao1 merged 1 commit into PaddlePaddle:develop from luotao1:param-estimation
May 21, 2026

@luotao1 luotao1 commented May 20, 2026

PR Category

Feature Enhancement

Description

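Problem

Extracting very large models (e.g. 26B+ parameters) times out because loading random weights consumes a large amount of memory. A typical case:

  • 01-ai/Yi-34B (estimated 39.7B parameters): loading random fp32 weights needs ~159GB of RAM; extraction on CPU blocks for hours or hangs indefinitely
  • Previous flow: download the full model files → try to load → fail from insufficient memory or a timeout; the whole process could take tens of minutes to hours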

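Changes

1. Model parameter count estimation (config_metadata_analyzer.py)

  • _estimate_param_count_billion(config): a rough estimation formula based on the Transformer architecture
    • Covers attention (4×hidden²), FFN (3×hidden×intermediate, SwiGLU style), and embedding (2×vocab×hidden)
    • MoE models: all expert weights count toward the total (each token activates only some experts, but all weights must be loaded into memory)
  • _estimate_oom_risk(config): OOM risk grading based on parameter count
    • >7B → high
    • >3B → medium
    • Also considers context length (>65536 → high, >16384 → medium) and an activation estimate
  • analyze(max_param_b): checks the parameter count during the analysis stage (after config.json is parsed)
    • If the limit is exceeded, it raises MetadataAnalysisError immediately and the status is recorded as EXTRACT_FAILED
    • The model is rejected without being loaded, saving substantial time and memory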
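The estimation and rejection logic described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the function names, the `num_local_experts` key used for MoE models, the `"low"` tier name, and the local `MetadataAnalysisError` stand-in are all assumptions, and the activation estimate mentioned above is omitted. The sample config values are assumed to match 01-ai/Yi-34B's published config.json.

```python
from typing import Any, Dict


class MetadataAnalysisError(Exception):
    """Local stand-in for the exception type named in the PR."""


def estimate_param_count_billion(config: Dict[str, Any]) -> float:
    """Rough parameter count (in billions) from config.json-style fields."""
    hidden = config["hidden_size"]
    intermediate = config["intermediate_size"]
    layers = config["num_hidden_layers"]
    vocab = config["vocab_size"]
    # Assumed key for the MoE expert count; dense models fall back to 1.
    experts = config.get("num_local_experts", 1)

    attention = 4 * hidden * hidden            # Q, K, V, O projections
    ffn = 3 * hidden * intermediate * experts  # gate/up/down (SwiGLU), all experts
    embedding = 2 * vocab * hidden             # input embedding + output head
    return (layers * (attention + ffn) + embedding) / 1e9


def estimate_oom_risk(param_b: float, context_length: int = 0) -> str:
    """Grade OOM risk using the thresholds listed in the PR description."""
    if param_b > 7 or context_length > 65536:
        return "high"
    if param_b > 3 or context_length > 16384:
        return "medium"
    return "low"  # tier name assumed; the PR only names high and medium


def check_size_limit(config: Dict[str, Any], max_param_b: float = 20.0) -> float:
    """Reject an oversized model using only its config, before any weights load."""
    param_b = estimate_param_count_billion(config)
    if param_b > max_param_b:
        ram_gb = param_b * 4  # fp32 uses 4 bytes per parameter
        raise MetadataAnalysisError(
            f"Model too large to extract: estimated {param_b:.1f}B parameters "
            f"(limit {max_param_b:.1f}B). "
            f"Loading random fp32 weights would require ~{ram_gb:.0f}GB RAM."
        )
    return param_b


# Values assumed to match 01-ai/Yi-34B's config.json.
yi_34b_like = {
    "hidden_size": 7168,
    "intermediate_size": 20480,
    "num_hidden_layers": 60,
    "vocab_size": 64000,
}

if __name__ == "__main__":
    print(f"{estimate_param_count_billion(yi_34b_like):.1f}B")
```

Under these assumed config values the formula evaluates to about 39.7B, which is consistent with the estimate reported for 01-ai/Yi-34B in this PR, and 39.7B × 4 bytes gives the ~159GB fp32 figure.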

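2. Agent parameter pass-through (graph_net_agent.py)

  • The constructor gains max_model_size_b: float = 20.0
  • _analyze_model() passes max_param_b through to the analyzer

3. CLI argument (parallel_extract.py)

  • Adds the --max-model-size-b argument (default "auto")
    • "auto": computed automatically as total_RAM(GB) × 0.7 / num_workers / 4
    • Can also be set manually, e.g. --max-model-size-b 10 sets the limit to 10B
  • The computed limit is printed at startup so it can be confirmed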
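The "auto" rule can be sketched as below. This is a hedged illustration only: the function name and signature are hypothetical, and total RAM is passed in as a parameter rather than detected. The division by 4 presumably reflects 4 bytes per fp32 parameter, so a per-worker RAM budget in GB converts directly into billions of parameters.

```python
def resolve_max_model_size_b(value: str, total_ram_gb: float, num_workers: int) -> float:
    """Resolve a --max-model-size-b value to a limit in billions of parameters.

    Hypothetical helper: "auto" applies total_RAM(GB) x 0.7 / num_workers / 4,
    and any other value is parsed as an explicit limit.
    """
    if value != "auto":
        return float(value)
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    usable_gb = total_ram_gb * 0.7          # keep 30% of RAM as headroom
    per_worker_gb = usable_gb / num_workers  # each worker loads its own model
    return per_worker_gb / 4                 # 4 bytes per fp32 parameter


if __name__ == "__main__":
    limit = resolve_max_model_size_b("auto", total_ram_gb=256, num_workers=4)
    print(f"max_model_size_b={limit:.1f}B (auto)")
```

For example, a machine with 256GB of RAM and 4 workers gets 256 × 0.7 / 4 / 4 = 11.2B per worker, so a model estimated at 39.7B would be rejected on every worker.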

Verification

Parameter estimation accuracy

from graph_net.agent.metadata_analyzer.config_metadata_analyzer import ConfigMetadataAnalyzer

ConfigMetadataAnalyzer._estimate_param_count_billion(config)
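| Model | Estimated parameters | Official parameters |
| --- | --- | --- |
| 01-ai/Yi-34B | 39.7B | 34B |
| sshleifer/tiny-gpt2 | 0.2B | ~0.02B |

(The rough formula errs on the conservative side, overestimating in both cases, which is safe for very large models.)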

End-to-end filtering test

python graph_net/agent/parallel_extract.py \
    --model-list /tmp/test_big_model.txt \
    --workspace /tmp/graphnet_test_big \
    --cpu-workers 1 \
    --max-model-size-b 20

01-ai/Yi-34B (39.7B parameters, limit=20B):

# Output
[INFO] max_model_size_b=20.0B (manually set)
[Worker-0 CPU] Extracting: 01-ai/Yi-34B
...
2026-05-20 18:46:47,234 - GraphNetAgent - ERROR - Extraction failed for 01-ai/Yi-34B: 
  Model too large to extract: estimated 39.7B parameters (limit 20.0B). 
  Loading random fp32 weights would require ~159GB RAM.
[Worker-0 CPU] EXTRACT FAILED 01-ai/Yi-34B (8.6s)
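  • Took only 12 seconds (8.6s for analysis + rejection), avoiding the download of 159GB of weights and the hours of extraction that would follow
  • The status is correctly recorded as EXTRACT_FAILED and the model is filed under the failure directory
  • sshleifer/tiny-gpt2 (0.2B parameters, limit=20B): passes analysis normally and is extracted successfully

Effect

  • Very large models are rejected during analysis without loading any weights, going from "timeout after hours" to "failure within seconds"
  • Worker resources are no longer blocked by very large models, so overall pipeline throughput improves
  • --max-model-size-b=auto adapts to the machine configuration, avoiding manual tuning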

- ConfigMetadataAnalyzer: add _estimate_param_count_billion() for rough
  param count estimation; analyze() now takes max_param_b to reject oversized
  models before extraction (prevents timeouts on 26B+ models)
- ConfigMetadataAnalyzer: add _estimate_oom_risk() based on param count
  (>7B→high, >3B→medium)
- GraphNetAgent: add max_model_size_b parameter and pass to analyzer
- parallel_extract: add --max-model-size-b CLI arg (default 'auto'),
  auto-calculates from total_RAM × 0.7 / workers / 4

paddle-bot Bot commented May 20, 2026

Thanks for your contribution!

@luotao1 luotao1 merged commit 962ca5f into PaddlePaddle:develop May 21, 2026
3 checks passed
@luotao1 luotao1 deleted the param-estimation branch May 21, 2026 02:12