Agent: reorganize workspace directory structure #714
Merged
- workspace_manager.py: add success_dir, failed_dir, logs_and_lists_dir
- graph_net_agent.py: auto-move samples to success/ or failed/ after extraction
- parallel_extract.py: output JSON to logs_and_lists/ instead of workspace root

- Set GRAPH_NET_EXTRACT_WORKSPACE=workspace/samples/ in subprocess env
- _get_workspace_path() defaults to samples/ subdir
- Prevents clutter in workspace root from redundant model directories
Xreki
approved these changes
May 19, 2026
PR Category
Feature Enhancement
Description
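Agent: reorganize the workspace directory structure
Background
After a batch extraction run, model directories and result files end up mixed together in the workspace root.
This makes later analysis, reruns, and file management inconvenient.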
Changes
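1. graph_net/agent/utils/workspace_manager.py
Adds three directory attributes:
- success_dir: workspace/success/, holds model samples that were extracted successfully
- failed_dir: workspace/failed/, holds model directories whose extraction failed
- logs_and_lists_dir: workspace/logs_and_lists/, holds result JSON files and model lists
_ensure_directories() creates all three directories automatically during initialization.
2. graph_net/agent/graph_net_agent.py
- extract_sample() tracks sample_dir and, once extraction finishes, moves it to the matching directory, success/ or failed/ (overwriting the destination if it already exists).
- _move_sample() is a helper method that wraps shutil.move together with the overwrite logic.
- is_duplicate_sample() now scans both success_dir and samples_dir, so samples that succeeded in earlier runs are still caught by the deduplication check.
- _is_llm_fixable_error() decides whether a failure is worth retrying with the LLM.
3. graph_net/agent/parallel_extract.py
The default output path for the batch-extraction result JSON moves from the workspace root to the logs_and_lists/ directory.
Result files are archived in one place, with no manual cleanup needed.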
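The directory layout, the overwrite-on-move helper, and the two-directory duplicate check described in items 1 and 2 can be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: the class name WorkspaceLayout, the function signatures, and the demo file names are hypothetical, and only the directory names and the use of shutil.move come from the description.

```python
import shutil
import tempfile
from pathlib import Path


class WorkspaceLayout:
    """Hypothetical stand-in for the workspace manager (name is an assumption)."""

    def __init__(self, root):
        self.root = Path(root)
        self.samples_dir = self.root / "samples"
        self.success_dir = self.root / "success"
        self.failed_dir = self.root / "failed"
        self.logs_and_lists_dir = self.root / "logs_and_lists"
        self._ensure_directories()

    def _ensure_directories(self):
        # Create every managed directory up front; existing ones are left alone.
        for directory in (self.samples_dir, self.success_dir,
                          self.failed_dir, self.logs_and_lists_dir):
            directory.mkdir(parents=True, exist_ok=True)


def move_sample(sample_dir, dest_root):
    """Move sample_dir under dest_root, replacing a directory of the same name."""
    sample_dir = Path(sample_dir)
    target = Path(dest_root) / sample_dir.name
    if target.is_dir():
        # Overwrite semantics: drop the stale copy so shutil.move does not
        # nest the new sample inside the old directory.
        shutil.rmtree(target)
    shutil.move(str(sample_dir), str(target))
    return target


def is_duplicate_sample(layout, sample_name):
    """Return True if the sample already exists in success_dir or samples_dir."""
    return any((directory / sample_name).is_dir()
               for directory in (layout.success_dir, layout.samples_dir))


def demo():
    """Extract-then-archive flow on a throwaway workspace."""
    with tempfile.TemporaryDirectory() as tmp:
        layout = WorkspaceLayout(Path(tmp) / "workspace")
        sample = layout.samples_dir / "tiny-gpt2"
        sample.mkdir()
        (sample / "model.py").write_text("# placeholder for an extracted graph\n")
        moved = move_sample(sample, layout.success_dir)
        return moved.parent.name, is_duplicate_sample(layout, "tiny-gpt2")
```

Removing the existing target before the move matters because shutil.move places the source inside the destination when the destination is already a directory, which would leave a nested copy instead of a replacement.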
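4. graph_net/agent/graph_extractor/subprocess_graph_extractor.py
Extraction output is isolated in the samples/ subdirectory:
- The subprocess environment sets GRAPH_NET_EXTRACT_WORKSPACE=workspace/samples/, so results no longer scatter across the workspace root.
- _get_workspace_path() returns the samples/ subdirectory by default.
Verification
Tested with a single model (sshleifer/tiny-gpt2) in CPU mode.
Result: 1 model, 100% success rate, and the sample was moved to success/ as expected.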
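For the subprocess isolation in item 4, a minimal sketch of the idea might look like the following. Only the variable name GRAPH_NET_EXTRACT_WORKSPACE and the samples/ default come from the description. The function names, their signatures, and the choice to let the environment variable override the default are assumptions made for illustration, and the child process here only echoes the variable rather than running a real extraction.

```python
import os
import subprocess
import sys
from pathlib import Path

ENV_VAR = "GRAPH_NET_EXTRACT_WORKSPACE"  # name taken from the PR description


def get_workspace_path(workspace_root, env=None):
    """Return the env override if set (an assumption), else <root>/samples."""
    env = os.environ if env is None else env
    override = env.get(ENV_VAR)
    return Path(override) if override else Path(workspace_root) / "samples"


def build_subprocess_env(workspace_root):
    """Copy the parent environment and point the child at <root>/samples."""
    env = dict(os.environ)
    env[ENV_VAR] = str(Path(workspace_root) / "samples")
    return env


def run_child(workspace_root):
    """Launch a child Python process that reports the workspace it sees."""
    code = f"import os; print(os.environ[{ENV_VAR!r}])"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=build_subprocess_env(workspace_root),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Passing a copied environment through the env argument keeps the parent process untouched, so only the extraction subprocess writes under samples/.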