| 1 | +# OpenDerisk ReActMasterV2 Issues Analysis |
| 2 | + |
| 3 | +## Executive Summary |
| 4 | + |
| 5 | +Three issues were identified in the ReActMasterV2 + vis_window3 system (two critical, one high): |
| 6 | + |
| 7 | +1. **Skill Loading Failure**: Skill metadata not loaded into agent prompt |
| 8 | +2. **Frontend Rendering Issue**: Running window not displaying task lists |
| 9 | +3. **Tool Call Truncation**: Agent generating pure text without function calls |
| 10 | + |
| 11 | +--- |
| 12 | + |
| 13 | +## Issue #1: Skill Not Loaded Into Prompt 🔴 CRITICAL |
| 14 | + |
| 15 | +### Root Cause |
| 16 | +**Skill Code Mismatch** |
| 17 | + |
| 18 | +**Configuration** (`rca_openrca_app.json`): |
| 19 | +```json |
| 20 | +{ |
| 21 | + "type": "skill(derisk)", |
| 22 | + "value": "{\"skill_name\":\"open_rca_diagnosis\",\"skillCode\":\"open-rca-diagnosis\"}" |
| 23 | +} |
| 24 | +``` |
| 25 | + |
| 26 | +**Database Reality** (`server_app_skill` table): |
| 27 | +```text |
| 28 | +skill_code: "open-rca-diagnosis-2-0-derisk-c5b0e208" |
| 29 | +name: "open_rca_diagnosis" |
| 30 | +``` |
| 31 | + |
| 32 | +### Impact |
| 33 | +- AgentSkillResource not initialized properly |
| 34 | +- Skill metadata not added to system prompt |
| 35 | +- Agent lacks domain knowledge and tools from skill |
| 36 | + |
| 37 | +### Evidence Chain |
| 38 | + |
| 39 | +1. **App Config** references `skillCode: "open-rca-diagnosis"` (with `skill_name: "open_rca_diagnosis"`) |
| 40 | +2. **Database Query** by `skill_code = "open-rca-diagnosis"` returns empty |
| 41 | +3. **Skill Loading** fails silently in `AgentSkillResource.__init__` |
| 42 | +4. **Prompt Generation** returns "No Skills provided" |
| 43 | + |
| 44 | +### Fix Required |
| 45 | + |
| 46 | +**Option A: Update App Configuration** (Recommended) |
| 47 | +```json |
| 48 | +{ |
| 49 | + "type": "skill(derisk)", |
| 50 | + "value": "{\"skill_name\":\"open_rca_diagnosis\",\"skillCode\":\"open-rca-diagnosis-2-0-derisk-c5b0e208\"}" |
| 51 | +} |
| 52 | +``` |
| 53 | + |
| 54 | +**Option B: Update Database** |
| 55 | +```sql |
| 56 | +UPDATE server_app_skill |
| 57 | +SET skill_code = 'open-rca-diagnosis' |
| 58 | +WHERE name = 'open_rca_diagnosis'; |
| 59 | +``` |
| 60 | + |
| 61 | +**Option C: Fix Skill Loader to Use `name` instead of `skill_code`** |
| 62 | + |
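Option C can be sketched as a lookup that tries `skill_code` first and falls back to `name`. This is a minimal illustration, not the actual loader: `find_skill` is a hypothetical helper, and the in-memory SQLite table only mirrors the two `server_app_skill` columns shown above.

```python
import json
import sqlite3


def find_skill(conn, skill_code, skill_name):
    """Return (skill_code, name) for the skill, or None if nothing matches."""
    row = conn.execute(
        "SELECT skill_code, name FROM server_app_skill WHERE skill_code = ?",
        (skill_code,),
    ).fetchone()
    if row is None and skill_name:
        # Fallback: the configured code may be stale, so try the name instead.
        row = conn.execute(
            "SELECT skill_code, name FROM server_app_skill WHERE name = ?",
            (skill_name,),
        ).fetchone()
    return row


# In-memory stand-in for the real table, seeded with the values shown above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE server_app_skill (skill_code TEXT, name TEXT)")
conn.execute(
    "INSERT INTO server_app_skill VALUES (?, ?)",
    ("open-rca-diagnosis-2-0-derisk-c5b0e208", "open_rca_diagnosis"),
)

# The app config stores the skill reference as a JSON string in "value".
value = json.loads('{"skill_name":"open_rca_diagnosis","skillCode":"open-rca-diagnosis"}')
resolved = find_skill(conn, value["skillCode"], value["skill_name"])
print(resolved)
```

Whichever option is chosen, a `None` result is probably worth logging as an error so that a future mismatch does not fail silently again.
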
| 63 | +--- |
| 64 | + |
| 65 | +## Issue #2: Frontend Running Window Not Displaying Tasks 🔴 CRITICAL |
| 66 | + |
| 67 | +### Symptom |
| 68 | +- The running window area shows no task lists |
| 69 | +- Model tasks and tool task lists are not visible |
| 70 | +- Task content is not rendered |
| 71 | + |
| 72 | +### Root Cause Analysis |
| 73 | + |
| 74 | +#### Backend Data Flow |
| 75 | +1. **vis_window3 Converter** (`derisk_vis_window3_converter.py`) |
| 76 | + - Generates `running_window` data with `WorkSpace` component |
| 77 | + - `WorkSpaceContent` structure: |
| 78 | + ```python |
| 79 | + { |
| 80 | + "uid": conv_session_id, |
| 81 | + "type": "INCR", |
| 82 | + "running_agents": [...], |
| 83 | + "items": [FolderNode...], # Task items |
| 84 | + "explorer": "..." # Agent folder tree |
| 85 | + } |
| 86 | + ``` |
| 87 | + |
| 88 | +2. **Data Generation** (`gen_work_item()`) |
| 89 | + - Creates `FolderNode` for each action/LLM call |
| 90 | + - Task types: `llm`, `tool`, `code`, `report`, `knowledge` |
| 91 | + - Each item has: `uid`, `title`, `description`, `status`, `markdown` |
| 92 | + |
| 93 | +#### Frontend Rendering Logic |
| 94 | + |
| 95 | +**VisRunningWindow Component** (`index.tsx`): |
| 96 | +```typescript |
| 97 | +interface RunningAgent { |
| 98 | + items?: any[]; // Task items |
| 99 | + agent_name?: string; |
| 100 | + markdown?: string; |
| 101 | +} |
| 102 | + |
| 103 | +// Expected data shape (the type name is illustrative): |
| 104 | +type RunningWindowData = { |
| 105 | +  items: RunningAgent[];  // Array of agent tasks |
| 106 | +  running_agent: string | string[]; |
| 107 | +}; |
| 108 | +``` |
| 109 | + |
| 110 | +**Key Rendering Logic**: |
| 111 | +```typescript |
| 112 | +const runningAgents = keyBy(dataItems, 'agent_name'); |
| 113 | +// dataItems = data.items |
| 114 | +// Each item should have: agent_name, items array |
| 115 | + |
| 116 | +const hasItems = Array.isArray(items) && items.length > 0; |
| 117 | + |
| 118 | +// If hasItems, render CoderWindow with tabs |
| 119 | +// Otherwise render markdown directly |
| 120 | +``` |
| 121 | + |
| 122 | +### Hypothesis |
| 123 | + |
| 124 | +**Problem**: Backend sends `WorkSpaceContent.items` as `FolderNode[]`, but frontend expects `RunningAgent[]` |
| 125 | + |
| 126 | +**Mismatch**: |
| 127 | +- Backend sends: `{ items: [FolderNode{uid, title, task_type, markdown}] }` |
| 128 | +- Frontend expects: `{ items: [RunningAgent{agent_name, items: [...]}] }` |
| 129 | + |
| 130 | +### Investigation Needed |
| 131 | + |
| 132 | +1. Capture the actual API response structure |
| 133 | +2. Verify the transformation in the frontend's `parse-vis.ts` |
| 134 | +3. Check whether the vis_window3 converter emits the wrong data format |
| 135 | + |
| 136 | +### Potential Fix |
| 137 | + |
| 138 | +**Backend needs to transform**: |
| 139 | +```python |
| 140 | +# Current (suspected mismatch): flat list of task nodes |
| 141 | +work_space_content = WorkSpaceContent(items=folder_nodes) |
| 142 | + |
| 143 | +# Proposed: wrap the task nodes in one RunningAgent per agent |
| 144 | +work_space_content = WorkSpaceContent( |
| 145 | +    items=[ |
| 146 | +        RunningAgent( |
| 147 | +            agent_name=agent_name, |
| 148 | +            avatar=agent_avatar, |
| 149 | +            items=folder_nodes,  # Actual tasks (list of FolderNode) |
| 150 | +        ) |
| 151 | +    ] |
| 152 | +) |
| 153 | +``` |
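If the hypothesis holds, the reshaping amounts to grouping flat task nodes by agent. The sketch below uses plain dicts rather than the real `FolderNode` and `RunningAgent` classes; `group_by_agent` is a hypothetical helper, and the `agent_name` key on each node is an assumption, since the fields listed above do not include it.

```python
from collections import defaultdict


def group_by_agent(folder_nodes):
    """Wrap flat task nodes into one {agent_name, items} entry per agent."""
    grouped = defaultdict(list)
    for node in folder_nodes:
        # "agent_name" on the node is an assumption made for this sketch.
        grouped[node.get("agent_name", "unknown")].append(node)
    return [{"agent_name": name, "items": items} for name, items in grouped.items()]


nodes = [
    {"uid": "1", "title": "plan", "task_type": "llm", "agent_name": "ReActMasterV2"},
    {"uid": "2", "title": "query metrics", "task_type": "tool", "agent_name": "ReActMasterV2"},
]

# Shape the frontend is described as expecting: items is a list of agents,
# each carrying its own items array of tasks.
payload = {"items": group_by_agent(nodes)}
print(payload)
```

With this shape, the frontend's `keyBy(dataItems, 'agent_name')` would have a key to index on, and each entry's `items` array would be non-empty.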
| 154 | + |
| 155 | +--- |
| 156 | + |
| 157 | +## Issue #3: Agent Generating Pure Text Without Tool Calls 🟡 HIGH |
| 158 | + |
| 159 | +### Symptom |
| 160 | +- Multiple pure text outputs without tool calls |
| 161 | +- Agent not executing tools as expected |
| 162 | +- Truncated responses |
| 163 | + |
| 164 | +### Root Cause Analysis |
| 165 | + |
| 166 | +#### ReActReasoningAgent Flow |
| 167 | + |
| 168 | +1. **think()** - LLM generation with function calling |
| 169 | + ```python |
| 170 | + response = await self.llm_client.generate( |
| 171 | + messages=messages, |
| 172 | + tools=tools, # Tool definitions |
| 173 | + tool_choice="auto" |
| 174 | + ) |
| 175 | + ``` |
| 176 | + |
| 177 | +2. **decide()** - Check for tool calls |
| 178 | + ```python |
| 179 | + if response.tool_calls: |
| 180 | + return Decision(type=TOOL_CALL, ...) |
| 181 | + else: |
| 182 | + return Decision(type=RESPONSE, content=response.content) |
| 183 | + ``` |
| 184 | + |
| 185 | +### Possible Causes |
| 186 | + |
| 187 | +#### A. System Prompt Issues |
| 188 | +```python |
| 189 | +REACT_REASONING_SYSTEM_PROMPT = """... |
| 190 | +## 立即行动 |
| 191 | +现在请调用工具开始执行任务!不要只是思考或总结。 |
| 192 | +""" |
| 193 | +``` |
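| 194 | +- The closing section (in English: "Act now: call a tool now to start executing the task! Do not just think or summarize.") may not be forceful enough to trigger a tool call |
| 195 | +- Resource/skill prompts are missing (see Issue #1) |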
| 196 | + |
| 197 | +#### B. LLM Model Issues |
| 198 | +- **Model**: `DeepSeek-V3` (from app config) |
| 199 | +- The configured model endpoint may not return function calls reliably; verify with a minimal tool-call request |
| 200 | +- May need explicit tool selection instructions |
| 201 | + |
| 202 | +#### C. Tool Definition Issues |
| 203 | +```python |
| 204 | +def _build_tool_definitions(self): |
| 205 | +    """Return OpenAI-compatible tool schemas. |
| 206 | +    To verify: every tool is defined with a valid schema.""" |
| 207 | +``` |
| 208 | + |
| 209 | +#### D. Missing Resource Prompts |
| 210 | +```python |
| 211 | +async def _build_resource_prompt(self): |
| 212 | +    """Build the resource prompt from: |
| 213 | +    - available_agents |
| 214 | +    - available_knowledges |
| 215 | +    - available_skills  ⬅️ THIS IS EMPTY due to Issue #1 |
| 216 | +    - other_resources""" |
| 217 | +``` |
| 218 | + |
| 219 | +**Key Issue**: Because the skill is not loaded (Issue #1), the agent lacks: |
| 220 | +- Domain knowledge |
| 221 | +- Specialized tools |
| 222 | +- Context for what tools to use |
| 223 | + |
| 224 | +### Impact Chain |
| 225 | + |
| 226 | +1. **Skill not loaded** → Empty resource prompt |
| 227 | +2. **Agent doesn't know available tools** → Generates generic text |
| 228 | +3. **No tool selection guidance** → Avoids calling tools |
| 229 | +4. **Result**: Pure text responses |
| 230 | + |
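To tell causes A through D apart in practice, a small classifier over each LLM turn can help. This is only a sketch under assumptions: `diagnose_text_only` is a hypothetical helper, and its three inputs stand for the response's tool calls, the tool definitions sent with the request, and the built resource prompt.

```python
def diagnose_text_only(tool_calls, tools, resource_prompt):
    """Return a short label for why a response carried no tool calls."""
    if tool_calls:
        return "ok"
    if not tools:
        # Cause C: nothing was offered to the model.
        return "no tool definitions sent"
    if not resource_prompt.strip():
        # Cause D: consistent with the skill not being loaded (Issue #1).
        return "empty resource prompt"
    # Tools and resources were present, so suspect causes A or B.
    return "model or system prompt"


print(diagnose_text_only([], [{"name": "query_metrics"}], ""))
```

Logging this label next to each pure-text response should show whether fixing Issue #1 alone removes most occurrences.
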
| 231 | +--- |
| 232 | + |
| 233 | +## Recommended Fix Priority |
| 234 | + |
| 235 | +### Priority 1: Fix Skill Loading (Issue #1) |
| 236 | +**Impact**: Fixes both #1 and potentially #3 |
| 237 | + |
| 238 | +**Steps**: |
| 239 | +1. Update skill_code in app config or database |
| 240 | +2. Verify AgentSkillResource initialization |
| 241 | +3. Test skill prompt appears in system prompt |
| 242 | + |
| 243 | +### Priority 2: Fix Frontend Data Format (Issue #2) |
| 244 | +**Impact**: Restores task visibility |
| 245 | + |
| 246 | +**Steps**: |
| 247 | +1. Verify backend data structure matches frontend expectations |
| 248 | +2. Check vis_window3 converter output |
| 249 | +3. Update data transformation if needed |
| 250 | + |
| 251 | +### Priority 3: Verify Tool Calling (Issue #3) |
| 252 | +**Impact**: Agent behavior |
| 253 | + |
| 254 | +**Steps**: |
| 255 | +1. Verify LLM model supports function calling |
| 256 | +2. Check tool definitions are correct |
| 257 | +3. Ensure system prompt is clear about tool usage |
| 258 | +4. Fix skill loading first (may resolve this) |
| 259 | + |
| 260 | +--- |
| 261 | + |
| 262 | +## Testing Checklist |
| 263 | + |
| 264 | +### Issue #1 Test |
| 265 | +```bash |
| 266 | +# 1. Check database skill |
| 267 | +sqlite3 pilot/meta_data/derisk.db "SELECT skill_code, name FROM server_app_skill WHERE name='open_rca_diagnosis';" |
| 268 | + |
| 269 | +# 2. Check app config |
| 270 | +grep -A5 "skill_name" packages/derisk-serve/src/derisk_serve/building/app/service/derisk_app_define/rca_openrca_app.json |
| 271 | + |
| 272 | +# 3. Check agent initialization logs |
| 273 | +# Look for: "[ReActReasoningAgent] 资源预加载完成: tools_count=X, resource_prompt_len=Y" (English: "resource preloading complete") |
| 274 | +``` |
| 275 | + |
| 276 | +### Issue #2 Test |
| 277 | +```bash |
| 278 | +# 1. Start app and check running window data |
| 279 | +curl http://localhost:7777/api/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{"user_input":"test","conv_uid":"test123"}' |
| 280 | + |
| 281 | +# 2. Check frontend console for parse errors |
| 282 | +# 3. Inspect vis_window3 API response structure |
| 283 | +``` |
| 284 | + |
| 285 | +### Issue #3 Test |
| 286 | +```python |
| 287 | +# In agent logs, check: |
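| 288 | +# - "调用 LLM: 消息数=X, 工具数=Y" (English: "Calling LLM: message count=X, tool count=Y") |
| 289 | +# - "LLM 返回纯文本回答,任务可能已完成" (English: "LLM returned a plain-text answer; the task may be complete"; early termination warning) |
| 290 | +# - "工具调用: {tool_name}" (English: "Tool call: {tool_name}"; should appear once per tool call) |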
| 291 | +``` |
| 292 | + |
| 293 | +--- |
| 294 | + |
| 295 | +## Next Steps |
| 296 | + |
| 297 | +1. **Immediate**: Fix skill_code mismatch in app config |
| 298 | +2. **Verify**: Check if Issue #3 resolves after fixing Issue #1 |
| 299 | +3. **Debug**: Capture actual API response for Issue #2 |
| 300 | +4. **Monitor**: Check agent logs for tool call patterns |