CRITICAL FIX: strip ALL custom agent heuristics, return to original simplicity
PROBLEM: Our 'improvements' were BREAKING the agent:
- Error classification injected [SYSTEM GUIDANCE] into tool output → confused model
- Fake [SYSTEM: Continue executing...] user messages → polluted conversation
- Loop detection (Level 1+2) → killed legitimate retries (pip→pip3)
- Consecutive error detection → injected a nudge after 3 all-error iterations, broke the loop after 5
- Text-only continuation logic → injected fake messages that confused model
- Per-tool error tracking → added noise to tool results
WHAT THE ORIGINAL DOES (and what we now match):
- Simple loop: call LLM → execute tools → repeat
- NO error injection — raw tool output goes to model
- NO loop detection — trusts MAX_ITERATIONS (200) as safety net
- NO fake system messages — clean conversation history
- Model decides when to stop by not calling tools
- Only added safety (not in the original): exact same tool+args repeated 5x = break
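
The loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in agent.ts: `runSimpleLoop`, the `Sketch*` types, and the synchronous `callLLM` / `executeTool` callbacks are assumptions made for the example (the real loop is async and streams results over SSE). Only the MAX_ITERATIONS value of 200 comes from this commit.

```typescript
// Hypothetical shapes for illustration; the real types in agent.ts are not shown here.
type SketchToolCall = { id: string; name: string; args: string };
type SketchResult = { text: string; toolCalls: SketchToolCall[] };
type SketchMessage = { role: "user" | "assistant" | "tool"; content: string };

const MAX_ITERATIONS = 200; // safety-net value taken from the commit message

function runSimpleLoop(
  history: SketchMessage[],
  callLLM: (history: SketchMessage[]) => SketchResult,
  executeTool: (call: SketchToolCall) => string,
): SketchMessage[] {
  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const result = callLLM(history);
    history.push({ role: "assistant", content: result.text });

    // The model decides when to stop: no tool calls means the turn is over.
    if (result.toolCalls.length === 0) break;

    for (const call of result.toolCalls) {
      let output: string;
      try {
        output = executeTool(call);
      } catch (err) {
        // Raw error text goes back to the model with nothing appended.
        output = err instanceof Error ? err.message : String(err);
      }
      history.push({ role: "tool", content: output });
    }
  }
  return history;
}
```

In this shape a failing tool never ends the loop by itself; the error text simply becomes the next tool result, and the model either retries with different arguments or answers without tool calls.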
REMOVED:
- [SYSTEM GUIDANCE: ...] injection in tool errors
- [SYSTEM: Continue executing...] fake user messages
- consecutiveTextOnly counter + continuation logic
- consecutiveToolErrors counter + break logic
- allToolsErrored tracking
- toolErrorCounts map + TOOL_ERROR_THRESHOLD
- Level 2 loop detection (same tool name)
- prevIterationAllErrored tracking
- todoLists import (no longer needed)
SIMPLIFIED system prompt:
- Removed aggressive CRITICAL ERROR HANDLING RULES
- Removed aggressive MULTI-STEP EXECUTION RULES
- Added simple ERROR HANDLING + EXECUTION STYLE sections
- No mention of [SYSTEM GUIDANCE] (we don't inject it anymore)
KEPT (good improvements):
- Interleaved content segments (UX only, no agent logic)
- JSON repair for malformed tool args
- Empty response retry
- 429/5xx infinite retry
- Auto-compaction on context overflow
- Proactive compaction at 80%
- ask_user break
- Minimal exact-match loop detection (5 repeats)
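
The exact-match check kept by this commit can be isolated as a small helper. The sketch below follows the signature format used in the diff (`name:arguments` for each call, joined by `|`) and the same sliding window of 5; `createExactRepeatDetector`, `toolSignature`, and `ToolCallLike` are hypothetical names, since the real code keeps this logic inline in `runAgentLoop`.

```typescript
// Hypothetical minimal shape, mirroring the tc.function.name / tc.function.arguments access in the diff.
type ToolCallLike = { function: { name: string; arguments: string } };

const MAX_EXACT_REPEATS = 5;

// One signature per iteration: every tool name plus its raw argument string.
function toolSignature(calls: ToolCallLike[]): string {
  return calls.map((tc) => `${tc.function.name}:${tc.function.arguments}`).join("|");
}

// Returns a function that records one iteration and reports whether the
// last `maxRepeats` iterations all had an identical signature.
function createExactRepeatDetector(maxRepeats: number = MAX_EXACT_REPEATS) {
  const recent: string[] = [];
  return (calls: ToolCallLike[]): boolean => {
    recent.push(toolSignature(calls));
    if (recent.length > maxRepeats) recent.shift(); // keep a sliding window
    return recent.length >= maxRepeats && recent.every((sig) => sig === recent[0]);
  };
}
```

Because the arguments are part of the signature, a retry such as `pip install X` followed by `pip3 install X` produces two different signatures and never trips the detector; only a call repeated verbatim five iterations in a row does.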
- server/runtime/agent.ts +25 -174
- server/runtime/system-prompt.ts +8 -18
|
@@ -5,7 +5,7 @@
|
|
| 5 |
|
| 6 |
import { ENV } from "../_core/env";
|
| 7 |
import { buildSystemPrompt, TOOL_DEFINITIONS } from "./system-prompt";
|
| 8 |
-
import { executeTool, getPlanMode, runPreToolHooks, runPostToolHooks, initializeMcpFromConfig, getMcpManager
|
| 9 |
import { compactSession, compactSessionWithLLM, shouldCompact, estimateSessionTokens, dbMessagesToSession, DEFAULT_COMPACTION_CONFIG } from "./compact";
|
| 10 |
import type { Session, ConversationMessage as CompactMessage, CompactionConfig } from "./compact";
|
| 11 |
import { UsageTracker, pricingForModel, defaultSonnetTierPricing, estimateCostUsdWithPricing, totalCostUsd, formatUsd, summaryLinesForModel } from "./usage";
|
|
@@ -410,21 +410,11 @@ export async function runAgentLoop(
|
|
| 410 |
const assistantMessages: AgentMessage[] = [];
|
| 411 |
const toolResultMessages: AgentMessage[] = [];
|
| 412 |
|
| 413 |
-
// ─── Loop detection:
|
| 414 |
-
//
|
| 415 |
-
//
|
| 416 |
-
const
|
| 417 |
-
const
|
| 418 |
-
let consecutiveTextOnly = 0; // count consecutive text-only responses (no tool calls)
|
| 419 |
-
let consecutiveToolErrors = 0; // count consecutive iterations where ALL tool calls return errors
|
| 420 |
-
const MAX_CONSECUTIVE_TOOL_ERRORS = 5; // break after 5 iterations of all-error tool calls — give model room to solve problems
|
| 421 |
-
let prevIterationAllErrored = false; // track if PREVIOUS iteration had all tools errored
|
| 422 |
-
|
| 423 |
-
// ─── Per-tool error tracking: detect when same tool keeps failing ─────
|
| 424 |
-
// Track how many times each tool has errored. If a tool errors 2+ times,
|
| 425 |
-
// inject guidance telling the model to STOP retrying that tool.
|
| 426 |
-
const toolErrorCounts: Map<string, number> = new Map();
|
| 427 |
-
const TOOL_ERROR_THRESHOLD = 3; // after 3 errors for same tool, inject stronger guidance
|
| 428 |
|
| 429 |
// ─── MCP Tools Dynamic Injection (matches original claw-code) ──────────
|
| 430 |
// Initialize MCP servers from config and merge discovered tools with static TOOL_DEFINITIONS.
|
|
@@ -745,40 +735,10 @@ export async function runAgentLoop(
|
|
| 745 |
conversationMessages.push(assistantMessage);
|
| 746 |
assistantMessages.push(assistantMessage);
|
| 747 |
|
| 748 |
-
// If no tool calls,
|
| 749 |
if (!result.toolCalls || result.toolCalls.length === 0) {
|
| 750 |
-
consecutiveTextOnly++;
|
| 751 |
-
|
| 752 |
-
// ─── Multi-step continuation logic ───────────────────────────
|
| 753 |
-
// Don't break on the first text-only response if there's work to do.
|
| 754 |
-
// The model often generates an intermediate text like "plan created,
|
| 755 |
-
// now starting execution" between planning and doing.
|
| 756 |
-
const currentTodos = todoLists.get(sessionId) || [];
|
| 757 |
-
const hasPendingTodos = currentTodos.some(t => t.status === "pending" || t.status === "in_progress");
|
| 758 |
-
const currentPlan = getPlanMode(sessionId);
|
| 759 |
-
const hasPendingPlan = currentPlan.active && currentPlan.steps.some(
|
| 760 |
-
(s: any) => s.status === "pending" || s.status === "in_progress"
|
| 761 |
-
);
|
| 762 |
-
const hasWorkToDo = hasPendingTodos || hasPendingPlan;
|
| 763 |
-
|
| 764 |
-
if (hasWorkToDo && consecutiveTextOnly <= 2) {
|
| 765 |
-
// Model said something but there are pending tasks — inject a nudge
|
| 766 |
-
// to continue executing, then loop again
|
| 767 |
-
console.info(`[agent] Text-only response but ${hasPendingTodos ? 'pending todos' : 'active plan'} exist — injecting continuation prompt (attempt ${consecutiveTextOnly})`);
|
| 768 |
-
conversationMessages.push({
|
| 769 |
-
role: "user",
|
| 770 |
-
content: "[SYSTEM: You have pending tasks in your plan. Continue executing the next step now using the appropriate tool. Do NOT just describe what you will do — actually DO it by calling tools.]" as string,
|
| 771 |
-
});
|
| 772 |
-
continue; // Don't break — go back to the top of the loop
|
| 773 |
-
}
|
| 774 |
-
|
| 775 |
-
// No pending work, or too many text-only responses — stop
|
| 776 |
-
if (consecutiveTextOnly >= 3) {
|
| 777 |
-
console.warn(`[agent] Text-only loop detected: ${consecutiveTextOnly} consecutive text responses without tool calls — breaking`);
|
| 778 |
-
sendSSE(res, "error", {
|
| 779 |
-
message: `⚠️ the model got stuck in a loop of text-only responses. try rephrasing the request`,
|
| 780 |
-
});
|
| 781 |
-
}
|
| 782 |
sendSSE(res, "message_end", {
|
| 783 |
promptTokens: totalPromptTokens,
|
| 784 |
completionTokens: totalCompletionTokens,
|
|
@@ -787,26 +747,22 @@ export async function runAgentLoop(
|
|
| 787 |
});
|
| 788 |
break;
|
| 789 |
}
|
| 790 |
-
consecutiveTextOnly = 0; // reset on tool call
|
| 791 |
|
| 792 |
-
// ───
|
| 793 |
-
//
|
| 794 |
-
//
|
| 795 |
-
//
|
| 796 |
const currentToolSig = result.toolCalls.map((tc: any) => `${tc.function.name}:${tc.function.arguments}`).join("|");
|
| 797 |
-
|
| 798 |
-
|
| 799 |
-
|
| 800 |
-
recentTextResponses.shift();
|
| 801 |
}
|
| 802 |
-
|
| 803 |
-
|
| 804 |
-
if (recentTextResponses.length >= MAX_REPEAT_BEFORE_BREAK) {
|
| 805 |
-
const allSame = recentTextResponses.every(r => r === recentTextResponses[0]);
|
| 806 |
if (allSame) {
|
| 807 |
-
console.warn(`[agent]
|
| 808 |
sendSSE(res, "error", {
|
| 809 |
-
message: `⚠️ detected
|
| 810 |
});
|
| 811 |
sendSSE(res, "message_end", {
|
| 812 |
promptTokens: totalPromptTokens,
|
|
@@ -818,18 +774,12 @@ export async function runAgentLoop(
|
|
| 818 |
}
|
| 819 |
}
|
| 820 |
|
| 821 |
-
// Level 2: REMOVED — was too aggressive.
|
| 822 |
-
// Using bash with different args is NORMAL problem-solving behavior:
|
| 823 |
-
// bash("pip install X") → error → bash("pip3 install X") → success
|
| 824 |
-
// Level 1 (exact match) + consecutive error counter are sufficient.
|
| 825 |
-
|
| 826 |
// ─── Execute tool calls ──────────────────────────────────────────
|
| 827 |
// Bug #2+#3 fix: Each tool call is wrapped in its own try-catch.
|
| 828 |
// Original claw-code sends tool errors back to LLM as tool results,
|
| 829 |
// letting the model decide how to handle them. We NEVER break the
|
| 830 |
// loop on a tool error — only on fatal API/stream errors.
|
| 831 |
let shouldBreakForAskUser = false;
|
| 832 |
-
let allToolsErrored = true; // track if ALL tool calls in this iteration errored
|
| 833 |
for (const toolCall of result.toolCalls) {
|
| 834 |
const toolName = toolCall.function.name;
|
| 835 |
let toolArgs: Record<string, unknown> = {};
|
|
@@ -908,75 +858,9 @@ export async function runAgentLoop(
|
|
| 908 |
isError = true;
|
| 909 |
}
|
| 910 |
|
| 911 |
-
//
|
| 912 |
-
|
| 913 |
-
|
| 914 |
-
// Reset error count for this tool on success
|
| 915 |
-
toolErrorCounts.delete(toolName);
|
| 916 |
-
} else {
|
| 917 |
-
// ─── FIX RC4: Error classification + guidance injection ─────────
|
| 918 |
-
// Classify the error and inject guidance so the model knows whether to retry
|
| 919 |
-
const errorCount = (toolErrorCounts.get(toolName) || 0) + 1;
|
| 920 |
-
toolErrorCounts.set(toolName, errorCount);
|
| 921 |
-
|
| 922 |
-
// ─── Smart error guidance: help the model SOLVE the problem ─────
|
| 923 |
-
// Instead of binary "stop" vs "retry", give the model actionable hints
|
| 924 |
-
// about what went wrong and how to fix it. The model should ALWAYS
|
| 925 |
-
// try to find a solution — stopping is the LAST resort.
|
| 926 |
-
const lowerOutput = toolOutput.toLowerCase();
|
| 927 |
-
|
| 928 |
-
// Only inject guidance for bash errors (other tools have clear errors)
|
| 929 |
-
if (toolName === 'bash') {
|
| 930 |
-
// Common fixable patterns with specific solutions
|
| 931 |
-
const bashFixes: Array<{ pattern: string; guidance: string }> = [
|
| 932 |
-
// Package manager aliases
|
| 933 |
-
{ pattern: 'pip: not found', guidance: 'Use `pip3` instead of `pip`. This environment has Python 3 with pip3.' },
|
| 934 |
-
{ pattern: 'python: not found', guidance: 'Use `python3` instead of `python`. This environment has Python 3.' },
|
| 935 |
-
{ pattern: 'node: not found', guidance: 'Node.js should be available. Try `which node` or install with `sudo apt-get install -y nodejs`.' },
|
| 936 |
-
// Permission issues
|
| 937 |
-
{ pattern: 'permission denied', guidance: 'Try with `sudo` prefix, or check file permissions with `ls -la`.' },
|
| 938 |
-
{ pattern: 'eacces', guidance: 'Permission error. Use `sudo` or fix permissions with `chmod`.' },
|
| 939 |
-
// Syntax issues
|
| 940 |
-
{ pattern: 'syntax error', guidance: 'Shell syntax error. Check for unmatched quotes, brackets, or special characters. Wrap complex strings in single quotes.' },
|
| 941 |
-
{ pattern: 'unexpected token', guidance: 'Shell parsing error. Likely unescaped special characters. Wrap arguments in quotes.' },
|
| 942 |
-
{ pattern: 'bad substitution', guidance: 'Variable substitution error. Use double quotes for variables or escape `$` with `\\$`.' },
|
| 943 |
-
// Missing tools
|
| 944 |
-
{ pattern: 'apt-get: not found', guidance: 'Try `sudo apt-get update && sudo apt-get install -y <package>` or use `apt` instead.' },
|
| 945 |
-
{ pattern: 'curl: not found', guidance: 'Install curl: `sudo apt-get update && sudo apt-get install -y curl`' },
|
| 946 |
-
{ pattern: 'wget: not found', guidance: 'Install wget: `sudo apt-get update && sudo apt-get install -y wget`, or use `curl` instead.' },
|
| 947 |
-
{ pattern: 'git: not found', guidance: 'Install git: `sudo apt-get update && sudo apt-get install -y git`' },
|
| 948 |
-
{ pattern: 'nmap: not found', guidance: 'Install nmap: `sudo apt-get update && sudo apt-get install -y nmap`' },
|
| 949 |
-
{ pattern: 'command not found', guidance: 'The command is not installed. Try installing it with `sudo apt-get install -y <package>` or use an alternative tool. For Python packages use `pip3 install <package>`.' },
|
| 950 |
-
{ pattern: 'not found', guidance: 'Command not found. Check spelling, try the full path, or install the missing package with `sudo apt-get install -y <package>`.' },
|
| 951 |
-
// Network issues
|
| 952 |
-
{ pattern: 'could not resolve host', guidance: 'DNS resolution failed. Check the hostname spelling or try a different DNS server.' },
|
| 953 |
-
{ pattern: 'connection refused', guidance: 'The target refused the connection. Check if the service is running and the port is correct.' },
|
| 954 |
-
{ pattern: 'connection timed out', guidance: 'Network timeout. The host may be unreachable. Try again or check the URL.' },
|
| 955 |
-
];
|
| 956 |
-
|
| 957 |
-
let guidanceAdded = false;
|
| 958 |
-
for (const fix of bashFixes) {
|
| 959 |
-
if (lowerOutput.includes(fix.pattern)) {
|
| 960 |
-
toolOutput += `\n\n[SYSTEM GUIDANCE: ${fix.guidance} Fix the command and continue with the task.]`;
|
| 961 |
-
guidanceAdded = true;
|
| 962 |
-
break;
|
| 963 |
-
}
|
| 964 |
-
}
|
| 965 |
-
|
| 966 |
-
// Generic bash error guidance if no specific fix matched
|
| 967 |
-
if (!guidanceAdded && errorCount >= 2) {
|
| 968 |
-
toolOutput += `\n\n[SYSTEM GUIDANCE: This command has failed ${errorCount} times. Analyze the error, try a different command or approach. Do NOT give up — find a solution.]`;
|
| 969 |
-
}
|
| 970 |
-
} else {
|
| 971 |
-
// Non-bash tool errors
|
| 972 |
-
const isFileMissing = lowerOutput.includes('enoent') || lowerOutput.includes('does not exist') || lowerOutput.includes('no such file');
|
| 973 |
-
if (isFileMissing) {
|
| 974 |
-
toolOutput += `\n\n[SYSTEM GUIDANCE: The file/path does not exist. Try a different path, create the file first, or use glob_search/grep_search to find the correct location.]`;
|
| 975 |
-
} else if (errorCount >= TOOL_ERROR_THRESHOLD) {
|
| 976 |
-
toolOutput += `\n\n[SYSTEM GUIDANCE: Tool '${toolName}' has failed ${errorCount} times. Try a completely different approach.]`;
|
| 977 |
-
}
|
| 978 |
-
}
|
| 979 |
-
}
|
| 980 |
|
| 981 |
sendSSE(res, "tool_result", {
|
| 982 |
toolCallId: toolCall.id,
|
|
@@ -1018,41 +902,8 @@ export async function runAgentLoop(
|
|
| 1018 |
toolResultMessages.push(toolResultMsg);
|
| 1019 |
}
|
| 1020 |
|
| 1021 |
-
//
|
| 1022 |
-
//
|
| 1023 |
-
// After MAX_CONSECUTIVE_TOOL_ERRORS iterations of all-error results,
|
| 1024 |
-
// the model is stuck and can't solve the task — break the loop.
|
| 1025 |
-
// Save current iteration's error state for next iteration's Level 2 detection
|
| 1026 |
-
prevIterationAllErrored = allToolsErrored;
|
| 1027 |
-
|
| 1028 |
-
if (allToolsErrored) {
|
| 1029 |
-
consecutiveToolErrors++;
|
| 1030 |
-
console.warn(`[agent] All tool calls errored — consecutive error count: ${consecutiveToolErrors}/${MAX_CONSECUTIVE_TOOL_ERRORS}`);
|
| 1031 |
-
|
| 1032 |
-
// At halfway point, inject a strong nudge to change approach
|
| 1033 |
-
if (consecutiveToolErrors === 3) {
|
| 1034 |
-
conversationMessages.push({
|
| 1035 |
-
role: "user",
|
| 1036 |
-
content: "[SYSTEM: Your last 3 attempts ALL failed. You MUST change your approach NOW. Try a completely different tool, method, or strategy. If a command is not found, install it or use an alternative. Do NOT repeat similar failing commands.]",
|
| 1037 |
-
});
|
| 1038 |
-
}
|
| 1039 |
-
|
| 1040 |
-
if (consecutiveToolErrors >= MAX_CONSECUTIVE_TOOL_ERRORS) {
|
| 1041 |
-
console.error(`[agent] ${MAX_CONSECUTIVE_TOOL_ERRORS} consecutive iterations with all tool errors — breaking loop`);
|
| 1042 |
-
sendSSE(res, "error", {
|
| 1043 |
-
message: `⚠️ the model cannot complete the task: ${MAX_CONSECUTIVE_TOOL_ERRORS} attempts in a row ended in errors. try rephrasing the request or starting a new session`,
|
| 1044 |
-
});
|
| 1045 |
-
sendSSE(res, "message_end", {
|
| 1046 |
-
promptTokens: totalPromptTokens,
|
| 1047 |
-
completionTokens: totalCompletionTokens,
|
| 1048 |
-
cost: totalCost,
|
| 1049 |
-
model: apiConfig.model,
|
| 1050 |
-
});
|
| 1051 |
-
break;
|
| 1052 |
-
}
|
| 1053 |
-
} else {
|
| 1054 |
-
consecutiveToolErrors = 0; // reset on successful tool execution
|
| 1055 |
-
}
|
| 1056 |
|
| 1057 |
// ─── Bug #4 fix: Break loop after ask_user to wait for user ───
|
| 1058 |
if (shouldBreakForAskUser) {
|
|
|
|
| 5 |
|
| 6 |
import { ENV } from "../_core/env";
|
| 7 |
import { buildSystemPrompt, TOOL_DEFINITIONS } from "./system-prompt";
|
| 8 |
+
import { executeTool, getPlanMode, runPreToolHooks, runPostToolHooks, initializeMcpFromConfig, getMcpManager } from "../tools/executor";
|
| 9 |
import { compactSession, compactSessionWithLLM, shouldCompact, estimateSessionTokens, dbMessagesToSession, DEFAULT_COMPACTION_CONFIG } from "./compact";
|
| 10 |
import type { Session, ConversationMessage as CompactMessage, CompactionConfig } from "./compact";
|
| 11 |
import { UsageTracker, pricingForModel, defaultSonnetTierPricing, estimateCostUsdWithPricing, totalCostUsd, formatUsd, summaryLinesForModel } from "./usage";
|
|
|
|
| 410 |
const assistantMessages: AgentMessage[] = [];
|
| 411 |
const toolResultMessages: AgentMessage[] = [];
|
| 412 |
|
| 413 |
+
// ─── Loop detection: minimal safety net ─────────────────────────────
|
| 414 |
+
// Only detect EXACT same tool+args repeated 5+ times (true infinite loop).
|
| 415 |
+
// Everything else is handled by MAX_ITERATIONS.
|
| 416 |
+
const recentToolSignatures: string[] = [];
|
| 417 |
+
const MAX_EXACT_REPEATS = 5;
|
| 418 |
|
| 419 |
// ─── MCP Tools Dynamic Injection (matches original claw-code) ──────────
|
| 420 |
// Initialize MCP servers from config and merge discovered tools with static TOOL_DEFINITIONS.
|
|
|
|
| 735 |
conversationMessages.push(assistantMessage);
|
| 736 |
assistantMessages.push(assistantMessage);
|
| 737 |
|
| 738 |
+
// If no tool calls, we're done — the LLM has finished responding.
|
| 739 |
+
// This matches the original claw-code behavior exactly:
|
| 740 |
+
// the model decides when to stop by not calling tools.
|
| 741 |
if (!result.toolCalls || result.toolCalls.length === 0) {
|
| 742 |
sendSSE(res, "message_end", {
|
| 743 |
promptTokens: totalPromptTokens,
|
| 744 |
completionTokens: totalCompletionTokens,
|
|
|
|
| 747 |
});
|
| 748 |
break;
|
| 749 |
}
|
|
|
|
| 750 |
|
| 751 |
+
// ─── Minimal loop detection: only catch TRUE infinite loops ───────
|
| 752 |
+
// Only break if the EXACT same tool+args is repeated 5+ times.
|
| 753 |
+
// This is the only safety net beyond MAX_ITERATIONS.
|
| 754 |
+
// The original claw-code has NO loop detection at all — it trusts the model.
|
| 755 |
const currentToolSig = result.toolCalls.map((tc: any) => `${tc.function.name}:${tc.function.arguments}`).join("|");
|
| 756 |
+
recentToolSignatures.push(currentToolSig);
|
| 757 |
+
if (recentToolSignatures.length > MAX_EXACT_REPEATS) {
|
| 758 |
+
recentToolSignatures.shift();
|
|
|
|
| 759 |
}
|
| 760 |
+
if (recentToolSignatures.length >= MAX_EXACT_REPEATS) {
|
| 761 |
+
const allSame = recentToolSignatures.every(r => r === recentToolSignatures[0]);
|
| 762 |
if (allSame) {
|
| 763 |
+
console.warn(`[agent] Infinite loop detected: exact same tool call repeated ${MAX_EXACT_REPEATS} times — breaking`);
|
| 764 |
sendSSE(res, "error", {
|
| 765 |
+
message: `⚠️ infinite loop detected. try rephrasing the request`,
|
| 766 |
});
|
| 767 |
sendSSE(res, "message_end", {
|
| 768 |
promptTokens: totalPromptTokens,
|
|
|
|
| 774 |
}
|
| 775 |
}
|
| 776 |
|
| 777 |
// ─── Execute tool calls ──────────────────────────────────────────
|
| 778 |
// Bug #2+#3 fix: Each tool call is wrapped in its own try-catch.
|
| 779 |
// Original claw-code sends tool errors back to LLM as tool results,
|
| 780 |
// letting the model decide how to handle them. We NEVER break the
|
| 781 |
// loop on a tool error — only on fatal API/stream errors.
|
| 782 |
let shouldBreakForAskUser = false;
|
|
|
|
| 783 |
for (const toolCall of result.toolCalls) {
|
| 784 |
const toolName = toolCall.function.name;
|
| 785 |
let toolArgs: Record<string, unknown> = {};
|
|
|
|
| 858 |
isError = true;
|
| 859 |
}
|
| 860 |
|
| 861 |
+
// No error classification or guidance injection.
|
| 862 |
+
// The model receives raw error output and decides how to handle it.
|
| 863 |
+
// This matches the original claw-code behavior.
|
| 864 |
|
| 865 |
sendSSE(res, "tool_result", {
|
| 866 |
toolCallId: toolCall.id,
|
|
|
|
| 902 |
toolResultMessages.push(toolResultMsg);
|
| 903 |
}
|
| 904 |
|
| 905 |
+
// No consecutive error detection — the model handles errors naturally.
|
| 906 |
+
// MAX_ITERATIONS (200) is the ultimate safety net.
|
| 907 |
|
| 908 |
// ─── Bug #4 fix: Break loop after ask_user to wait for user ───
|
| 909 |
if (shouldBreakForAskUser) {
|
|
@@ -299,25 +299,15 @@ AGENT AUTONOMY RULES:
|
|
| 299 |
9. If the user asks for something, the answer is always "yes" followed by the implementation.
|
| 300 |
10. After completing a task, ALWAYS suggest what to do next — propose improvements, next steps, or related tasks. Be proactive.
|
| 301 |
|
| 302 |
-
|
| 303 |
-
- When a tool fails,
|
| 304 |
-
-
|
| 305 |
-
- If
|
| 306 |
-
- If permission is denied, try with sudo.
|
| 307 |
-
- If a network request fails, try a different URL, tool, or approach.
|
| 308 |
-
- NEVER stop the entire task because ONE command failed. Errors are NORMAL during complex tasks. Diagnose → Fix → Continue.
|
| 309 |
-
- If the SAME exact command fails 3+ times with the same error, try a COMPLETELY different approach (different tool, different method).
|
| 310 |
-
- Only ask the user for help if you have genuinely exhausted all alternatives and cannot proceed.
|
| 311 |
-
- The [SYSTEM GUIDANCE] hints in error messages tell you exactly how to fix the issue. Follow them.
|
| 312 |
|
| 313 |
-
|
| 314 |
-
- You are an
|
| 315 |
-
-
|
| 316 |
-
- After each
|
| 317 |
-
- If you have pending tasks in your TodoWrite list, ALWAYS call a tool in your next response. Text-only responses with pending tasks are a BUG.
|
| 318 |
-
- Pattern: TodoWrite (create plan) → bash/read_file/write_file (execute step 1) → bash (execute step 2) → ... → SendUserMessage (report results)
|
| 319 |
-
- WRONG pattern: TodoWrite → "now I will start" (text only, no tool) → STOPS. This is broken. Always include a tool call.
|
| 320 |
-
- When you finish a step, update the todo status to "completed" and immediately proceed to the next pending step.`
|
| 321 |
);
|
| 322 |
|
| 323 |
|
|
|
|
| 299 |
9. If the user asks for something, the answer is always "yes" followed by the implementation.
|
| 300 |
10. After completing a task, ALWAYS suggest what to do next — propose improvements, next steps, or related tasks. Be proactive.
|
| 301 |
|
| 302 |
+
ERROR HANDLING:
|
| 303 |
+
- When a tool fails, analyze the error and try a different approach. Common fixes: pip→pip3, python→python3, add sudo for permissions, install missing packages with apt-get.
|
| 304 |
+
- Errors are normal during complex tasks. Diagnose, fix, and continue.
|
| 305 |
+
- If the same approach fails 3+ times, try something completely different.
|
| 306 |
|
| 307 |
+
EXECUTION STYLE:
|
| 308 |
+
- You are an autonomous agent. After creating a plan, immediately start executing it with tool calls.
|
| 309 |
+
- Always include tool calls in your responses when there is work to do. Do not just describe what you will do — do it.
|
| 310 |
+
- After completing each step, proceed to the next one without waiting.`
|
| 311 |
);
|
| 312 |
|
| 313 |
|