Agentic-Service-Data-Eyond-Catalog

Rifqi Hafizuddin and Claude Opus 4.8 committed 1 day ago

Commit ac310de

1 Parent(s): f346114

[KM-626][AI] Slow-path agent: seam contracts + TaskRunner

This is the execution half of the slow path (AGENT_ARCHITECTURE_CONTEXT_new.md
§7.4, §8.2/§8.4). It is deterministic, makes zero LLM calls, and is
tool-agnostic (INV-7).

- schemas.py: TaskResult, RunState (the blackboard); TaskSummary, AnalysisRecord,
AssembledOutput, AssemblerNarrative. Reuses planner ToolOutput.
- invoker.py: ToolInvoker Protocol only -- the runtime seam the runner calls; the
tool team owns the implementation (KM-418).
- errors.py: SlowPathError, AssemblerError.
- task_runner.py: wave-based dependency execution (asyncio.gather), ${t<id>}
placeholder resolution, internal validate_args, never-throw invoke, status
labeling, degrade-and-continue. No replanning, no mid-run LLM.

The package lives in agents/slow_path/ and is deliberately NOT named
"orchestrator": that name belongs to the entry dispatcher in
agents/orchestration.py.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
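
The `${t<id>}` placeholder resolution listed under task_runner.py can be tried in isolation. The following is a minimal sketch, not the committed code: it reuses the same regular expression as the runner, but `resolve_value` and the plain `upstream` dict (task id to list of outputs) are hypothetical stand-ins for the `RunState` blackboard.

```python
import re

# Same pattern the runner uses for "${t<id>}" placeholders.
PLACEHOLDER_RE = re.compile(r"\$\{(t[^}]+)\}")


def resolve_value(value, upstream_outputs):
    """Resolve an arg that is exactly one placeholder; pass anything else through.

    `upstream_outputs` is an illustrative stand-in: task id -> list of outputs.
    """
    if isinstance(value, str):
        match = PLACEHOLDER_RE.fullmatch(value.strip())
        if match:
            outputs = upstream_outputs.get(match.group(1))
            # The last output is treated as the task's representative result.
            return outputs[-1] if outputs else None
    return value


upstream = {"t1": ["raw rows", "cleaned rows"], "t2": []}
resolved = {
    "data": resolve_value("${t1}", upstream),             # exact placeholder
    "title": resolve_value("Sales for ${t1}", upstream),  # embedded: left as-is
    "other": resolve_value("${t2}", upstream),            # upstream has no outputs
    "limit": resolve_value(10, upstream),                 # non-string: untouched
}
```

Because the match uses `fullmatch`, a placeholder embedded in a longer string is not substituted; only an argument that is exactly one placeholder gets resolved.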

Files changed (5)

src/agents/slow_path/__init__.py +10 -0
src/agents/slow_path/errors.py +11 -0
src/agents/slow_path/invoker.py +27 -0
src/agents/slow_path/schemas.py +99 -0
src/agents/slow_path/task_runner.py +168 -0

src/agents/slow_path/__init__.py ADDED

	@@ -0,0 +1,10 @@

+"""Slow-path workers: TaskRunner (deterministic) + Assembler (1 LLM call) + Coordinator.
+
+These are driven *by* the Orchestrator (the intent-router/dispatcher in
+`agents/orchestration.py`); this package is deliberately NOT named "orchestrator"
+to keep the dispatcher and the workers from sharing a name. It executes the
+Planner's static `TaskList` and assembles the two outputs (`chat_answer` +
+`AnalysisRecord`). See AGENT_ARCHITECTURE_CONTEXT_new.md §7.2 / §7.4 / §7.5 /
+§8.2–8.4. Tool-agnostic: depends only on the `ToolInvoker` protocol and the
+`ToolOutput` envelope, never on a specific tool (INV-7).
+"""

src/agents/slow_path/errors.py ADDED

	@@ -0,0 +1,11 @@

+"""Typed errors for the slow-path layer."""
+
+from __future__ import annotations
+
+
+class SlowPathError(Exception):
+    """Base error for the slow-path layer."""
+
+
+class AssemblerError(SlowPathError):
+    """The Assembler LLM call could not produce a valid `AssembledOutput`."""
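
A short sketch of why the two-level hierarchy is useful: a caller can catch `SlowPathError` once and still handle `AssemblerError`. The two classes mirror the diff above; `assemble` and `answer_or_fallback` are hypothetical helpers invented for illustration, not part of this commit.

```python
class SlowPathError(Exception):
    """Base error for the slow-path layer."""


class AssemblerError(SlowPathError):
    """The Assembler LLM call could not produce a valid output."""


def assemble(raw: str) -> str:
    # Hypothetical stand-in for the Assembler step: reject an empty response.
    if not raw.strip():
        raise AssemblerError("empty model response")
    return raw.strip()


def answer_or_fallback(raw: str) -> str:
    # Catching the base class covers every slow-path error, including
    # subclasses added later.
    try:
        return assemble(raw)
    except SlowPathError as exc:
        return f"analysis unavailable: {exc}"
```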

src/agents/slow_path/invoker.py ADDED

	@@ -0,0 +1,27 @@

+"""The tool invocation seam (§8.4) -- the one interface the TaskRunner calls.
+
+The agent layer stays tool-agnostic (INV-7) by invoking every tool through this
+protocol, never importing a tool module directly. The **tool team owns the
+implementation** (KM-418); this file defines only the contract the TaskRunner
+depends on.
+
+Frozen guarantees the implementation must hold:
+1. **Never throws.** A tool failure returns `ToolOutput(kind="error", error=...)`,
+   not an exception -- the TaskRunner's degrade-and-continue (§7.4) relies on this.
+   (The TaskRunner still wraps calls defensively, as a backstop.)
+2. **Returns the `ToolOutput` envelope** (§8.1) -- structured data only, never
+   rendered tables or prose (that is the Assembler's job).
+3. **`tool_name` comes from the registry** (§9.2); unknown names return an error
+   envelope rather than throwing.
+"""
+
+from __future__ import annotations
+
+from typing import Any, Protocol, runtime_checkable
+
+from ..planner.contracts import ToolOutput
+
+
+@runtime_checkable
+class ToolInvoker(Protocol):
+    async def invoke(self, tool_name: str, args: dict[str, Any]) -> ToolOutput: ...
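
To make the three guarantees concrete, here is a minimal sketch of a class that would satisfy the protocol. It is not the tool team's implementation (KM-418): `DictInvoker` is hypothetical, the `ToolOutput` dataclass is a trimmed stand-in for the planner envelope, and the `data` field and `kind="data"` value are assumptions made for this example.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Callable, Optional, Protocol, runtime_checkable


@dataclass
class ToolOutput:
    # Stand-in for planner.contracts.ToolOutput (assumed shape).
    tool: str
    kind: str
    error: Optional[str] = None
    data: Any = None  # hypothetical payload field


@runtime_checkable
class ToolInvoker(Protocol):
    async def invoke(self, tool_name: str, args: dict[str, Any]) -> ToolOutput: ...


class DictInvoker:
    """Hypothetical invoker backed by a dict of plain callables."""

    def __init__(self, tools: dict[str, Callable[..., Any]]) -> None:
        self._tools = tools

    async def invoke(self, tool_name: str, args: dict[str, Any]) -> ToolOutput:
        fn = self._tools.get(tool_name)
        if fn is None:
            # Guarantee 3: unknown names yield an error envelope.
            return ToolOutput(tool=tool_name, kind="error",
                              error=f"unknown tool {tool_name!r}")
        try:
            # Guarantee 2: structured data only, wrapped in the envelope.
            return ToolOutput(tool=tool_name, kind="data", data=fn(**args))
        except Exception as exc:
            # Guarantee 1: a tool failure becomes an envelope, not an exception.
            return ToolOutput(tool=tool_name, kind="error", error=str(exc))


invoker = DictInvoker({"divide": lambda a, b: a / b})
ok = asyncio.run(invoker.invoke("divide", {"a": 6, "b": 3}))
bad = asyncio.run(invoker.invoke("divide", {"a": 1, "b": 0}))
missing = asyncio.run(invoker.invoke("nope", {}))
```

Note that `isinstance(invoker, ToolInvoker)` only checks that an `invoke` attribute exists; `runtime_checkable` does not verify the signature or that the method is async.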

src/agents/slow_path/schemas.py ADDED

	@@ -0,0 +1,99 @@

+"""Slow-path execution + output contracts.
+The seams between the three slow-path stages:
+- TaskRunner writes `RunState` (a blackboard of `TaskResult`s) — §8.2.
+- Assembler reads `RunState` + `BusinessContext` and produces `AssembledOutput`
+  (`chat_answer` + `AnalysisRecord`) — §8.3.
+`ToolOutput` (the tool -> agent envelope) is reused from the planner contracts so
+there is exactly one definition across the layer.
+Note on authorship (§8.3): the Assembler LLM authors only the *narrative* fields
+(`AssemblerNarrative`). The `AnalysisRecord`'s structured pass-through fields
+(`results_snapshot`, `tasks_run`) and metadata are copied from `RunState` by code,
+never re-authored by the model — that is the source of truth the report generator
+renders from (INV-4).
+See AGENT_ARCHITECTURE_CONTEXT_new.md §8.2 / §8.3.
+"""
+from __future__ import annotations
+from datetime import datetime
+from typing import Literal
+from pydantic import BaseModel, Field
+from ..planner.contracts import ToolOutput
+TaskStatus = Literal["success", "partial", "failure"]
+# --------------------------------------------------------------------------- #
+# Execution state (TaskRunner -> Assembler) — §8.2
+# --------------------------------------------------------------------------- #
+class TaskResult(BaseModel):
+    task_id: str
+    status: TaskStatus
+    objective: str
+    outputs: list[ToolOutput] = Field(default_factory=list)  # one per tool_call
+    note: str | None = None
+    error: str | None = None
+class RunState(BaseModel):
+    plan_id: str
+    business_context_id: str
+    results: dict[str, TaskResult] = Field(default_factory=dict)  # task_id -> result
+    open_questions: list[str] = Field(default_factory=list)
+# --------------------------------------------------------------------------- #
+# Assembled output (Assembler -> Orchestrator / memory) — §8.3
+# --------------------------------------------------------------------------- #
+class TaskSummary(BaseModel):
+    task_id: str
+    objective: str
+    status: TaskStatus
+    tools_used: list[str] = Field(default_factory=list)
+class AnalysisRecord(BaseModel):
+    # Narrative fields — authored by the Assembler LLM.
+    goal_restated: str
+    findings: list[str] = Field(default_factory=list)
+    caveats: list[str] = Field(default_factory=list)
+    data_used: list[str] = Field(default_factory=list)
+    open_questions: list[str] = Field(default_factory=list)
+    # Structured pass-through — NOT re-authored; copied from RunState.
+    tasks_run: list[TaskSummary] = Field(default_factory=list)
+    results_snapshot: dict[str, TaskResult] = Field(default_factory=dict)
+    # Metadata.
+    plan_id: str
+    business_context_id: str
+    created_at: datetime
+class AssembledOutput(BaseModel):
+    chat_answer: str  # FIRST field — streams via SSE; markdown prose + tables
+    analysis_record: AnalysisRecord
+class AssemblerNarrative(BaseModel):
+    """The subset of `AnalysisRecord` the Assembler LLM actually authors.
+    Kept separate from `AssembledOutput` so the model never emits the structured
+    pass-through fields (which would invite hallucinated numbers); `Assembler`
+    code merges this with the real `RunState` to build the final record.
+    """
+    chat_answer: str
+    goal_restated: str
+    findings: list[str] = Field(default_factory=list)
+    caveats: list[str] = Field(default_factory=list)
+    data_used: list[str] = Field(default_factory=list)
+    open_questions: list[str] = Field(default_factory=list)
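
The authorship split described in the module docstring (the model writes the narrative, code copies the structured fields) might look roughly like the sketch below. The Assembler itself is not part of this commit, so `build_output` is hypothetical; the dataclasses are trimmed stdlib stand-ins for the Pydantic models above, and the tool names and `kind="table"` value are invented for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class ToolOutput:  # trimmed stand-in
    tool: str
    kind: str


@dataclass
class TaskResult:  # trimmed stand-in
    task_id: str
    status: str
    objective: str
    outputs: list[ToolOutput] = field(default_factory=list)


@dataclass
class Narrative:  # trimmed stand-in for AssemblerNarrative
    chat_answer: str
    goal_restated: str
    findings: list[str] = field(default_factory=list)


def build_output(narrative: Narrative, plan_id: str,
                 results: dict[str, TaskResult]) -> dict[str, Any]:
    # Structured pass-through: derived from the run state by code only.
    tasks_run = [
        {"task_id": r.task_id, "objective": r.objective, "status": r.status,
         "tools_used": sorted({o.tool for o in r.outputs})}
        for r in results.values()
    ]
    return {
        "chat_answer": narrative.chat_answer,
        "analysis_record": {
            # Narrative fields: the only part the model authored.
            "goal_restated": narrative.goal_restated,
            "findings": list(narrative.findings),
            "tasks_run": tasks_run,
            "results_snapshot": dict(results),
            "plan_id": plan_id,
            "created_at": datetime.now(timezone.utc),
        },
    }


results = {
    "t1": TaskResult("t1", "success", "load sales", [ToolOutput("sql_query", "table")]),
    "t2": TaskResult("t2", "failure", "forecast", [ToolOutput("forecast", "error")]),
}
narrative = Narrative("Sales rose.", "Explain sales", ["Sales rose."])
out = build_output(narrative, "plan-1", results)
```

The point of the split is that a failed task stays visible in `tasks_run` and `results_snapshot` even if the narrative never mentions it.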

src/agents/slow_path/task_runner.py ADDED

	@@ -0,0 +1,168 @@

+"""TaskRunner — deterministic execution of a static `TaskList`. Zero LLM.
+Executes tasks in dependency order, parallelizing each ready "wave" with
+`asyncio.gather`. For each task it resolves `${t<id>}` placeholders from upstream
+results, does an internal `validate_args`, invokes each tool via the `ToolInvoker`
+seam, and records a `TaskResult`. On failure it **degrades and continues**: the
+task is marked failed, its dependents are skipped, independent branches keep
+running. There is no replanning and no mid-run LLM (INV-6).
+`success_criteria` is *not* machine-evaluated here (it is free text); task status
+is derived from tool execution outcomes and carried to the Assembler to report.
+See AGENT_ARCHITECTURE_CONTEXT_new.md §7.4.
+"""
+from __future__ import annotations
+import asyncio
+import re
+from typing import Any
+from src.middlewares.logging import get_logger
+from ..planner.contracts import ToolOutput, ToolRegistry
+from ..planner.schemas import Task
+from ..planner.schemas import TaskList as PlanTaskList
+from .invoker import ToolInvoker
+from .schemas import RunState, TaskResult, TaskStatus
+logger = get_logger("task_runner")
+# Mirrors planner/validator.py:28 `_PLACEHOLDER_RE` — keep the two in sync.
+_PLACEHOLDER_RE = re.compile(r"\$\{(t[^}]+)\}")
+class TaskRunner:
+    """Runs a `TaskList` against a `ToolInvoker`, producing a `RunState`."""
+    def __init__(self, invoker: ToolInvoker, registry: ToolRegistry) -> None:
+        self._invoker = invoker
+        self._registry = registry
+    async def run(self, task_list: PlanTaskList, business_context_id: str) -> RunState:
+        tasks_by_id: dict[str, Task] = {t.id: t for t in task_list.tasks}
+        results: dict[str, TaskResult] = {}
+        remaining: set[str] = set(tasks_by_id)
+        while remaining:
+            ready = [
+                tid
+                for tid in remaining
+                if all(dep in results for dep in tasks_by_id[tid].depends_on)
+            ]
+            if not ready:
+                # A dependency points outside the plan (or a cycle slipped past the
+                # planner validator): nothing more can run. Fail the rest honestly.
+                for tid in list(remaining):
+                    results[tid] = TaskResult(
+                        task_id=tid,
+                        status="failure",
+                        objective=tasks_by_id[tid].objective,
+                        error="unresolved dependency; task could not run",
+                    )
+                    remaining.discard(tid)
+                break
+            # Skip any ready task whose dependency failed (degrade-and-continue).
+            to_run: list[Task] = []
+            for tid in ready:
+                task = tasks_by_id[tid]
+                failed = [d for d in task.depends_on if results[d].status == "failure"]
+                if failed:
+                    results[tid] = TaskResult(
+                        task_id=tid,
+                        status="failure",
+                        objective=task.objective,
+                        error=f"skipped: upstream {failed} did not succeed",
+                    )
+                    remaining.discard(tid)
+                else:
+                    to_run.append(task)
+            if not to_run:
+                continue  # remaining dependents will be re-evaluated (and skipped)
+            wave = await asyncio.gather(
+                *(self._run_task(task, results) for task in to_run)
+            )
+            for tr in wave:
+                results[tr.task_id] = tr
+                remaining.discard(tr.task_id)
+        return RunState(
+            plan_id=task_list.plan_id,
+            business_context_id=business_context_id,
+            results=results,
+            open_questions=list(task_list.open_questions),
+        )
+    async def _run_task(self, task: Task, results: dict[str, TaskResult]) -> TaskResult:
+        outputs: list[ToolOutput] = []
+        for call in task.tool_calls:
+            resolved = self._resolve_args(call.args, results)
+            arg_error = self._validate_args(call.tool, resolved)
+            if arg_error is not None:
+                outputs.append(ToolOutput(tool=call.tool, kind="error", error=arg_error))
+                continue
+            outputs.append(await self._safe_invoke(call.tool, resolved))
+        status = _label(outputs)
+        error: str | None = None
+        if status == "failure":
+            errs = [o.error for o in outputs if o.kind == "error" and o.error]
+            error = errs[0] if errs else "all tool calls failed"
+        return TaskResult(
+            task_id=task.id,
+            status=status,
+            objective=task.objective,
+            outputs=outputs,
+            error=error,
+        )
+    def _resolve_args(
+        self, args: dict[str, Any], results: dict[str, TaskResult]
+    ) -> dict[str, Any]:
+        return {k: self._resolve_value(v, results) for k, v in args.items()}
+    @staticmethod
+    def _resolve_value(value: Any, results: dict[str, TaskResult]) -> Any:
+        # A data arg is exactly a "${t<id>}" placeholder (Pattern A); resolve it to
+        # the referenced task's representative output (its last ToolOutput).
+        # Materializing that envelope into a DataFrame is the invoker's job.
+        if isinstance(value, str):
+            match = _PLACEHOLDER_RE.fullmatch(value.strip())
+            if match:
+                upstream = results.get(match.group(1))
+                if upstream is None or not upstream.outputs:
+                    return None
+                return upstream.outputs[-1]
+        return value
+    def _validate_args(self, tool: str, resolved: dict[str, Any]) -> str | None:
+        spec = self._registry.get(tool)
+        if spec is None:
+            return f"tool {tool!r} not in registry"
+        required = spec.input_schema.get("required", [])
+        missing = [r for r in required if resolved.get(r) is None]
+        if missing:
+            return f"missing required arg(s): {sorted(missing)}"
+        return None
+    async def _safe_invoke(self, tool: str, args: dict[str, Any]) -> ToolOutput:
+        try:
+            return await self._invoker.invoke(tool, args)
+        except Exception as exc:  # noqa: BLE001 — backstop; the invoker is never-throw (§8.4)
+            logger.warning("tool invoker raised", tool=tool, error=str(exc))
+            return ToolOutput(tool=tool, kind="error", error=f"invoker raised: {exc}")
+def _label(outputs: list[ToolOutput]) -> TaskStatus:
+    if not outputs:
+        return "failure"
+    errors = sum(1 for o in outputs if o.kind == "error")
+    if errors == 0:
+        return "success"
+    if errors == len(outputs):
+        return "failure"
+    return "partial"
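
The scheduling loop in `run` can be reduced to a small standalone sketch that shows waves and degrade-and-continue without any tool machinery. This is a simplification under stated assumptions, not the committed code: `run_waves` is hypothetical, each task is modeled as an async callable returning `True` or `False`, there is no "partial" status, and `sorted` is added only so the recorded waves are stable in the example.

```python
import asyncio


async def run_waves(deps, work):
    """deps: task id -> upstream ids; work: task id -> async callable -> bool."""
    status = {}
    waves = []
    remaining = set(deps)
    while remaining:
        # A task is ready once every dependency has a recorded status.
        ready = sorted(t for t in remaining if all(d in status for d in deps[t]))
        if not ready:
            # Dangling dependency or cycle: fail whatever is left.
            for t in remaining:
                status[t] = "failure"
            break
        runnable = []
        for t in ready:
            if any(status[d] == "failure" for d in deps[t]):
                status[t] = "failure"  # skipped: an upstream task failed
                remaining.discard(t)
            else:
                runnable.append(t)
        if not runnable:
            continue
        # One wave: all runnable tasks execute concurrently.
        outcomes = await asyncio.gather(*(work[t]() for t in runnable))
        waves.append(runnable)
        for t, ok in zip(runnable, outcomes):
            status[t] = "success" if ok else "failure"
            remaining.discard(t)
    return status, waves


async def succeed():
    return True


async def fail():
    return False


deps = {"t1": [], "t2": [], "t3": ["t1"], "t4": ["t2"], "t5": ["t3", "t4"]}
work = {t: succeed for t in deps}
work["t2"] = fail  # one root task fails
status, waves = asyncio.run(run_waves(deps, work))
```

Here `t2` fails, so `t4` and then `t5` are skipped as failures, while the independent branch `t1` to `t3` still runs to completion in two waves.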