# tool-calls

## stream-event-overview

Tool calls are surfaced as `step` events on the scrape stream (`POST /api/scrape/stream`).

| event-type | purpose | contains-tool-call-data |
| --- | --- | --- |
| `init` | stream/session initialization | no |
| `url_start` | url processing started | no |
| `step` | progress/action update | yes (for `action=tool_call` and `action=agent_decision`) |
| `url_complete` | url processing complete | no |
| `complete` | final response payload | no (aggregated output only) |
| `error` | runtime error surface | optional |
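
The table above can be turned into a small client-side filter. This is a minimal sketch, assuming events have already been decoded into dicts with the `type`/`data` envelope used in the example events; the transport framing is not specified in this section, and `iter_tool_steps` is a hypothetical helper name.

```python
from typing import Any, Iterable, Iterator

# Actions that the event table marks as carrying tool-call data.
TOOL_ACTIONS = {"tool_call", "agent_decision"}


def iter_tool_steps(events: Iterable[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield the `data` payload of every `step` event that carries tool-call data."""
    for event in events:
        if event.get("type") != "step":
            continue  # init, url_start, url_complete, complete, error
        data = event.get("data") or {}
        if data.get("action") in TOOL_ACTIONS:
            yield data


# Illustrative events only; real payloads carry more fields.
events = [
    {"type": "init", "data": {}},
    {"type": "step", "data": {"action": "plugins", "step_number": 1}},
    {"type": "step", "data": {"action": "agent_decision", "step_number": 2}},
    {"type": "step", "data": {"action": "tool_call", "step_number": 3}},
    {"type": "complete", "data": {}},
]
print([s["step_number"] for s in iter_tool_steps(events)])  # [2, 3]
```

`error` events are skipped here because the table lists their tool-call data as optional; a consumer that needs them would handle that event type separately.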

## scrape-step-schema

`step` events are based on the `ScrapeStep` model.

| field | type | description |
| --- | --- | --- |
| `step_number` | integer | sequence index in the session |
| `action` | string | logical action type (`tool_call`, `agent_decision`, `plugins`, etc.) |
| `url` | string or null | active url for this step when available |
| `status` | string | runtime state (`running`, `complete`, `completed`, `failed`, etc.) |
| `message` | string | short human-readable step summary |
| `reward` | number | reward delta for this step |
| `extracted_data` | object or null | structured details, including tool payloads |
| `duration_ms` | number or null | optional elapsed time for the step |
| `timestamp` | string | UTC timestamp in ISO 8601 format |
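
For typed consumers, the field table can be mirrored in a small dataclass. The sketch below is a client-side view built only from the documented fields; `StepView` and `from_payload` are hypothetical names, not the server's `ScrapeStep` class, and treating the three nullable fields as optional keys is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StepView:
    """Client-side mirror of the documented step fields (hypothetical type)."""
    step_number: int
    action: str
    status: str
    message: str
    reward: float
    timestamp: str
    url: Optional[str] = None
    extracted_data: Optional[dict[str, Any]] = None
    duration_ms: Optional[float] = None

    @classmethod
    def from_payload(cls, data: dict[str, Any]) -> "StepView":
        # Drop unknown keys so that additional server-side fields do not break parsing.
        known = {k: v for k, v in data.items() if k in cls.__dataclass_fields__}
        return cls(**known)


step = StepView.from_payload({
    "step_number": 18, "action": "tool_call", "status": "completed",
    "message": "Tool html.select: ok", "reward": 0.05,
    "timestamp": "2026-04-08T11:49:20.005000+00:00",
    "extracted_data": {"tool": "html.select", "success": True},
})
print(step.action, step.url)  # tool_call None
```

A payload that lacks one of the non-nullable fields raises `TypeError`, which makes schema drift visible early.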

## tool-call-payload-patterns

### pattern-a-registry-helper-calls

Used by `_create_tool_call_step(...)`.

| key-path | value-shape |
| --- | --- |
| `extracted_data.tool_name` | `namespace.action` |
| `extracted_data.tool_description` | short description |
| `extracted_data.parameters` | argument object |
| `extracted_data.result` | optional result object |

### pattern-b-runtime-agent-planner-and-executor

Used by dynamic runtime tool calling in the agentic scrape flow.

| action | key-path | value-shape |
| --- | --- | --- |
| `agent_decision` | `extracted_data.tool_calls[]` | `tool`, `params`, `reasoning` |
| `tool_call` | `extracted_data.tool` | selected tool name |
| `tool_call` | `extracted_data.success` | boolean execution state |
| `tool_call` | `extracted_data.result_preview` | compact serialized result |
| `tool_call` | `extracted_data.error` | error message if failed |
| `tool_call` | `extracted_data.duration_ms` | execution duration |
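
Because the two patterns store the tool name under different keys, a consumer may want one accessor that covers both. A minimal sketch, assuming only the key paths listed in the tables above; `tool_name_of` and `planned_tools` are hypothetical helper names.

```python
from typing import Any, Optional


def tool_name_of(extracted_data: Optional[dict[str, Any]]) -> Optional[str]:
    """Return the tool name from either payload pattern, or None if absent."""
    if not extracted_data:
        return None
    # Pattern A uses `tool_name`; pattern B `tool_call` steps use `tool`.
    return extracted_data.get("tool_name") or extracted_data.get("tool")


def planned_tools(extracted_data: Optional[dict[str, Any]]) -> list[str]:
    """Return the tool names listed in an `agent_decision` payload (pattern B)."""
    calls = (extracted_data or {}).get("tool_calls") or []
    return [call["tool"] for call in calls if "tool" in call]


print(tool_name_of({"tool_name": "html.select"}))  # html.select
print(tool_name_of({"tool": "extract.top_n"}))     # extract.top_n
print(planned_tools({"tool_calls": [{"tool": "html.select", "params": {}}]}))  # ['html.select']
```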

## runtime-tool-call-lifecycle

```mermaid
sequenceDiagram
    participant Client as scrape-client
    participant Route as scrape-route
    participant Planner as agent-tool-caller
    participant Executor as tool-executor

    Client->>Route: POST /api/scrape/stream
    Route->>Planner: decide_tools(context, model)
    Planner-->>Route: [tool-call-plan]
    Route-->>Client: step(action=agent_decision)
    loop each selected tool
        Route->>Executor: execute_tool_call(tool, context)
        Executor-->>Route: ToolCallResult
        Route-->>Client: step(action=tool_call)
    end
    Route-->>Client: complete(output, extracted_data, metadata)
```
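
The event ordering in the diagram can be reproduced with a small generator. This is an illustrative sketch, not the route's actual code: `run_lifecycle`, `fake_planner`, and `fake_executor` are hypothetical stand-ins, the planner takes only a context argument for brevity, and the emitted payloads carry only a subset of the documented fields.

```python
from typing import Any, Callable, Iterator

Plan = list[dict[str, Any]]


def run_lifecycle(
    decide_tools: Callable[[dict[str, Any]], Plan],
    execute_tool_call: Callable[[dict[str, Any], dict[str, Any]], dict[str, Any]],
    context: dict[str, Any],
) -> Iterator[dict[str, Any]]:
    """Emit events in the order shown in the sequence diagram."""
    plan = decide_tools(context)
    # One agent_decision step announces the whole plan.
    yield {"type": "step", "data": {"action": "agent_decision",
                                    "extracted_data": {"tool_calls": plan}}}
    # One tool_call step per selected tool, in plan order.
    for call in plan:
        result = execute_tool_call(call, context)
        yield {"type": "step", "data": {"action": "tool_call",
                                        "extracted_data": {"tool": call["tool"], **result}}}
    yield {"type": "complete", "data": {}}


# Stand-in planner and executor, for illustration only.
def fake_planner(context: dict[str, Any]) -> Plan:
    return [{"tool": "html.select", "params": {"selector": "article"}, "reasoning": "demo"}]


def fake_executor(call: dict[str, Any], context: dict[str, Any]) -> dict[str, Any]:
    return {"success": True, "result_preview": "{}", "error": None, "duration_ms": 0}


order = [(e["type"], e["data"].get("action"))
         for e in run_lifecycle(fake_planner, fake_executor, {})]
print(order)  # [('step', 'agent_decision'), ('step', 'tool_call'), ('complete', None)]
```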

## field-order-and-rendering-guidance

Frontend and log consumers should parse structured fields, not message text.

| consumer-surface | recommendation |
| --- | --- |
| timeline ui | group by `action`, then read `extracted_data` keys |
| tool call panel | prefer `tool_name`/`tool` over `message` |
| analytics | aggregate by `tool_name`/`tool` and `success` |
| debugging | use `result_preview` and `error` first, full context second |
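
The analytics recommendation can be sketched as a counter keyed on tool name and outcome. This assumes step payloads shaped like the `data` objects in the example events; `tool_outcomes` is a hypothetical helper, and a missing `success` key (as in pattern A payloads, which do not document one) is kept as `None` instead of being counted as a failure.

```python
from collections import Counter
from typing import Any, Iterable


def tool_outcomes(steps: Iterable[dict[str, Any]]) -> Counter:
    """Count (tool, success) pairs across `tool_call` step payloads."""
    counts: Counter = Counter()
    for step in steps:
        if step.get("action") != "tool_call":
            continue
        data = step.get("extracted_data") or {}
        # Read structured fields only; `message` is never parsed.
        tool = data.get("tool_name") or data.get("tool") or "<unknown>"
        counts[(tool, data.get("success"))] += 1
    return counts


steps = [
    {"action": "tool_call", "extracted_data": {"tool": "html.select", "success": True}},
    {"action": "tool_call", "extracted_data": {"tool": "html.select", "success": False}},
    {"action": "tool_call", "extracted_data": {"tool": "html.select", "success": True}},
    {"action": "agent_decision", "extracted_data": {"tool_calls": []}},
]
counts = tool_outcomes(steps)
print(counts[("html.select", True)], counts[("html.select", False)])  # 2 1
```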

## example-step-events

```json
{
  "type": "step",
  "data": {
    "step_number": 17,
    "action": "agent_decision",
    "status": "completed",
    "message": "Agent selected 2 runtime tools",
    "reward": 0.1,
    "extracted_data": {
      "tool_calls": [
        {"tool": "html.select", "params": {"selector": "article", "limit": 20}, "reasoning": "Find repeated blocks"},
        {"tool": "extract.top_n", "params": {"n": 10}, "reasoning": "Apply output size cap"}
      ]
    },
    "timestamp": "2026-04-08T11:49:20.000000+00:00"
  }
}
```

```json
{
  "type": "step",
  "data": {
    "step_number": 18,
    "action": "tool_call",
    "status": "completed",
    "message": "Tool html.select: ok",
    "reward": 0.05,
    "extracted_data": {
      "tool": "html.select",
      "success": true,
      "result_preview": "{'elements_found': 12, 'selector_used': 'article'}",
      "error": null,
      "duration_ms": 3
    },
    "timestamp": "2026-04-08T11:49:20.005000+00:00"
  }
}
```

## troubleshooting-table

| symptom | likely-cause | check |
| --- | --- | --- |
| `agent_decision` absent | planner disabled or failed before plan emit | verify `live_llm_enabled` path and planner warnings |
| selected tools not executed | planner output filtered/empty | inspect selected tool names against registry |
| many failed tool calls | unsupported namespace or bad params | verify executor namespace handlers and args |
| output quality unchanged | tool observations not influencing extraction | verify `AGENT TOOL OBSERVATIONS` injected in extraction prompt |
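
The "selected tools not executed" row can be checked mechanically from a captured stream. A minimal sketch, assuming step payloads shaped like the `data` objects in the example events; `unexecuted_tools` is a hypothetical helper and does not consult the tool registry.

```python
from typing import Any, Iterable


def unexecuted_tools(steps: Iterable[dict[str, Any]]) -> set[str]:
    """Return tools named in `agent_decision` steps that never appear in a `tool_call` step."""
    planned: set[str] = set()
    executed: set[str] = set()
    for step in steps:
        data = step.get("extracted_data") or {}
        if step.get("action") == "agent_decision":
            planned.update(c["tool"] for c in data.get("tool_calls") or [] if "tool" in c)
        elif step.get("action") == "tool_call":
            name = data.get("tool_name") or data.get("tool")
            if name:
                executed.add(name)
    return planned - executed


captured = [
    {"action": "agent_decision", "extracted_data": {"tool_calls": [
        {"tool": "html.select", "params": {"selector": "article"}},
        {"tool": "extract.top_n", "params": {"n": 10}},
    ]}},
    {"action": "tool_call", "extracted_data": {"tool": "html.select", "success": True}},
]
print(unexecuted_tools(captured))  # {'extract.top_n'}
```

An empty result with no `agent_decision` step in the capture points at the first row of the table instead.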

## related-api-reference

| item | value |
| --- | --- |
| api-reference | `api-reference.md` |