Spaces:

AUXteam
/

pinch

Sleeping

App Files Files Community

pinch / docs /reference /scheduler.md

AUXteam

Upload folder using huggingface_hub

25b930c verified 9 days ago

preview code

raw

history blame contribute delete

12.3 kB

Scheduler And Tasks

The scheduler is an optional in-memory task queue for multi-agent coordination. It accepts tasks over /tasks, applies admission and fairness rules, then dispatches work to the same tab action executor used by the immediate browsing routes.

It does not replace the normal direct path. Routes such as POST /tabs/{id}/action still work independently.

There is no CLI scheduler command today.

Enable The Scheduler

The scheduler is off by default. Dashboard mode registers the task routes only when scheduler.enabled is true.

{
  "scheduler": {
    "enabled": true
  }
}

Scheduler Config

{
  "scheduler": {
    "enabled": true,
    "strategy": "fair-fifo",
    "maxQueueSize": 1000,
    "maxPerAgent": 100,
    "maxInflight": 20,
    "maxPerAgentInflight": 10,
    "resultTTLSec": 300,
    "workerCount": 4
  }
}

Field	Default	Meaning
`enabled`	`false`	enables task routes in dashboard mode
`strategy`	`fair-fifo`	scheduler strategy label
`maxQueueSize`	`1000`	global queued task limit
`maxPerAgent`	`100`	queued task limit per agent
`maxInflight`	`20`	max concurrently executing tasks overall
`maxPerAgentInflight`	`10`	max concurrently executing tasks per agent
`resultTTLSec`	`300`	retention time for terminal task snapshots
`workerCount`	`4`	number of worker goroutines

Task Object

Tasks are scheduler-owned records with these main fields:

Field	Meaning
`taskId`	generated task ID
`agentId`	submitting agent identifier
`action`	action kind to run
`tabId`	target tab ID
`ref`	optional element ref
`params`	optional action-specific request fields
`priority`	lower number means higher priority
`state`	current task state
`deadline`	execution deadline
`createdAt`	submission time
`startedAt`	first execution timestamp
`completedAt`	terminal timestamp
`latencyMs`	elapsed time from start to completion
`result`	executor response payload
`error`	terminal error message
`position`	queue position at submission time
`callbackUrl`	optional webhook URL for terminal state notification

Task IDs are currently generated as tsk_XXXXXXXX, but callers should still treat them as opaque IDs.

Submit A Task

curl -X POST http://localhost:9867/tasks \
  -H "Content-Type: application/json" \
  -d '{
    "agentId": "agent-crawl-01",
    "action": "click",
    "tabId": "8f9c7d4e1234567890abcdef12345678",
    "ref": "e14",
    "priority": 5,
    "deadline": "2026-03-08T12:05:00Z"
  }'
# Response
{
  "taskId": "tsk_a1b2c3d4",
  "state": "queued",
  "position": 1,
  "createdAt": "2026-03-08T12:00:01Z"
}

This endpoint returns 202 Accepted on successful queue submission.

Request fields:

Field	Required	Notes
`agentId`	yes	validated at request time
`action`	yes	becomes the executor `kind`
`tabId`	practically yes	required by the execution path
`ref`	no	top-level element ref for element-targeted actions
`params`	no	action-specific fields merged into the executor request body
`priority`	no	lower number means higher priority
`deadline`	no	RFC3339 timestamp; defaults to `now + 60s`
`callbackUrl`	no	webhook URL; receives POST with task snapshot on terminal state

Important:

request validation enforces only agentId and action
missing tabId is rejected later during execution with tabId is required for task execution
past deadlines are rejected at submission time

Queue Full Response

If admission fails because the global queue or an agent queue is full, the scheduler returns 429 Too Many Requests.

curl -X POST http://localhost:9867/tasks \
  -H "Content-Type: application/json" \
  -d '{"agentId":"agent-crawl-01","action":"click","tabId":"8f9c7d4e1234567890abcdef12345678"}'
# Response
{
  "code": "queue_full",
  "error": "rejected: global queue full",
  "retryable": true,
  "details": {
    "agentId": "agent-crawl-01",
    "queued": 1000,
    "maxQueue": 1000,
    "maxPerAgent": 100
  }
}

List Tasks

GET /tasks returns the scheduler's in-memory task snapshots, including queued, running, and recently completed tasks that are still within the TTL window.

curl http://localhost:9867/tasks
# Response
{
  "tasks": [
    {
      "taskId": "tsk_a1b2c3d4",
      "state": "done",
      "agentId": "agent-crawl-01",
      "action": "click",
      "latencyMs": 842
    }
  ],
  "count": 1
}

Supported query filters:

agentId
state

Example:

curl 'http://localhost:9867/tasks?agentId=agent-crawl-01&state=done,failed'

Get One Task

curl http://localhost:9867/tasks/tsk_a1b2c3d4
# Response
{
  "taskId": "tsk_a1b2c3d4",
  "agentId": "agent-crawl-01",
  "action": "click",
  "tabId": "8f9c7d4e1234567890abcdef12345678",
  "ref": "e14",
  "priority": 5,
  "state": "done",
  "createdAt": "2026-03-08T12:00:01Z",
  "startedAt": "2026-03-08T12:00:01Z",
  "completedAt": "2026-03-08T12:00:02Z",
  "latencyMs": 842,
  "result": {
    "success": true
  }
}

If the task is not found, the scheduler returns:

{
  "code": "not_found",
  "error": "task not found"
}

Cancel A Task

curl -X POST http://localhost:9867/tasks/tsk_a1b2c3d4/cancel
# Response
{
  "status": "cancelled",
  "taskId": "tsk_a1b2c3d4"
}

Behavior:

queued tasks are removed from the queue
running tasks have their execution context cancelled
terminal tasks return 409 Conflict

Task States

Implemented states:

queued
assigned
running
done
failed
cancelled
rejected

Terminal states:

done
failed
cancelled
rejected

How Tasks Execute

The scheduler forwards each task to the normal tab action endpoint:

POST /tabs/{tabId}/action

It builds the action body like this:

{
  "kind": "<action>",
  "ref": "<ref>",
  "...params": "..."
}

That means:

action becomes kind
top-level ref is forwarded when present
every key in params is merged into the top-level action body

Example:

curl -X POST http://localhost:9867/tasks \
  -H "Content-Type: application/json" \
  -d '{
    "agentId": "my-agent",
    "action": "type",
    "tabId": "8f9c7d4e1234567890abcdef12345678",
    "ref": "e12",
    "params": {
      "text": "Alan Turing"
    }
  }'

In practice, task payloads should use the same action fields that the immediate /tabs/{id}/action route expects.

Fairness, Deadlines, And Retention

within one agent queue, lower priority values run first
equal-priority tasks for the same agent fall back to FIFO order
across agents, the scheduler prefers the agent with the fewest in-flight tasks
if a queued task passes its deadline before execution starts, it is marked failed with deadline exceeded while queued
terminal task snapshots are retained in memory for resultTTLSec

Phase 2 -- Observability

Scheduler Stats

GET /scheduler/stats returns a snapshot of queue state, runtime metrics, and configuration.

curl http://localhost:9867/scheduler/stats
# Response
{
  "queue": {
    "totalQueued": 5,
    "totalInflight": 2,
    "agentCounts": {
      "agent-crawl-01": 3,
      "agent-scrape-02": 2
    }
  },
  "metrics": {
    "tasksSubmitted": 42,
    "tasksCompleted": 35,
    "tasksFailed": 3,
    "tasksCancelled": 2,
    "tasksRejected": 1,
    "tasksExpired": 1,
    "dispatchCount": 38,
    "avgDispatchLatencyMs": 12.5,
    "agents": {
      "agent-crawl-01": {
        "submitted": 25,
        "completed": 22,
        "failed": 2,
        "cancelled": 1,
        "rejected": 0
      }
    }
  },
  "config": {
    "strategy": "fair-fifo",
    "maxQueueSize": 1000,
    "maxPerAgent": 100,
    "maxInflight": 20,
    "maxPerAgentFlight": 10,
    "workerCount": 4,
    "resultTTL": "5m0s"
  }
}

Metrics Fields

Field	Type	Meaning
`tasksSubmitted`	uint64	total tasks accepted since startup
`tasksCompleted`	uint64	tasks that finished successfully
`tasksFailed`	uint64	tasks that finished with an error
`tasksCancelled`	uint64	tasks cancelled via `POST /tasks/{id}/cancel`
`tasksRejected`	uint64	tasks rejected at admission (queue full)
`tasksExpired`	uint64	queued tasks that exceeded their deadline
`dispatchCount`	uint64	number of tasks dispatched to workers
`avgDispatchLatencyMs`	float64	average time from queue entry to dispatch start
`agents`	object	per-agent breakdown (submitted, completed, failed, cancelled, rejected)

Webhook Callbacks

Tasks can include a callbackUrl field. When the task reaches a terminal state (done, failed, or cancelled), the scheduler delivers a POST with the task snapshot to that URL.

curl -X POST http://localhost:9867/tasks \
  -H "Content-Type: application/json" \
  -d '{
    "agentId": "my-agent",
    "action": "click",
    "tabId": "8f9c7d4e1234567890abcdef12345678",
    "callbackUrl": "https://example.com/hooks/task-done"
  }'

Webhook behavior:

delivery is best-effort: failures are logged but do not affect task state
only http and https schemes are allowed (SSRF protection)
a dedicated HTTP client with a 10-second timeout is used
custom headers are sent: X-PinchTab-Event: task.completed and X-PinchTab-Task-ID: <taskId>

The callbackUrl field is stored on the task and returned in GET /tasks/{id}.

Phase 3 -- Hardening

Batch Task Submission

POST /tasks/batch submits multiple tasks in a single request. All tasks in the batch share the same agentId and optional callbackUrl.

curl -X POST http://localhost:9867/tasks/batch \
  -H "Content-Type: application/json" \
  -d '{
    "agentId": "agent-crawl-01",
    "callbackUrl": "https://example.com/hooks/batch",
    "tasks": [
      { "action": "click", "tabId": "TAB_ID", "params": { "selector": "#btn" } },
      { "action": "scroll", "tabId": "TAB_ID", "params": { "scrollY": 400 } },
      { "action": "hover", "tabId": "TAB_ID", "params": { "selector": "h1" }, "priority": 1 }
    ]
  }'
# Response (202 Accepted)
{
  "tasks": [
    { "taskId": "tsk_aaaa1111", "state": "queued", "position": 1 },
    { "taskId": "tsk_bbbb2222", "state": "queued", "position": 2 },
    { "taskId": "tsk_cccc3333", "state": "queued", "position": 3 }
  ],
  "submitted": 3
}

Batch Request Fields

Field	Required	Notes
`agentId`	yes	shared across all tasks in the batch
`callbackUrl`	no	webhook URL applied to every task
`tasks`	yes	array of task definitions (1–50)

Each task definition supports the same fields as a single task submit (action, tabId, ref, params, priority, deadline) except agentId and callbackUrl which are inherited from the batch.

Batch Validation

Condition	Response
missing `agentId`	`400 Bad Request`
empty `tasks` array	`400 Bad Request`
more than 50 tasks	`400 Bad Request` with `batch_too_large` code
invalid JSON body	`400 Bad Request`

Partial failure: if some tasks are rejected by admission (queue full), the accepted tasks are still submitted. The response includes each task's status individually.

Config Hot-Reload

ReloadConfig(cfg) updates queue limits, inflight limits, and result TTL at runtime without restarting the scheduler.

Reloadable fields:

Field	What Changes
`maxQueueSize`, `maxPerAgent`	queue admission limits via `SetLimits()`
`maxInflight`, `maxPerAgentFlight`	concurrency limits (protected by `cfgMu`)
`resultTTL`	result store eviction window via `SetTTL()`

Zero values are ignored (the existing setting is preserved).

ConfigWatcher

ConfigWatcher runs a background goroutine that periodically re-reads the config and calls ReloadConfig. Create with:

cw := scheduler.NewConfigWatcher(30*time.Second, loadFn, sched)
cw.Start()
defer cw.Stop()

loadFn is a func() (Config, error) that reads the current config from disk or environment.