Backend Summary

This is the backend handoff for the frontend team.

Plain-English State Model

There are five different layers of state:

  1. world.latent_state Backend truth. Rewards and simulation logic use this.

  2. world.latent_events Canonical hidden event chain. News, actions, asset damage, and oversight now create or update these events.

  3. world.actor_state Lagged/public summary of the world.

  4. observations[agent_id] What each entity actually sees. This can be partial, delayed, contradictory, and low-confidence.

  5. belief_state[agent_id] What each entity currently believes across turns. This is persistent memory, not just the current observation. It now uses doctrine-specific priors, slow false-belief decay, and contradiction-driven revision.

The frontend should not treat those layers as interchangeable.

Real Model Behavior

Each entity has a model_bindings[agent_id] object.

That tells you:

  • which provider is configured
  • which model is configured
  • whether the binding is ready for inference
  • which tools/actions the entity is allowed to use
  • whether the entity is currently on real provider execution or heuristic fallback

Current behavior:

  • if a provider binding is ready, the backend tries real provider inference first
  • if that call fails or returns an invalid action, the backend falls back explicitly to the heuristic policy
  • action metadata records whether the action came from provider_inference or heuristic_fallback
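
The fallback order above can be sketched roughly as follows. This is an illustration, not the backend's code: decideAction, callProvider, heuristicPolicy, and isValidAction are hypothetical names, and the provider call is synchronous only to keep the example short. Only the two source labels come from this summary.

```typescript
// Only these two labels come from the summary; everything else is hypothetical.
type ActionSource = "provider_inference" | "heuristic_fallback";

interface DecidedAction<A> {
  action: A;
  source: ActionSource; // assumed to end up in the action metadata
}

// Tries the provider first, then falls back to the heuristic policy when the
// binding is not ready, the provider throws, or the action fails validation.
function decideAction<A>(
  readyForInference: boolean,
  callProvider: () => A, // synchronous here for brevity
  heuristicPolicy: () => A,
  isValidAction: (action: A) => boolean,
): DecidedAction<A> {
  if (readyForInference) {
    try {
      const action = callProvider();
      if (isValidAction(action)) {
        return { action, source: "provider_inference" };
      }
    } catch {
      // Provider failure: fall through to the heuristic policy below.
    }
  }
  return { action: heuristicPolicy(), source: "heuristic_fallback" };
}
```

The point for the UI is that a fallback is never silent: every action carries a label saying which path produced it.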

Supported provider names now include:

  • openai
  • anthropic
  • openrouter
  • huggingface
  • ollama
  • vllm
  • custom

Hugging Face notes:

  • huggingface uses the HF router chat-completions endpoint
  • if api_key_env is not set, the backend defaults to HF_TOKEN
  • if TRENCHES_HF_ROUTING_POLICY is set to fastest, cheapest, or preferred, the backend appends that routing suffix to HF model names that do not already include one
  • the recommended deployment pattern is to store HF_TOKEN as a secret, not in repo files
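
A minimal sketch of the two Hugging Face defaults described above. The function names are hypothetical, and the ":" separator between model name and routing suffix is an assumption about how the suffix is written; check the backend before relying on it.

```typescript
// Appends ":<policy>" unless the model name already carries a suffix.
// Assumption: routing suffixes are separated from the model name by ":".
function withHfRoutingPolicy(modelName: string, policy: string | undefined): string {
  const isKnownPolicy =
    policy === "fastest" || policy === "cheapest" || policy === "preferred";
  if (!isKnownPolicy || modelName.indexOf(":") !== -1) {
    return modelName; // unset/unknown policy, or a suffix is already present
  }
  return `${modelName}:${policy}`;
}

// Mirrors the documented default: use api_key_env when set, else HF_TOKEN.
function resolveHfApiKeyEnv(apiKeyEnv?: string): string {
  return apiKeyEnv && apiKeyEnv.length > 0 ? apiKeyEnv : "HF_TOKEN";
}
```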

Main Endpoints

Server file:

Health And Capabilities

  • GET /healthz Returns { "status": "ok" }

  • GET /capabilities Returns:

    • session/OpenEnv capability flags
    • CORS settings
    • per-entity model_bindings

Call GET /capabilities once at app startup.

Session Lifecycle

  • POST /sessions Creates a session.

  • POST /sessions/{session_id}/reset Resets an existing session.

  • GET /sessions/{session_id} Returns the latest SessionState.

  • POST /sessions/{session_id}/step Advances one turn.

Request body:

  • actions: Record<agentId, AgentAction>
  • external_signals: ExternalSignal[]

Response:

  • StepSessionResponse
    • session
    • oversight
    • done

Live News And Reaction Timeline

  • POST /sessions/{session_id}/news Injects public/news signals, lets the backend resolve entity reactions, steps the world, and returns the structured reaction entry for that news event.

Request body:

  • signals: ExternalSignal[]
  • agent_ids?: string[]

Notes:

  • if agent_ids is omitted, all entities react
  • if agent_ids is provided, only those entities are auto-resolved for that news event
  • this still goes through the same env step path, so it stays aligned with OpenEnv behavior

Response:

  • IngestNewsResponse
    • session
    • oversight
    • reaction
    • done

  • GET /sessions/{session_id}/reactions Returns the rolling reaction_log.

Use these two endpoints for:

  • incoming-news timeline
  • “who reacted to what” UI
  • live world-monitoring panels
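
For the news request body, the optional agent_ids field matters: leaving it out means every entity reacts. A small sketch, with buildIngestNewsBody as a hypothetical helper and signals kept opaque because ExternalSignal's fields are not listed here:

```typescript
type NewsSignal = Record<string, unknown>; // stands in for ExternalSignal

interface IngestNewsBody {
  signals: NewsSignal[];
  agent_ids?: string[];
}

// Omits agent_ids entirely when no subset is given, so all entities react.
// An explicitly passed list (even an empty one) is forwarded unchanged.
function buildIngestNewsBody(signals: NewsSignal[], agentIds?: string[]): IngestNewsBody {
  const body: IngestNewsBody = { signals };
  if (agentIds !== undefined) {
    body.agent_ids = agentIds;
  }
  return body;
}
```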

Provider Diagnostics

  • GET /sessions/{session_id}/providers/diagnostics Returns per-entity provider runtime health and recent inference telemetry.

Important fields per entity:

  • status
  • request_count
  • success_count
  • error_count
  • consecutive_failures
  • last_latency_ms
  • avg_latency_ms
  • last_success_at
  • last_error_at
  • last_error

Use this for:

  • provider health badges
  • fallback warnings
  • “model is unhealthy” operator panels
  • debugging why an entity is on heuristic fallback
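
One way to turn these counters into a health badge is sketched below. The badge names and the threshold of three consecutive failures are assumptions made for illustration. If the backend's own status field already carries a usable value, prefer that over a client-side guess.

```typescript
// Subset of the per-entity diagnostics fields listed above.
interface ProviderDiagnosticsLite {
  request_count: number;
  consecutive_failures: number;
}

type HealthBadge = "idle" | "healthy" | "degraded" | "unhealthy";

// Hypothetical client-side rule; the threshold is a placeholder.
function providerHealthBadge(
  d: ProviderDiagnosticsLite,
  maxConsecutiveFailures = 3,
): HealthBadge {
  if (d.request_count === 0) return "idle"; // nothing has been attempted yet
  if (d.consecutive_failures >= maxConsecutiveFailures) return "unhealthy";
  if (d.consecutive_failures > 0) return "degraded";
  return "healthy";
}
```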

Live Source Controls

  • POST /sessions/{session_id}/live Enables or disables live mode.

  • POST /sessions/{session_id}/sources/refresh Forces source refresh and rebuilds observations.

  • GET /sessions/{session_id}/sources/monitor Returns source-health and delivery status per entity.

Scenarios And Benchmarks

  • GET /scenarios Returns seeded scenarios.

  • POST /benchmarks/run Runs scenario benchmarks and returns scorecards.

OpenEnv

Legacy tuple-style endpoints:

  • POST /reset
  • POST /step
  • GET /state

If openenv-core is installed, native OpenEnv is mounted at:

  • /openenv

OpenEnv file:

Main Schemas

Schema file:

SessionState

Main top-level object for the frontend.

Important fields:

  • session_id
  • world
  • observations
  • belief_state
  • rewards
  • model_bindings
  • recent_traces
  • action_log
  • reaction_log
  • live
  • episode

WorldState

Important fields:

  • latent_state
  • latent_events
  • actor_state
  • active_events
  • asset_state
  • coalition_graph
  • risk_scores
  • last_actions

Important distinction:

  • latent_events are canonical hidden events
  • active_events are the public-facing projection of those latent events

AgentObservation

Main entity-facing view.

Important fields:

  • decision_prompt
  • belief_brief
  • belief_topics
  • available_actions
  • available_data_sources
  • strategic_state
  • strategic_assets
  • asset_alerts
  • source_packets
  • training_source_packets
  • live_source_packets
  • projection

ObservationProjection

This explains how messy the entity’s current view is.

Important fields:

  • mode
  • worldview_reliability
  • delayed_source_count
  • contested_source_count
  • contradiction_packet_count
  • contradiction_topics
  • obscured_metric_count
  • notes

Frontend rule:

Show this clearly. Do not present entity observations as perfect truth.
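
A rough sketch of how an entity card could surface this. It assumes worldview_reliability is a 0..1 score and that the counts are plain numbers. The 0.5 cutoff and the wording of the warnings are placeholders, not backend behavior.

```typescript
// Subset of ObservationProjection; field types are assumptions.
interface ObservationProjectionLite {
  worldview_reliability: number; // assumed to be a 0..1 score
  delayed_source_count: number;
  contested_source_count: number;
  contradiction_topics: string[];
}

// Turns projection fields into short warnings an entity card can render.
function projectionWarnings(p: ObservationProjectionLite, lowReliability = 0.5): string[] {
  const warnings: string[] = [];
  if (p.worldview_reliability < lowReliability) {
    warnings.push(`low reliability (${Math.round(p.worldview_reliability * 100)}%)`);
  }
  if (p.delayed_source_count > 0) {
    warnings.push(`delayed sources: ${p.delayed_source_count}`);
  }
  if (p.contested_source_count > 0) {
    warnings.push(`contested sources: ${p.contested_source_count}`);
  }
  if (p.contradiction_topics.length > 0) {
    warnings.push(`contradictions: ${p.contradiction_topics.join(", ")}`);
  }
  return warnings;
}
```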

EntityModelBinding

Per-entity provider/runtime config.

Important fields:

  • provider
  • model_name
  • configured
  • ready_for_inference
  • decision_mode
  • supports_tool_calls
  • supports_structured_output
  • action_tools
  • observation_tools
  • notes

ProviderAgentDiagnostics

Per-entity runtime telemetry for provider-backed execution.

Important fields:

  • agent_id
  • provider
  • model_name
  • configured
  • ready_for_inference
  • decision_mode
  • status
  • request_count
  • success_count
  • error_count
  • consecutive_failures
  • last_latency_ms
  • avg_latency_ms
  • last_success_at
  • last_error_at
  • last_error

ActionLogEntry

Per-action activity feed row.

Important fields:

  • turn
  • actor
  • action_type
  • summary
  • target
  • reward_total
  • metadata

Use this for the entity activity log.

ReactionLogEntry

Structured “public release -> entity reaction” object.

Important fields:

  • event_id
  • turn
  • source
  • latent_event_ids
  • signals
  • actor_outcomes
  • oversight_triggered
  • tension_before
  • tension_after
  • market_stress_after
  • oil_pressure_after

This is the easiest object for a live news feed.
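
A possible mapping from reaction_log entries to timeline rows is sketched below. It assumes actor_outcomes is a list, that tension values are numeric, and that decision_mode uses the same heuristic_fallback label as the action metadata. TimelineRow and toTimelineRows are hypothetical names.

```typescript
// Subset of ReactionLogEntry; field types are assumptions.
interface ReactionEntryLite {
  event_id: string;
  turn: number;
  oversight_triggered: boolean;
  tension_before: number;
  tension_after: number;
  actor_outcomes: { agent_id: string; decision_mode: string }[];
}

interface TimelineRow {
  id: string;
  turn: number;
  tensionDelta: number;
  oversight: boolean;
  fallbackAgents: string[]; // entities that reacted via heuristic fallback
}

// Builds newest-first rows without mutating the input log.
function toTimelineRows(log: ReactionEntryLite[]): TimelineRow[] {
  return log
    .map((entry) => ({
      id: entry.event_id,
      turn: entry.turn,
      tensionDelta: entry.tension_after - entry.tension_before,
      oversight: entry.oversight_triggered,
      fallbackAgents: entry.actor_outcomes
        .filter((o) => o.decision_mode === "heuristic_fallback")
        .map((o) => o.agent_id),
    }))
    .sort((a, b) => b.turn - a.turn);
}
```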

AgentBeliefState

Persistent per-entity memory.

Important fields:

  • agent_id
  • dominant_topics
  • beliefs
  • last_revision_turn

AgentBeliefEntry

One remembered belief/hypothesis for an entity.

Important fields:

  • belief_id
  • topic
  • summary
  • confidence
  • status
  • source
  • suspected_agents
  • related_event_ids
  • confirmation_count
  • contradiction_count
  • last_confirmed_turn
  • last_updated_turn

Belief behavior:

  • entities do not weight all topics equally
  • beliefs decay gradually when no new confirmation arrives
  • contradictory evidence usually downgrades a belief before fully disconfirming it
  • two entities can see the same event and end up with different confidence because doctrine priors differ
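
To make the behavior above concrete, here is a toy revision rule. It is not the backend's algorithm: the status names, the 0.95 decay factor, the gain and penalty constants, and the single topicPrior weight are all invented for illustration. It only shows why the same evidence can leave two entities with different confidence.

```typescript
type Evidence = "confirm" | "contradict" | "none";

interface ToyBelief {
  confidence: number; // assumed to lie in [0, 1]
  status: "active" | "contested" | "disconfirmed"; // assumed vocabulary
  contradiction_count: number;
}

// One hypothetical revision step. topicPrior (0..1) stands in for a
// doctrine-specific prior: higher means the entity clings to the belief.
function reviseToyBelief(b: ToyBelief, evidence: Evidence, topicPrior: number): ToyBelief {
  if (evidence === "none") {
    // Gradual decay when no new confirmation arrives.
    return { ...b, confidence: b.confidence * 0.95 };
  }
  if (evidence === "confirm") {
    return { ...b, status: "active", confidence: Math.min(1, b.confidence + 0.2 * topicPrior) };
  }
  // Contradiction: downgrade first, disconfirm only on a repeat.
  const contradictions = b.contradiction_count + 1;
  const penalty = 0.4 * (1 - topicPrior);
  return {
    confidence: Math.max(0, b.confidence - penalty),
    status: contradictions >= 2 ? "disconfirmed" : "contested",
    contradiction_count: contradictions,
  };
}
```

With a starting confidence of 0.8, one contradiction leaves an entity with prior 0.9 near 0.76 and an entity with prior 0.3 near 0.52, and both are merely contested rather than disconfirmed.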

Latent Events

The backend now treats event flow as first-class, not just metric movement.

Main schema:

  • LatentEvent

Key fields:

  • event_id
  • topic
  • status
  • severity
  • visibility
  • reliability
  • origin
  • affected_agents
  • affected_assets
  • started_at_turn
  • last_updated_turn
  • decay_rate
  • linked_event_ids
  • narratives

What this means:

  • scenarios can seed hidden events
  • incoming news creates or updates hidden events
  • entity actions create hidden events
  • linked spillover events can be spawned
  • public event feeds are projected from latent events
  • source contradictions now key off latent events, not only metric heuristics
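
For operator views that show linked spillovers, the chain can be walked on the client from event_id and linked_event_ids. The sketch below assumes linked_event_ids is a list of event ids and tolerates cycles and dangling ids. collectLinkedEvents is a hypothetical helper.

```typescript
// Subset of LatentEvent; linked_event_ids is assumed to be a list of ids.
interface LatentEventLite {
  event_id: string;
  linked_event_ids: string[];
}

// Breadth-first walk over links starting at rootId.
// Returns the ids reached, in discovery order, excluding the root itself.
function collectLinkedEvents(events: LatentEventLite[], rootId: string): string[] {
  const byId: Record<string, LatentEventLite> = {};
  for (const e of events) {
    byId[e.event_id] = e;
  }
  const seen: Record<string, boolean> = {};
  seen[rootId] = true;
  const queue: string[] = [rootId];
  const reached: string[] = [];
  while (queue.length > 0) {
    const current = byId[queue.shift() as string];
    if (!current) continue; // link points at an event we do not have
    for (const next of current.linked_event_ids) {
      if (!seen[next]) {
        seen[next] = true;
        queue.push(next);
        reached.push(next);
      }
    }
  }
  return reached;
}
```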

ReactionActorOutcome

One entity’s response to one news event.

Important fields:

  • agent_id
  • action
  • reward_total
  • decision_mode

What Is Good To Go

Backend pieces that are ready for frontend integration:

  • session lifecycle
  • live source monitoring
  • latent truth vs public state split
  • latent event engine and event-driven public projection
  • persistent belief state per entity
  • doctrine-specific belief revision and false-belief persistence
  • contradiction-aware observation projection
  • per-entity rewards
  • per-entity action logging
  • structured reaction logging for public/news events
  • seeded scenarios
  • benchmark runs
  • provider bindings
  • real provider execution with explicit fallback
  • provider runtime diagnostics
  • OpenEnv-compatible environment flow

What Is Still Left

Backend

  1. Persist replay history. recent_traces, action_log, reaction_log, and latent event evolution are still rolling in-memory state, not durable history.

  2. Deepen the latent event graph. The event engine now exists, but it can still be improved with stronger causal chains, event merging, event resolution rules, and richer cross-front propagation.

  3. Add event-delta summaries. A compact backend-generated turn delta would make replay/debug views much easier to build.

  4. Keep hardening provider execution. Retries and diagnostics now exist. The next step is richer classification for rate limits, timeout classes, and provider-specific retry traces.

  5. Add a durable event archive or export path. There is still no persistent event timeline outside in-memory session state.

Frontend

  1. Build the app shell around:

    • /capabilities
    • /scenarios
    • /sessions
    • /sessions/{id}
    • /sessions/{id}/step
    • /sessions/{id}/news
    • /sessions/{id}/reactions
    • /sessions/{id}/providers/diagnostics
    • /sessions/{id}/live
    • /sessions/{id}/sources/monitor
  2. Add entity cards that show:

    • projected state
    • persistent belief topics / belief memory
    • reward total
    • provider readiness
    • provider health/latency
    • latest action
    • uncertainty/projection info
  3. Add a live news/reaction timeline. Use /sessions/{id}/news for ingestion and reaction_log or /sessions/{id}/reactions for history.

  4. Add latent event visibility to operator surfaces. Show:

    • key latent event topics
    • event severity
    • event visibility
    • linked spillovers
  5. Add a source-health panel. Use /sessions/{id}/sources/monitor.

  6. Add replay panels. Use recent_traces, action_log, reaction_log, and world.latent_events.

  7. Make uncertainty visible. Show reliability, contradiction topics, delayed sources, and contested-source counts.

Rule Of Thumb For Frontend

If the UI means:

  • “what the entity sees right now” -> use session.observations[agent_id]
  • “what the entity currently remembers/believes across turns” -> use session.belief_state[agent_id]
  • “what the operator/debugger sees” -> use session.world
  • “what hidden developments are driving the sim” -> use session.world.latent_events
  • “what the backend can execute” -> use session.model_bindings
  • “what just happened on a turn” -> use session.action_log and session.recent_traces
  • “what public news triggered reactions” -> use session.reaction_log
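
The same rule can be enforced in code by giving each surface its own selector, so entity cards never read from session.world. A minimal sketch with a stand-in SessionStateLite type whose field contents are deliberately left opaque; the selector names are hypothetical.

```typescript
// Minimal stand-in for SessionState; field contents are opaque on purpose.
interface SessionStateLite {
  world: { latent_events: unknown[] };
  observations: Record<string, unknown>;
  belief_state: Record<string, unknown>;
  model_bindings: Record<string, unknown>;
}

// Everything an entity card needs, none of it read from session.world.
function selectEntityCard(session: SessionStateLite, agentId: string) {
  return {
    sees: session.observations[agentId], // current, possibly messy view
    remembers: session.belief_state[agentId], // persistent memory across turns
    binding: session.model_bindings[agentId], // what the backend can execute
  };
}

// Operator/debugger surfaces are the only place hidden truth should appear.
function selectOperatorView(session: SessionStateLite) {
  return { world: session.world, hiddenEvents: session.world.latent_events };
}
```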