trenches

Paused

App Files Files Community

trenches / BACKEND_SUMMARY.md

Codex

sync main snapshot for HF Space

1794757 4 days ago

preview code

raw

history blame contribute delete

12.4 kB

Backend Summary

This is the backend handoff for the frontend team.

Plain-English State Model

There are five different layers of state:

world.latent_state Backend truth. Rewards and simulation logic use this.
world.latent_events Canonical hidden event chain. News, actions, asset damage, and oversight now create or update these events.
world.actor_state Lagged/public summary of the world.
observations[agent_id] What each entity actually sees. This can be partial, delayed, contradictory, and low-confidence.
belief_state[agent_id] What each entity currently believes across turns. This is persistent memory, not just the current observation. It now uses doctrine-specific priors, slow false-belief decay, and contradiction-driven revision.

The frontend should not treat those layers as interchangeable.

Real Model Behavior

Each entity has a model_bindings[agent_id] object.

That tells you:

which provider is configured
which model is configured
whether the binding is ready for inference
which tools/actions the entity is allowed to use
whether the entity is currently on real provider execution or heuristic fallback

Current behavior:

if a provider binding is ready, the backend tries real provider inference first
if that fails or returns an invalid action, the backend falls back explicitly to heuristic policy
action metadata records whether the action came from provider_inference or heuristic_fallback

Supported provider names now include:

openai
anthropic
openrouter
huggingface
ollama
vllm
custom

Hugging Face notes:

huggingface uses the HF router chat-completions endpoint
if api_key_env is not set, the backend defaults to HF_TOKEN
if TRENCHES_HF_ROUTING_POLICY is set to fastest, cheapest, or preferred, the backend appends that routing suffix to HF model names that do not already include one
the recommended deployment pattern is to store HF_TOKEN as a secret, not in repo files

Main Endpoints

Server file:

backend/src/trenches_env/server.py

Health And Capabilities

GET /healthz Returns { "status": "ok" }
GET /capabilities Returns:
- session/OpenEnv capability flags
- CORS settings
- per-entity model_bindings

Use this once at app startup.

Session Lifecycle

POST /sessions Creates a session.
POST /sessions/{session_id}/reset Resets an existing session.
GET /sessions/{session_id} Returns the latest SessionState.
POST /sessions/{session_id}/step Advances one turn.

Request body:

actions: Record<agentId, AgentAction>
external_signals: ExternalSignal[]

Response:

StepSessionResponse
- session
- oversight
- done

Live News And Reaction Timeline

POST /sessions/{session_id}/news Injects public/news signals, lets the backend resolve entity reactions, steps the world, and returns the structured reaction entry for that news event.

Request body:

signals: ExternalSignal[]
agent_ids?: string[]

Notes:

if agent_ids is omitted, all entities react
if agent_ids is provided, only those entities are auto-resolved for that news event
this still goes through the same env step path, so it stays aligned with OpenEnv behavior

Response:

IngestNewsResponse
- session
- oversight
- reaction
- done
GET /sessions/{session_id}/reactions Returns the rolling reaction_log.

Use these two endpoints for:

incoming-news timeline
“who reacted to what” UI
live world-monitoring panels

Provider Diagnostics

GET /sessions/{session_id}/providers/diagnostics Returns per-entity provider runtime health and recent inference telemetry.

Important fields per entity:

status
request_count
success_count
error_count
consecutive_failures
last_latency_ms
avg_latency_ms
last_success_at
last_error_at
last_error

Use this for:

provider health badges
fallback warnings
“model is unhealthy” operator panels
debugging why an entity is on heuristic fallback

Live Source Controls

POST /sessions/{session_id}/live Enables or disables live mode.
POST /sessions/{session_id}/sources/refresh Forces source refresh and rebuilds observations.
GET /sessions/{session_id}/sources/monitor Returns source-health and delivery status per entity.

Scenarios And Benchmarks

GET /scenarios Returns seeded scenarios.
POST /benchmarks/run Runs scenario benchmarks and returns scorecards.

OpenEnv

Legacy tuple-style endpoints:

POST /reset
POST /step
GET /state

If openenv-core is installed, native OpenEnv is mounted at:

/openenv

OpenEnv file:

backend/src/trenches_env/openenv_adapter.py

Main Schemas

Schema file:

backend/src/trenches_env/models.py

SessionState

Main top-level object for the frontend.

Important fields:

session_id
world
observations
belief_state
rewards
model_bindings
recent_traces
action_log
reaction_log
live
episode

WorldState

Important fields:

latent_state
latent_events
actor_state
active_events
asset_state
coalition_graph
risk_scores
last_actions

Important distinction:

latent_events are canonical hidden events
active_events are the public-facing projection of those latent events

AgentObservation

Main entity-facing view.

Important fields:

decision_prompt
belief_brief
belief_topics
available_actions
available_data_sources
strategic_state
strategic_assets
asset_alerts
source_packets
training_source_packets
live_source_packets
projection

ObservationProjection

This explains how messy the entity’s current view is.

Important fields:

mode
worldview_reliability
delayed_source_count
contested_source_count
contradiction_packet_count
contradiction_topics
obscured_metric_count
notes

Frontend rule:

Show this clearly. Do not present entity observations as perfect truth.

EntityModelBinding

Per-entity provider/runtime config.

Important fields:

provider
model_name
configured
ready_for_inference
decision_mode
supports_tool_calls
supports_structured_output
action_tools
observation_tools
notes

ProviderAgentDiagnostics

Per-entity runtime telemetry for provider-backed execution.

Important fields:

agent_id
provider
model_name
configured
ready_for_inference
decision_mode
status
request_count
success_count
error_count
consecutive_failures
last_latency_ms
avg_latency_ms
last_success_at
last_error_at
last_error

ActionLogEntry

Per-action activity feed row.

Important fields:

turn
actor
action_type
summary
target
reward_total
metadata

Use this for the entity activity log.

ReactionLogEntry

Structured “public release -> entity reaction” object.

Important fields:

event_id
turn
source
latent_event_ids
signals
actor_outcomes
oversight_triggered
tension_before
tension_after
market_stress_after
oil_pressure_after

This is the easiest object for a live news feed.

AgentBeliefState

Persistent per-entity memory.

Important fields:

agent_id
dominant_topics
beliefs
last_revision_turn

AgentBeliefEntry

One remembered belief/hypothesis for an entity.

Important fields:

belief_id
topic
summary
confidence
status
source
suspected_agents
related_event_ids
confirmation_count
contradiction_count
last_confirmed_turn
last_updated_turn

Belief behavior:

entities do not weight all topics equally
beliefs decay gradually when no new confirmation arrives
contradictory evidence usually downgrades a belief first before fully disconfirming it
two entities can see the same event and end up with different confidence because doctrine priors differ

Latent Events

The backend now treats event flow as first-class, not just metric movement.

Main schema:

LatentEvent

Key fields:

event_id
topic
status
severity
visibility
reliability
origin
affected_agents
affected_assets
started_at_turn
last_updated_turn
decay_rate
linked_event_ids
narratives

What this means:

scenarios can seed hidden events
incoming news creates or updates hidden events
entity actions create hidden events
linked spillover events can be spawned
public event feeds are projected from latent events
source contradictions now key off latent events, not only metric heuristics

ReactionActorOutcome

One entity’s response to one news event.

Important fields:

agent_id
action
reward_total
decision_mode

What Is Good To Go

Backend pieces that are ready for frontend integration:

session lifecycle
live source monitoring
latent truth vs public state split
latent event engine and event-driven public projection
persistent belief state per entity
doctrine-specific belief revision and false-belief persistence
contradiction-aware observation projection
per-entity rewards
per-entity action logging
structured reaction logging for public/news events
seeded scenarios
benchmark runs
provider bindings
real provider execution with explicit fallback
provider runtime diagnostics
OpenEnv-compatible environment flow

What Is Still Left

Backend

Persist replay history. recent_traces, action_log, reaction_log, and latent event evolution are still rolling in-memory state, not durable history.
Deepen the latent event graph. The event engine now exists, but it can still be improved with stronger causal chains, event merging, event resolution rules, and richer cross-front propagation.
Add event-delta summaries. A compact backend-generated turn delta would make replay/debug views much easier to build.
Keep hardening provider execution. Retries and diagnostics now exist. The next step is richer classification for rate limits, timeout classes, and provider-specific retry traces.
Add a durable event archive or export path. There is still no persistent event timeline outside in-memory session state.

Frontend

Build the app shell around:
- /capabilities
- /scenarios
- /sessions
- /sessions/{id}
- /sessions/{id}/step
- /sessions/{id}/news
- /sessions/{id}/reactions
- /sessions/{id}/providers/diagnostics
- /sessions/{id}/live
- /sessions/{id}/sources/monitor
Add entity cards that show:
- projected state
- persistent belief topics / belief memory
- reward total
- provider readiness
- provider health/latency
- latest action
- uncertainty/projection info
Add a live news/reaction timeline. Use /sessions/{id}/news for ingestion and reaction_log or /sessions/{id}/reactions for history.
Add latent event visibility to operator surfaces. Show:
- key latent event topics
- event severity
- event visibility
- linked spillovers
Add a source-health panel. Use /sessions/{id}/sources/monitor.
Add replay panels. Use recent_traces, action_log, reaction_log, and world.latent_events.
Make uncertainty visible. Show reliability, contradiction topics, delayed sources, and contested-source counts.

Rule Of Thumb For Frontend

If the UI means:

“what the entity believes” -> use session.observations[agent_id]
“what the entity currently remembers/believes across turns” -> use session.belief_state[agent_id]
“what the operator/debugger sees” -> use session.world
“what hidden developments are driving the sim” -> use session.world.latent_events
“what the backend can execute” -> use session.model_bindings
“what just happened on a turn” -> use session.action_log and session.recent_traces
“what public news triggered reactions” -> use session.reaction_log