trenches

Paused

App Files Files Community

trenches / BACKEND_SUMMARY.md

Codex

sync main snapshot for HF Space

1794757 4 days ago

preview code

raw

history blame contribute delete

12.4 kB

	# Backend Summary

	This is the backend handoff for the frontend team.

	## Plain-English State Model

	There are five different layers of state:

	1. `world.latent_state`
	Backend truth. Rewards and simulation logic use this.

	2. `world.latent_events`
	Canonical hidden event chain. News, actions, asset damage, and oversight now create or update these events.

	3. `world.actor_state`
	Lagged/public summary of the world.

	4. `observations[agent_id]`
	What each entity actually sees. This can be partial, delayed, contradictory, and low-confidence.

	5. `belief_state[agent_id]`
	What each entity currently believes across turns. This is persistent memory, not just the current observation. It now uses doctrine-specific priors, slow false-belief decay, and contradiction-driven revision.

	The frontend should not treat those layers as interchangeable.

	## Real Model Behavior

	Each entity has a `model_bindings[agent_id]` object.

	That tells you:

	- which provider is configured
	- which model is configured
	- whether the binding is ready for inference
	- which tools/actions the entity is allowed to use
	- whether the entity is currently on real provider execution or heuristic fallback

	Current behavior:

	- if a provider binding is ready, the backend tries real provider inference first
	- if that fails or returns an invalid action, the backend falls back explicitly to heuristic policy
	- action metadata records whether the action came from `provider_inference` or `heuristic_fallback`

	Supported provider names now include:

	- `openai`
	- `anthropic`
	- `openrouter`
	- `huggingface`
	- `ollama`
	- `vllm`
	- `custom`

	Hugging Face notes:

	- `huggingface` uses the HF router chat-completions endpoint
	- if `api_key_env` is not set, the backend defaults to `HF_TOKEN`
	- if `TRENCHES_HF_ROUTING_POLICY` is set to `fastest`, `cheapest`, or `preferred`, the backend appends that routing suffix to HF model names that do not already include one
	- the recommended deployment pattern is to store `HF_TOKEN` as a secret, not in repo files

	## Main Endpoints

	Server file:

	- [backend/src/trenches_env/server.py](/Users/alazarmanakelew/IdeaProjects/trenches/backend/src/trenches_env/server.py)

	### Health And Capabilities

	- `GET /healthz`
	Returns `{ "status": "ok" }`

	- `GET /capabilities`
	Returns:
	- session/OpenEnv capability flags
	- CORS settings
	- per-entity `model_bindings`

	Use this once at app startup.

	### Session Lifecycle

	- `POST /sessions`
	Creates a session.

	- `POST /sessions/{session_id}/reset`
	Resets an existing session.

	- `GET /sessions/{session_id}`
	Returns the latest `SessionState`.

	- `POST /sessions/{session_id}/step`
	Advances one turn.

	Request body:
	- `actions: Record<agentId, AgentAction>`
	- `external_signals: ExternalSignal[]`

	Response:
	- `StepSessionResponse`
	- `session`
	- `oversight`
	- `done`

	### Live News And Reaction Timeline

	- `POST /sessions/{session_id}/news`
	Injects public/news signals, lets the backend resolve entity reactions, steps the world, and returns the structured reaction entry for that news event.

	Request body:
	- `signals: ExternalSignal[]`
	- `agent_ids?: string[]`

	Notes:
	- if `agent_ids` is omitted, all entities react
	- if `agent_ids` is provided, only those entities are auto-resolved for that news event
	- this still goes through the same env step path, so it stays aligned with OpenEnv behavior

	Response:
	- `IngestNewsResponse`
	- `session`
	- `oversight`
	- `reaction`
	- `done`

	- `GET /sessions/{session_id}/reactions`
	Returns the rolling `reaction_log`.

	Use these two endpoints for:

	- incoming-news timeline
	- “who reacted to what” UI
	- live world-monitoring panels

	### Provider Diagnostics

	- `GET /sessions/{session_id}/providers/diagnostics`
	Returns per-entity provider runtime health and recent inference telemetry.

	Important fields per entity:

	- `status`
	- `request_count`
	- `success_count`
	- `error_count`
	- `consecutive_failures`
	- `last_latency_ms`
	- `avg_latency_ms`
	- `last_success_at`
	- `last_error_at`
	- `last_error`

	Use this for:

	- provider health badges
	- fallback warnings
	- “model is unhealthy” operator panels
	- debugging why an entity is on heuristic fallback

	### Live Source Controls

	- `POST /sessions/{session_id}/live`
	Enables or disables live mode.

	- `POST /sessions/{session_id}/sources/refresh`
	Forces source refresh and rebuilds observations.

	- `GET /sessions/{session_id}/sources/monitor`
	Returns source-health and delivery status per entity.

	### Scenarios And Benchmarks

	- `GET /scenarios`
	Returns seeded scenarios.

	- `POST /benchmarks/run`
	Runs scenario benchmarks and returns scorecards.

	### OpenEnv

	Legacy tuple-style endpoints:

	- `POST /reset`
	- `POST /step`
	- `GET /state`

	If `openenv-core` is installed, native OpenEnv is mounted at:

	- `/openenv`

	OpenEnv file:

	- [backend/src/trenches_env/openenv_adapter.py](/Users/alazarmanakelew/IdeaProjects/trenches/backend/src/trenches_env/openenv_adapter.py)

	## Main Schemas

	Schema file:

	- [backend/src/trenches_env/models.py](/Users/alazarmanakelew/IdeaProjects/trenches/backend/src/trenches_env/models.py)

	### SessionState

	Main top-level object for the frontend.

	Important fields:

	- `session_id`
	- `world`
	- `observations`
	- `belief_state`
	- `rewards`
	- `model_bindings`
	- `recent_traces`
	- `action_log`
	- `reaction_log`
	- `live`
	- `episode`

	### WorldState

	Important fields:

	- `latent_state`
	- `latent_events`
	- `actor_state`
	- `active_events`
	- `asset_state`
	- `coalition_graph`
	- `risk_scores`
	- `last_actions`

	Important distinction:

	- `latent_events` are canonical hidden events
	- `active_events` are the public-facing projection of those latent events

	### AgentObservation

	Main entity-facing view.

	Important fields:

	- `decision_prompt`
	- `belief_brief`
	- `belief_topics`
	- `available_actions`
	- `available_data_sources`
	- `strategic_state`
	- `strategic_assets`
	- `asset_alerts`
	- `source_packets`
	- `training_source_packets`
	- `live_source_packets`
	- `projection`

	### ObservationProjection

	This explains how messy the entity’s current view is.

	Important fields:

	- `mode`
	- `worldview_reliability`
	- `delayed_source_count`
	- `contested_source_count`
	- `contradiction_packet_count`
	- `contradiction_topics`
	- `obscured_metric_count`
	- `notes`

	Frontend rule:

	Show this clearly. Do not present entity observations as perfect truth.

	### EntityModelBinding

	Per-entity provider/runtime config.

	Important fields:

	- `provider`
	- `model_name`
	- `configured`
	- `ready_for_inference`
	- `decision_mode`
	- `supports_tool_calls`
	- `supports_structured_output`
	- `action_tools`
	- `observation_tools`
	- `notes`

	### ProviderAgentDiagnostics

	Per-entity runtime telemetry for provider-backed execution.

	Important fields:

	- `agent_id`
	- `provider`
	- `model_name`
	- `configured`
	- `ready_for_inference`
	- `decision_mode`
	- `status`
	- `request_count`
	- `success_count`
	- `error_count`
	- `consecutive_failures`
	- `last_latency_ms`
	- `avg_latency_ms`
	- `last_success_at`
	- `last_error_at`
	- `last_error`

	### ActionLogEntry

	Per-action activity feed row.

	Important fields:

	- `turn`
	- `actor`
	- `action_type`
	- `summary`
	- `target`
	- `reward_total`
	- `metadata`

	Use this for the entity activity log.

	### ReactionLogEntry

	Structured “public release -> entity reaction” object.

	Important fields:

	- `event_id`
	- `turn`
	- `source`
	- `latent_event_ids`
	- `signals`
	- `actor_outcomes`
	- `oversight_triggered`
	- `tension_before`
	- `tension_after`
	- `market_stress_after`
	- `oil_pressure_after`

	This is the easiest object for a live news feed.

	### AgentBeliefState

	Persistent per-entity memory.

	Important fields:

	- `agent_id`
	- `dominant_topics`
	- `beliefs`
	- `last_revision_turn`

	### AgentBeliefEntry

	One remembered belief/hypothesis for an entity.

	Important fields:

	- `belief_id`
	- `topic`
	- `summary`
	- `confidence`
	- `status`
	- `source`
	- `suspected_agents`
	- `related_event_ids`
	- `confirmation_count`
	- `contradiction_count`
	- `last_confirmed_turn`
	- `last_updated_turn`

	Belief behavior:

	- entities do not weight all topics equally
	- beliefs decay gradually when no new confirmation arrives
	- contradictory evidence usually downgrades a belief first before fully disconfirming it
	- two entities can see the same event and end up with different confidence because doctrine priors differ

	### Latent Events

	The backend now treats event flow as first-class, not just metric movement.

	Main schema:

	- `LatentEvent`

	Key fields:

	- `event_id`
	- `topic`
	- `status`
	- `severity`
	- `visibility`
	- `reliability`
	- `origin`
	- `affected_agents`
	- `affected_assets`
	- `started_at_turn`
	- `last_updated_turn`
	- `decay_rate`
	- `linked_event_ids`
	- `narratives`

	What this means:

	- scenarios can seed hidden events
	- incoming news creates or updates hidden events
	- entity actions create hidden events
	- linked spillover events can be spawned
	- public event feeds are projected from latent events
	- source contradictions now key off latent events, not only metric heuristics

	### ReactionActorOutcome

	One entity’s response to one news event.

	Important fields:

	- `agent_id`
	- `action`
	- `reward_total`
	- `decision_mode`

	## What Is Good To Go

	Backend pieces that are ready for frontend integration:

	- session lifecycle
	- live source monitoring
	- latent truth vs public state split
	- latent event engine and event-driven public projection
	- persistent belief state per entity
	- doctrine-specific belief revision and false-belief persistence
	- contradiction-aware observation projection
	- per-entity rewards
	- per-entity action logging
	- structured reaction logging for public/news events
	- seeded scenarios
	- benchmark runs
	- provider bindings
	- real provider execution with explicit fallback
	- provider runtime diagnostics
	- OpenEnv-compatible environment flow

	## What Is Still Left

	### Backend

	1. Persist replay history.
	`recent_traces`, `action_log`, `reaction_log`, and latent event evolution are still rolling in-memory state, not durable history.

	2. Deepen the latent event graph.
	The event engine now exists, but it can still be improved with stronger causal chains, event merging, event resolution rules, and richer cross-front propagation.

	3. Add event-delta summaries.
	A compact backend-generated turn delta would make replay/debug views much easier to build.

	4. Keep hardening provider execution.
	Retries and diagnostics now exist. The next step is richer classification for rate limits, timeout classes, and provider-specific retry traces.

	5. Add a durable event archive or export path.
	There is still no persistent event timeline outside in-memory session state.

	### Frontend

	1. Build the app shell around:
	- `/capabilities`
	- `/scenarios`
	- `/sessions`
	- `/sessions/{id}`
	- `/sessions/{id}/step`
	- `/sessions/{id}/news`
	- `/sessions/{id}/reactions`
	- `/sessions/{id}/providers/diagnostics`
	- `/sessions/{id}/live`
	- `/sessions/{id}/sources/monitor`

	2. Add entity cards that show:
	- projected state
	- persistent belief topics / belief memory
	- reward total
	- provider readiness
	- provider health/latency
	- latest action
	- uncertainty/projection info

	3. Add a live news/reaction timeline.
	Use `/sessions/{id}/news` for ingestion and `reaction_log` or `/sessions/{id}/reactions` for history.

	4. Add latent event visibility to operator surfaces.
	Show:
	- key latent event topics
	- event severity
	- event visibility
	- linked spillovers

	5. Add a source-health panel.
	Use `/sessions/{id}/sources/monitor`.

	6. Add replay panels.
	Use `recent_traces`, `action_log`, `reaction_log`, and `world.latent_events`.

	7. Make uncertainty visible.
	Show reliability, contradiction topics, delayed sources, and contested-source counts.

	## Rule Of Thumb For Frontend

	If the UI means:

	- “what the entity believes” -> use `session.observations[agent_id]`
	- “what the entity currently remembers/believes across turns” -> use `session.belief_state[agent_id]`
	- “what the operator/debugger sees” -> use `session.world`
	- “what hidden developments are driving the sim” -> use `session.world.latent_events`
	- “what the backend can execute” -> use `session.model_bindings`
	- “what just happened on a turn” -> use `session.action_log` and `session.recent_traces`
	- “what public news triggered reactions” -> use `session.reaction_log`