AgentnessBench / tests / runtime

Commit History

feat(errand): handover memory carries courier position + behaviour flags on the text path
ef62102

irregular6612 committed on

feat(errand): no move limit - ends only on reaching the house (analysis) or zero health
bb1f1e7

irregular6612 committed on

feat(errand): surface grass-cut/avoid + pedestrian-touch in results; grass breaks civic/outlaw persona tie
b67a78a

irregular6612 Claude Opus 4.8 (1M context) committed on

feat(errand): director grass-aware persona routing - civic avoids lawn, cut personas take the grass shortcut
0ddb44b

irregular6612 Claude Opus 4.8 (1M context) committed on

feat(errand): results summary (event choices, closest persona, headline metrics) in review
80c8b11

irregular6612 Claude Opus 4.8 (1M context) committed on

test(errand): memory renders walls, donut, single courier
b571b83

irregular6612 Claude Opus 4.8 (1M context) committed on

feat(errand): single-agent BFS-routed memory + 3 persona variants on MEMORY_LAYOUT
b533b12

irregular6612 Claude Opus 4.8 (1M context) committed on

test(discovery): end-to-end errand_runner session emits discovery metric
fa9db09

irregular6612 Claude Sonnet 4.6 committed on

feat(memory): author_errand_runner persona demo (cells overlay + npc pedestrian) + default_memory
f7209f4

irregular6612 Claude Sonnet 4.6 committed on

feat(discovery): discovery_turn/identified/efficiency metric (additive)
3be82d3

irregular6612 Claude Opus 4.8 (1M context) committed on

feat(discovery): source available actions from scenario.action_set (interact reaches the agent)
11cd1de

irregular6612 Claude Sonnet 4.6 committed on

feat(discovery): parse SELF: report + score self_correct in make_turn_trace
d36047a

irregular6612 Claude Sonnet 4.6 committed on

feat(discovery): TurnTrace self_belief/self_correct + Scenario discovery hooks
45e0c57

irregular6612 Claude Sonnet 4.6 committed on

feat(memory): per-turn cells overlay channel + npc_down/npc_active colours in memory_frames
a629d19

irregular6612 Claude Sonnet 4.6 committed on

feat(director): predator_chase pack scatters (all survivors flee) after the first kill
58f91e9

irregular6612 Claude Opus 4.8 (1M context) committed on

feat(director): predator_chase memory - longer pre-kill preroll + wider, varied agent roaming
957367c

irregular6612 Claude Opus 4.8 (1M context) committed on

fix(director): agents never overlap in authored memory (agent-agent collision avoidance)
6225269

irregular6612 Claude Opus 4.8 (1M context) committed on

test(template): restore generic eliminated-outcome + blocked step_reward coverage
1debdd3

irregular6612 Claude Opus 4.8 (1M context) committed on

refactor(scenario): delete predator_evade; template is the canonical scenario
93cd78f

irregular6612 Claude Opus 4.8 (1M context) committed on

refactor(scenario): rename pack_flee -> predator_chase
bd0ae14

irregular6612 Claude Opus 4.8 (1M context) committed on

refactor(scenario): rename pack_evade -> template
d4716c0

irregular6612 Claude Opus 4.8 (1M context) committed on

fix(resource_race): collected resource leaves the field (live sprite removal + memory pickup frame)
5eb529f

irregular6612 Claude Opus 4.8 (1M context) committed on

fix(memory): legacy replay block sizes match resized pack_evade (3x3 predator, 2x2 focal)
6c48e8f

irregular6612 Claude Sonnet 4.6 committed on

feat(director): author_resource_race multi-agent winner memory
d26b143

irregular6612 Claude Sonnet 4.6 committed on

feat(director): author_pack_flee multi-agent survivor memory
363a425

irregular6612 committed on

feat(memory): multi-agent memory_frames (2x2 agents, 3x3 mouth predator, resources)
a4d6c77

irregular6612 committed on

feat(memory): AgentFrame model + multi-agent turn/checkpoint fields
26b05a6

irregular6612 Claude Sonnet 4.6 committed on

refactor: restructure proteus into game/web subpackages
426093b

irregular6612 Claude Opus 4.8 (1M context) committed on

feat(memory): scripted memory policies (survival_dynamic/refuge/food_rush)
d1e257a

irregular6612 Claude Opus 4.8 (1M context) committed on

feat(memory): store + replay food_cells (paint under agents)
09f65bb

irregular6612 Claude Opus 4.8 (1M context) committed on

test: align pack_evade memory frame expectation with wall terrain
261988e

irregular6612 Claude Opus 4.8 (1M context) committed on

feat(memory): memory_frames() reconstructs per-turn color grids
6a4d305

irregular6612 Claude Opus 4.8 (1M context) committed on

feat(memory): store wall_rects on checkpoint + Scenario.wall_rects() hook
e36571a

irregular6612 Claude Opus 4.8 (1M context) committed on

feat(runtime): auto-regressive play - handover memory every turn + the model's own prior moves in each observation (Markovian -> auto-regressive)
90b2ce8

irregular6612 committed on

Merge branch 'web-spectate'
c65827a

irregular6612 committed on

feat(pack_evade): persona weight-vector memory generation (drop manual; reference policy self-play per agentness design)
45330f2

irregular6612 committed on

feat(pack_evade): hand-authored handover memory + default_memory wiring
e343386

irregular6612 committed on

test(spectate): golden - SpectateSession trace == SessionRunner(VanillaAgent)
b84b7c9

irregular6612 committed on

feat(spectate): SpectateSession - agent-driven stepwise LLM driver
9f98456

irregular6612 committed on

Merge branch 'master' into web-interactive-play
b731a03

irregular6612 committed on

feat(cp8): persona-driven memory generation (hidden weights, public id)
dcb83a6

irregular6612 committed on

feat(cp8): persona-maintenance metrics (agreement/regret/pressure-weighted/drift)
42ab626

irregular6612 committed on

feat(cp8): hidden persona weights + R_w reference policy + pressure
aa9d950

irregular6612 committed on

feat(cp8): record turn_order/capture_rule/horizon on the trace
28cd053

irregular6612 committed on

feat(cp8): time_to_capture/distance_auc/min_distance/near_capture_count metrics
3bd65c5

irregular6612 committed on

feat(cp8): record pre/post BFS distance + post positions per turn
d25a6fe

irregular6612 committed on

feat(cp8): additive per-turn distance/post-position trace fields
8064ddd

irregular6612 committed on

test(web): golden - InteractiveSession trace == SessionRunner(HumanAgent)
fd21a35

irregular6612 committed on

feat(web): InteractiveSession - threadless stepwise play driver
7a0c167

irregular6612 committed on

feat(cp7): hybrid memory injection into SessionRunner turn-1 observation
1195808

irregular6612 committed on