replicalab / tests

Commit History

Add local H100 scientist eval tooling
a29a83d

ayushozha committed on

Add 12 scoring & environment improvements with full test coverage
2b98daa

ayushozha Claude Opus 4.6 committed on

Add model-driven local runtime and dynamic demo flow
4ea916a

ayushozha committed on

Finalize demo flow and training assets
b878f5b

ayushozha committed on

Merge Kush frontend integration, close API 16/UI 10/UI 11
abb29f8

ayushozha committed on

Add API 19 /web fallback route, merge Kush frontend, close UI 07
b001a03

ayushozha committed on

Add ENV 09 disk persistence, OBS 07/09, TST 11 audit tests, close 10 Max tasks
11faa95

ayushozha committed on

Add MOD 08 schema tests, V2 training stack, and close MOD 08/JDG 07/API 01/OBS 02
685783a

ayushozha committed on

Add JDG 07: reward breakdown logging to CSV and JSONL per episode
82805bf

ayushozha committed on

Add hybrid Oracle layer and update architecture docs
ec2e890

ayushozha committed on

Recover env judge training stack and sync project tracking
13ae015

ayushozha committed on

Recover env server client stack and deployment tracking
8b157ab

ayushozha committed on

Add deterministic judge scoring engine (JDG 01-03)
e50dca9

ayushozha Claude Opus 4.6 committed on

Add AGT 04/05/07 implementations, server integration, and doc updates
7c2246c

ayushozha Claude Opus 4.6 committed on

Add deterministic alternative suggestion logic for Lab Manager (AGT 06)
87a89f0

ayushozha Claude Opus 4.6 committed on

Add parse-and-retry loop for Scientist agent with telemetry (AGT 03)
19ab848

ayushozha Claude Opus 4.6 committed on

Add per-turn observation formatter for Scientist agent (AGT 02)
528bd7d

ayushozha Claude Opus 4.6 committed on

Add deterministic protocol validation against scenario constraints (MOD 05)
e239665

ayushozha Claude Opus 4.6 committed on

Type EpisodeState and EpisodeLog with Protocol, ConversationEntry, RewardBreakdown (MOD 04)
2f4ed4a

ayushozha Claude Opus 4.6 committed on

Add MOD 11 typed StepInfo/RewardBreakdown and import completed foundation modules
5f8c92c

ayushozha Claude Opus 4.6 committed on

Add future improvements doc, import FND 09 and MOD 01-03 completions
db07024

ayushozha Claude Opus 4.6 committed on

Complete FND 01 and FND 10, update task division with status tracking
8a624de

ayushozha committed on