# ReplicaLab Project Map
> Living reference of every module, class, function, and relationship.
> Updated after each implementation session.
>
> **Last updated:** 2026-03-07 (JDG 01-03 scoring implemented)
## Module Index
| File | What it covers |
|------|---------------|
| [models.md](models.md) | Data contracts: actions, observations, protocol, reward, episode state |
| [scenarios.md](scenarios.md) | Scenario generation: templates, constraints, resources, hidden specs |
| [agents.md](agents.md) | Agent policies: scientist prompt/parse/retry, lab manager feasibility/suggest/compose |
| [validation.md](validation.md) | Protocol validation: deterministic checks against scenario constraints |
| [scoring.md](scoring.md) | Judge scoring: rigor, feasibility, fidelity |
| [server.md](server.md) | FastAPI server: REST + WebSocket endpoints, stub environment |
| [frontend.md](frontend.md) | React UI: dashboard, episode viewer, components |
| [config.md](config.md) | Shared constants: rounds, budget, timeouts |
| [tests.md](tests.md) | Test coverage: 93 tests across 8 files |
## Dependency Graph
```
server/app.py
├── replicalab.config
├── replicalab.models
├── replicalab.scenarios (generate_scenario, available_scenario_families)
└── replicalab.agents (check_feasibility, suggest_alternative, compose_lab_manager_response)
replicalab/agents/scientist_policy.py
├── replicalab.models (ScientistAction, ScientistObservation, Protocol, ConversationEntry)
└── replicalab.scenarios (NormalizedScenarioPack)
replicalab/agents/lab_manager_policy.py
├── replicalab.models (LabManagerAction, LabManagerActionType, Protocol)
├── replicalab.scenarios (NormalizedScenarioPack)
└── replicalab.utils.validation (ValidationResult, validate_protocol)
replicalab/scenarios/templates.py
├── replicalab.config (MAX_BUDGET, MAX_ROUNDS)
├── replicalab.models (ScientistObservation, LabManagerObservation)
├── replicalab.scenarios.{math_reasoning, ml_benchmark, finance_trading}
└── replicalab.utils.seed (seed_rng)
replicalab/utils/validation.py
├── replicalab.models (Protocol)
└── replicalab.scenarios.templates (NormalizedScenarioPack)
replicalab/scoring/
├── replicalab.models (Protocol, RewardBreakdown)
├── replicalab.scenarios (NormalizedScenarioPack, HiddenReferenceSpec)
├── replicalab.agents.lab_manager_policy (check_feasibility, FeasibilityCheckResult)
└── replicalab.utils.text (element_tokens, normalize_label)
```
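The graph above can be checked for circular imports mechanically. The following is a minimal sketch, not part of the codebase: it transcribes the edges by hand and runs them through the standard-library `graphlib` sorter. The `DEPENDENCIES` table and the `import_order` helper are hypothetical names, and collapsing package-level imports onto a single submodule (for example `replicalab.agents` onto `agents.lab_manager_policy`) is an assumption made only to keep the example small.

```python
from graphlib import CycleError, TopologicalSorter

# Module -> modules it imports, transcribed by hand from the dependency graph.
# Assumptions for this sketch: the "replicalab." prefix is dropped, and
# package-level imports are collapsed onto one submodule
# (replicalab.scenarios -> scenarios.templates,
#  replicalab.agents -> agents.lab_manager_policy).
DEPENDENCIES: dict[str, set[str]] = {
    "server.app": {
        "config", "models", "scenarios.templates", "agents.lab_manager_policy",
    },
    "agents.scientist_policy": {"models", "scenarios.templates"},
    "agents.lab_manager_policy": {
        "models", "scenarios.templates", "utils.validation",
    },
    "scenarios.templates": {
        "config", "models", "utils.seed",
        "scenarios.math_reasoning", "scenarios.ml_benchmark",
        "scenarios.finance_trading",
    },
    "utils.validation": {"models", "scenarios.templates"},
    "scoring": {
        "models", "scenarios.templates", "agents.lab_manager_policy",
        "utils.text",
    },
}


def import_order(deps: dict[str, set[str]]) -> list[str]:
    """Return modules ordered so that each one follows its dependencies."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # graphlib stores the offending cycle as the second element of args.
        raise ValueError(f"circular import chain: {exc.args[1]}") from exc


if __name__ == "__main__":
    for name in import_order(DEPENDENCIES):
        print(name)
```

If a future edit to the graph introduces a cycle, `import_order` raises `ValueError` naming the modules involved, which makes the table usable as a cheap consistency check after each implementation session.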
## File Tree (implemented only)
```
replicalab/
├── __init__.py (empty)
├── config.py (shared constants)
├── models.py (25 classes, all data contracts)
├── agents/
│   ├── __init__.py (re-exports from submodules)
│   ├── scientist_policy.py (AGT 01-04: prompt, formatter, parser, retry, baseline)
│   └── lab_manager_policy.py (AGT 05-07: feasibility, suggest, compose)
├── scenarios/
│   ├── __init__.py (re-exports from templates)
│   ├── templates.py (NormalizedScenarioPack, generate_scenario, apply_difficulty)
│   ├── math_reasoning.py (2 cases: Cauchy-Schwarz, Jensen's inequality)
│   ├── ml_benchmark.py (2 cases: AG News TinyBERT, CIFAR-10 ResNet-18)
│   └── finance_trading.py (2 cases: SPY/QQQ mean-reversion, momentum futures)
├── scoring/
│   ├── __init__.py (exports score_rigor, score_feasibility, score_fidelity)
│   ├── rigor.py (JDG 01: structural quality + criteria coverage)
│   ├── feasibility.py (JDG 02: wraps FeasibilityCheckResult with partial credit)
│   └── fidelity.py (JDG 03: substitution-aware hidden spec alignment)
└── utils/
    ├── seed.py (deterministic RNG from SHA256)
    ├── text.py (shared token matching: normalize_label, element_tokens)
    └── validation.py (MOD 05: protocol validation, 5 checks)
server/
└── app.py (FastAPI + WebSocket + _StubEnv)
frontend/
├── package.json (React 19, Three.js, Framer Motion, Recharts, Tailwind)
├── src/
│   ├── App.tsx (router: /, /episode, /episode/:id)
│   ├── types/index.ts (TypeScript interfaces mirroring Python models)
│   ├── lib/
│   │   ├── api.ts (REST + WebSocket client + mock data generators)
│   │   ├── audio.ts (audio utilities)
│   │   └── utils.ts (shared helpers)
│   ├── components/ (15 React components)
│   └── pages/ (DashboardPage, EpisodePage)
└── vite.config.ts
tests/
├── test_config.py (3 tests)
├── test_models.py (15 tests)
├── test_scenarios.py (8 tests)
├── test_validation.py (13 tests)
├── test_scientist_policy.py (18 tests)
├── test_lab_manager_policy.py (13 tests)
├── test_reward.py (18 tests, JDG 01-03 scoring)
└── test_server.py (5 tests, API endpoints)
```
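The per-file test counts listed under `tests/` can be tallied with a few lines of Python. This is only a sketch: the `TEST_COUNTS` table is a hypothetical name, and its values are copied by hand from the tree, so it has to be updated whenever the tree changes.

```python
# Per-file test counts copied by hand from the tests/ entries in the file tree.
TEST_COUNTS = {
    "test_config.py": 3,
    "test_models.py": 15,
    "test_scenarios.py": 8,
    "test_validation.py": 13,
    "test_scientist_policy.py": 18,
    "test_lab_manager_policy.py": 13,
    "test_reward.py": 18,
    "test_server.py": 5,
}

total_tests = sum(TEST_COUNTS.values())
total_files = len(TEST_COUNTS)

print(f"{total_tests} tests across {total_files} files")
# prints "93 tests across 8 files"
```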
## Task Completion Status
| Area | Done | Remaining | Key gaps |
|------|------|-----------|----------|
| Models (MOD) | MOD 01-05, 09, 11-12 | MOD 06 | Semantic validators for impossible plans |
| Scenarios (SCN) | SCN 01-12 | SCN 13 | Booking/scheduling data model |
| Agents (AGT) | AGT 01-07, 11 | AGT 08-10 | LLM-backed scientist, model selection |
| Judge (JDG) | JDG 01-03 | JDG 04-08 | Reward composition, bonuses, penalties |
| Environment (ENV) | None | ENV 01-11 | Entire real environment |
| Server (API) | API 01-04, 06 (partial) | API 05, 07-10 | Replay, auth, rate limiting |
| Frontend (FND) | FND 01-10 | None | Complete |
| Training (TRN) | None | TRN 01-18 | Entire RL pipeline |