# FND 08 Frozen JSON Contract Status: completed on 2026-03-08 Owners: Person A and Person B Drafted by: Person B (Ayush) Remaining acceptance item: none Source schema file: `replicalab/models.py` ## Purpose This document freezes the JSON contract for the shared ReplicaLab data models so downstream work can proceed without schema drift. It is the reference for: - Person A validators and environment state handling - Person B prompt formatting and action parsing - Person C API payload examples - Person D frontend and replay mocks ## Tool-Capability Addendum The richer-capability MVP adds bounded search, code-check, and image-inspection support below this frozen contract. This addendum does **not** reopen the outward action schema from `FND 08`. The final outward actions remain `ScientistAction` and `LabManagerAction`. Bounded tool use will be represented through scenario or evidence metadata, environment-side tool traces, and `StepResult.info` or replay payloads rather than new outward action types for the MVP. ## Global conventions - All JSON keys use `snake_case`. - Enum-like values use lowercase snake_case strings. - All top-level keys listed in this document must be present unless explicitly marked nullable. - Use `null` for an absent single object. - Use `[]` for a known empty collection. - Use `{}` only for flexible metadata objects such as `info` and `reward_breakdown`. - `round_number` is zero-based. `0` is the state immediately after `reset()`. - `duration_days` and `time_limit_days` are whole calendar days. - `difficulty` values are `easy`, `medium`, or `hard`. - Component scores such as rigor, feasibility, and fidelity are floats in the inclusive range `0.0` to `1.0`. ## Shared nested objects ### ConversationEntry Each item in `conversation_history` or `transcript` must use this shape: | Field | Type | Required | Notes | | --- | --- | --- | --- | | `role` | `str` | yes | One of `scientist`, `lab_manager`, `system` | | `message` | `str` | yes | Human-readable turn text | | `round_number` | `int` | yes | Zero-based round index for the message | | `action_type` | `str \| null` | yes | Mirrors the action type when the message comes from an agent, otherwise `null` | ### Protocol When `current_protocol` is not `null`, it must use this shape: | Field | Type | Required | Notes | | --- | --- | --- | --- | | `sample_size` | `int` | yes | Non-negative integer | | `controls` | `list[str]` | yes | Empty list when no controls are specified yet | | `technique` | `str` | yes | Proposed experimental technique | | `duration_days` | `int` | yes | Whole calendar days | | `required_equipment` | `list[str]` | yes | Empty list when none is needed | | `required_reagents` | `list[str]` | yes | Empty list when none is needed | | `rationale` | `str` | yes | Short explanation for the protocol | ### RewardBreakdown When `reward_breakdown` is present, it must use this shape: | Field | Type | Required | Notes | | --- | --- | --- | --- | | `rigor` | `float` | yes | Component score in `0.0` to `1.0` | | `feasibility` | `float` | yes | Component score in `0.0` to `1.0` | | `fidelity` | `float` | yes | Component score in `0.0` to `1.0` | | `efficiency_bonus` | `float` | yes | Bonus term, `0.0` if unused | | `communication_bonus` | `float` | yes | Bonus term, `0.0` if unused | | `penalties` | `dict[str, float]` | yes | Per-penalty values keyed by penalty name | ## Model contracts ### ScientistAction Action types: - `propose_protocol` - `revise_protocol` - `request_info` - `accept` Field contract: | Field | Type | Required | Notes | | --- | --- | --- | --- | | `action_type` | `str` | yes | Must be one of the values above | | `sample_size` | `int` | yes | Meaningful for `propose_protocol` and `revise_protocol`, otherwise `0` | | `controls` | `list[str]` | yes | Meaningful for `propose_protocol` and `revise_protocol`, otherwise `[]` | | `technique` | `str` | yes | Meaningful for `propose_protocol` and `revise_protocol`, otherwise `""` | | `duration_days` | `int` | yes | Meaningful for `propose_protocol` and `revise_protocol`, otherwise `0` | | `required_equipment` | `list[str]` | yes | Meaningful for `propose_protocol` and `revise_protocol`, otherwise `[]` | | `required_reagents` | `list[str]` | yes | Meaningful for `propose_protocol` and `revise_protocol`, otherwise `[]` | | `questions` | `list[str]` | yes | Meaningful for `request_info`, otherwise `[]` | | `rationale` | `str` | yes | Required free-text explanation for protocol proposals and revisions; `""` for `accept` | Canonical example: ```json { "action_type": "propose_protocol", "sample_size": 48, "controls": ["vehicle_control", "positive_control"], "technique": "wst1_assay", "duration_days": 5, "required_equipment": ["plate_reader", "co2_incubator"], "required_reagents": ["wst1", "dmso", "drug_x"], "questions": [], "rationale": "Keeps the core readout while using equipment commonly available in teaching labs." } ``` ### LabManagerAction Action types: - `report_feasibility` - `suggest_alternative` - `reject` - `accept` Field contract: | Field | Type | Required | Notes | | --- | --- | --- | --- | | `action_type` | `str` | yes | Must be one of the values above | | `feasible` | `bool` | yes | Overall summary flag equal to the logical AND of the constraint dimension flags | | `budget_ok` | `bool` | yes | Whether the proposed protocol fits remaining budget | | `equipment_ok` | `bool` | yes | Whether required equipment is available in time | | `reagents_ok` | `bool` | yes | Whether required reagents are available | | `schedule_ok` | `bool` | yes | Whether the protocol fits the allowed timeline | | `staff_ok` | `bool` | yes | Whether staffing is sufficient | | `suggested_technique` | `str` | yes | Meaningful for `suggest_alternative`, otherwise `""` | | `suggested_sample_size` | `int` | yes | Meaningful for `suggest_alternative`, otherwise `0` | | `suggested_controls` | `list[str]` | yes | Meaningful for `suggest_alternative`, otherwise `[]` | | `explanation` | `str` | yes | Human-readable explanation of the constraint outcome | Conditional rules: - `action_type = accept` implies `feasible = true` and all constraint flags are `true`. - `action_type = reject` implies `feasible = false` and at least one constraint flag is `false`. - `action_type = suggest_alternative` implies `feasible = false` and at least one of the suggestion fields carries a non-default value. Canonical example: ```json { "action_type": "suggest_alternative", "feasible": false, "budget_ok": true, "equipment_ok": false, "reagents_ok": true, "schedule_ok": true, "staff_ok": true, "suggested_technique": "manual_cell_counting", "suggested_sample_size": 32, "suggested_controls": ["vehicle_control", "positive_control"], "explanation": "The plate reader is fully booked, so use manual counting and reduce the sample size to stay within the timeline." } ``` ### ScientistObservation | Field | Type | Required | Notes | | --- | --- | --- | --- | | `paper_title` | `str` | yes | Study title | | `paper_hypothesis` | `str` | yes | Core hypothesis being replicated | | `paper_method` | `str` | yes | Short method summary | | `paper_key_finding` | `str` | yes | Main finding being targeted | | `experiment_goal` | `str` | yes | What the scientist is trying to preserve | | `conversation_history` | `list[ConversationEntry]` | yes | Empty list at reset | | `current_protocol` | `Protocol \| null` | yes | `null` until a protocol exists | | `round_number` | `int` | yes | Zero-based current round | | `max_rounds` | `int` | yes | Max allowed rounds in the episode | Canonical example: ```json { "paper_title": "Drug X reduces glioblastoma cell viability", "paper_hypothesis": "Drug X reduces viability in a dose-dependent manner.", "paper_method": "96-well viability assay with 24h incubation and absorbance readout.", "paper_key_finding": "The highest dose reduced viability by about 40 percent.", "experiment_goal": "Replicate the dose-response trend without dropping essential controls.", "conversation_history": [], "current_protocol": null, "round_number": 0, "max_rounds": 6 } ``` ### LabManagerObservation | Field | Type | Required | Notes | | --- | --- | --- | --- | | `budget_total` | `float` | yes | Initial budget for the episode | | `budget_remaining` | `float` | yes | Current remaining budget | | `equipment_available` | `list[str]` | yes | Equipment that can be used | | `equipment_booked` | `list[str]` | yes | Equipment unavailable due to booking | | `reagents_in_stock` | `list[str]` | yes | Available reagents | | `reagents_out_of_stock` | `list[str]` | yes | Required but unavailable reagents | | `staff_count` | `int` | yes | Available staff count | | `time_limit_days` | `int` | yes | Whole calendar days remaining | | `safety_restrictions` | `list[str]` | yes | Constraints such as banned solvents or assay restrictions | | `conversation_history` | `list[ConversationEntry]` | yes | Empty list at reset | | `current_protocol` | `Protocol \| null` | yes | `null` until a protocol exists | | `round_number` | `int` | yes | Zero-based current round | | `max_rounds` | `int` | yes | Max allowed rounds in the episode | Canonical example: ```json { "budget_total": 1200.0, "budget_remaining": 1200.0, "equipment_available": ["co2_incubator", "microscope"], "equipment_booked": ["plate_reader"], "reagents_in_stock": ["dmso", "drug_x", "culture_media"], "reagents_out_of_stock": ["wst1"], "staff_count": 2, "time_limit_days": 7, "safety_restrictions": ["no_radioactive_reagents"], "conversation_history": [], "current_protocol": null, "round_number": 0, "max_rounds": 6 } ``` ### Observation Wrapper behavior: - Serialized `Observation` objects always include both top-level keys: `scientist` and `lab_manager`. - In shared environment state, replay, and API payloads, both branches should normally be populated. - When a consumer is intentionally given only one role view, the non-owned branch must be `null`, not omitted. | Field | Type | Required | Notes | | --- | --- | --- | --- | | `scientist` | `ScientistObservation \| null` | yes | Scientist-side view | | `lab_manager` | `LabManagerObservation \| null` | yes | Lab-manager-side view | Canonical example: ```json { "scientist": { "paper_title": "Drug X reduces glioblastoma cell viability", "paper_hypothesis": "Drug X reduces viability in a dose-dependent manner.", "paper_method": "96-well viability assay with 24h incubation and absorbance readout.", "paper_key_finding": "The highest dose reduced viability by about 40 percent.", "experiment_goal": "Replicate the dose-response trend without dropping essential controls.", "conversation_history": [], "current_protocol": null, "round_number": 0, "max_rounds": 6 }, "lab_manager": { "budget_total": 1200.0, "budget_remaining": 1200.0, "equipment_available": ["co2_incubator", "microscope"], "equipment_booked": ["plate_reader"], "reagents_in_stock": ["dmso", "drug_x", "culture_media"], "reagents_out_of_stock": ["wst1"], "staff_count": 2, "time_limit_days": 7, "safety_restrictions": ["no_radioactive_reagents"], "conversation_history": [], "current_protocol": null, "round_number": 0, "max_rounds": 6 } } ``` ### StepResult | Field | Type | Required | Notes | | --- | --- | --- | --- | | `observation` | `Observation \| null` | yes | Present on normal steps and terminal steps; may be `null` only on hard failure | | `reward` | `float` | yes | Episode reward after the step; terminal reward on final step | | `done` | `bool` | yes | Whether the episode is terminal | | `info` | `dict` | yes | Flexible metadata object | Reserved `info` keys: - `agreement_reached`: `bool` - `error`: `str | null` - `reward_breakdown`: `RewardBreakdown | null` - `judge_notes`: `str | null` - `verdict`: `str | null` Canonical example: ```json { "observation": { "scientist": null, "lab_manager": null }, "reward": 6.72, "done": true, "info": { "agreement_reached": true, "error": null, "reward_breakdown": { "rigor": 0.9, "feasibility": 0.8, "fidelity": 0.85, "efficiency_bonus": 0.25, "communication_bonus": 0.15, "penalties": { "invalid_action": 0.0, "timeout": 0.0 } }, "judge_notes": "Controls were preserved and the substitutions remained scientifically acceptable.", "verdict": "accept" } } ``` ### EpisodeState | Field | Type | Required | Notes | | --- | --- | --- | --- | | `seed` | `int` | yes | Deterministic episode seed | | `scenario_template` | `str` | yes | Scenario family identifier | | `difficulty` | `str` | yes | `easy`, `medium`, or `hard` | | `paper_title` | `str` | yes | Study title | | `paper_hypothesis` | `str` | yes | Core hypothesis | | `paper_method` | `str` | yes | Method summary | | `paper_key_finding` | `str` | yes | Main finding | | `experiment_goal` | `str` | yes | Goal preserved through negotiation | | `lab_budget_total` | `float` | yes | Initial budget | | `lab_budget_remaining` | `float` | yes | Remaining budget | | `lab_equipment` | `list[str]` | yes | Equipment state | | `lab_reagents` | `list[str]` | yes | Reagent state | | `lab_staff_count` | `int` | yes | Available staff count | | `lab_time_limit_days` | `int` | yes | Whole calendar days remaining | | `current_protocol` | `Protocol \| null` | yes | Current agreed or latest proposed protocol | | `conversation_history` | `list[ConversationEntry]` | yes | Negotiation history | | `round_number` | `int` | yes | Zero-based round counter | | `max_rounds` | `int` | yes | Maximum rounds allowed | | `done` | `bool` | yes | Terminal flag | | `agreement_reached` | `bool` | yes | Whether both sides reached agreement | | `reward` | `float` | yes | Final total reward or `0.0` until terminal scoring | | `rigor_score` | `float` | yes | Final component score or `0.0` until terminal scoring | | `feasibility_score` | `float` | yes | Final component score or `0.0` until terminal scoring | | `fidelity_score` | `float` | yes | Final component score or `0.0` until terminal scoring | Canonical example: ```json { "seed": 17, "scenario_template": "cell_biology", "difficulty": "medium", "paper_title": "Drug X reduces glioblastoma cell viability", "paper_hypothesis": "Drug X reduces viability in a dose-dependent manner.", "paper_method": "96-well viability assay with 24h incubation and absorbance readout.", "paper_key_finding": "The highest dose reduced viability by about 40 percent.", "experiment_goal": "Replicate the dose-response trend without dropping essential controls.", "lab_budget_total": 1200.0, "lab_budget_remaining": 850.0, "lab_equipment": ["co2_incubator", "microscope"], "lab_reagents": ["dmso", "drug_x", "culture_media"], "lab_staff_count": 2, "lab_time_limit_days": 7, "current_protocol": { "sample_size": 32, "controls": ["vehicle_control", "positive_control"], "technique": "manual_cell_counting", "duration_days": 5, "required_equipment": ["microscope", "co2_incubator"], "required_reagents": ["dmso", "drug_x", "culture_media"], "rationale": "Uses available equipment while preserving control structure." }, "conversation_history": [ { "role": "scientist", "message": "I propose a manual counting protocol that keeps both controls.", "round_number": 0, "action_type": "propose_protocol" } ], "round_number": 1, "max_rounds": 6, "done": false, "agreement_reached": false, "reward": 0.0, "rigor_score": 0.0, "feasibility_score": 0.0, "fidelity_score": 0.0 } ``` ### EpisodeLog | Field | Type | Required | Notes | | --- | --- | --- | --- | | `episode_id` | `str` | yes | Stable replay identifier | | `seed` | `int` | yes | Episode seed | | `scenario_template` | `str` | yes | Scenario family identifier | | `difficulty` | `str` | yes | `easy`, `medium`, or `hard` | | `final_state` | `EpisodeState \| null` | yes | Must be populated for completed episodes | | `transcript` | `list[ConversationEntry]` | yes | Replayable transcript | | `reward_breakdown` | `RewardBreakdown` | yes | Final reward components | | `total_reward` | `float` | yes | Final total reward | | `rounds_used` | `int` | yes | Number of completed rounds | | `agreement_reached` | `bool` | yes | Final agreement flag | | `judge_notes` | `str` | yes | Human-readable audit summary | | `verdict` | `str` | yes | One of `accept`, `revise`, `reject` | Canonical example: ```json { "episode_id": "cell_biology-17-medium-0001", "seed": 17, "scenario_template": "cell_biology", "difficulty": "medium", "final_state": { "seed": 17, "scenario_template": "cell_biology", "difficulty": "medium", "paper_title": "Drug X reduces glioblastoma cell viability", "paper_hypothesis": "Drug X reduces viability in a dose-dependent manner.", "paper_method": "96-well viability assay with 24h incubation and absorbance readout.", "paper_key_finding": "The highest dose reduced viability by about 40 percent.", "experiment_goal": "Replicate the dose-response trend without dropping essential controls.", "lab_budget_total": 1200.0, "lab_budget_remaining": 850.0, "lab_equipment": ["co2_incubator", "microscope"], "lab_reagents": ["dmso", "drug_x", "culture_media"], "lab_staff_count": 2, "lab_time_limit_days": 7, "current_protocol": { "sample_size": 32, "controls": ["vehicle_control", "positive_control"], "technique": "manual_cell_counting", "duration_days": 5, "required_equipment": ["microscope", "co2_incubator"], "required_reagents": ["dmso", "drug_x", "culture_media"], "rationale": "Uses available equipment while preserving control structure." }, "conversation_history": [ { "role": "scientist", "message": "I propose a manual counting protocol that keeps both controls.", "round_number": 0, "action_type": "propose_protocol" }, { "role": "lab_manager", "message": "This alternative is feasible with current equipment and budget.", "round_number": 0, "action_type": "accept" } ], "round_number": 1, "max_rounds": 6, "done": true, "agreement_reached": true, "reward": 6.72, "rigor_score": 0.9, "feasibility_score": 0.8, "fidelity_score": 0.85 }, "transcript": [ { "role": "scientist", "message": "I propose a manual counting protocol that keeps both controls.", "round_number": 0, "action_type": "propose_protocol" }, { "role": "lab_manager", "message": "This alternative is feasible with current equipment and budget.", "round_number": 0, "action_type": "accept" } ], "reward_breakdown": { "rigor": 0.9, "feasibility": 0.8, "fidelity": 0.85, "efficiency_bonus": 0.25, "communication_bonus": 0.15, "penalties": { "invalid_action": 0.0, "timeout": 0.0 } }, "total_reward": 6.72, "rounds_used": 1, "agreement_reached": true, "judge_notes": "Controls were preserved and the substitutions remained scientifically acceptable.", "verdict": "accept" } ``` ## Sign-off | Owner | Status | Notes | | --- | --- | --- | | Person B (Ayush) | signed off | Draft matches current stubs and downstream parser needs | | Kian (Person A) | signed off | Validator and environment-owner review completed; contract is frozen for `MOD 01`, `MOD 03`, `FND 09`, and downstream parser work |