ayushozha committed on
Commit
80aa0ec
·
1 Parent(s): 16e9144

Add model selection and architecture notes


- Qwen3-4B as primary trainable Scientist, Qwen3-8B as H100 stretch
- Deterministic rubric remains sole training reward
- Hosted frontier evaluator for optional explanation and demo audit only
- Lab Manager stays rule-based for MVP
- Future model-backed Lab Manager added to stretch backlog and risk register
- Updated Person B docs with base-model rationale and reward notes

ReplicaLab_Comprehensive_Task_Division.md CHANGED
@@ -96,13 +96,35 @@ By judging time, the project should demonstrate:
96
  | Storytelling | everyone contributes screenshots, gifs, examples |
97
  | Submission readiness | all four review final demo, notebook, README, repo visibility |
98
 
99
- ## 4.1 Training compute availability
100
 
101
  1. The team has access to an H100 GPU for heavier Scientist training and evaluation runs.
102
  2. Person B is the primary owner of that compute for RL tasks, especially `TRN 04` to `TRN 10`, `TRN 13` to `TRN 15`, `OBS 06`, and `TST 09`.
103
  3. The judged artifact remains the Colab notebook, so any H100 run must still have a documented notebook path or reduced scale fallback that can be shown in Colab.
104
  4. Person C supports any environment URL, secret, or infra setup needed so the H100 training run can connect to the same backend contract as the notebook.
105
 
106
  ---
107
 
108
  ## 5. Module and function ownership map
@@ -209,12 +231,16 @@ Create a stable shared codebase, contracts, and development workflow so all work
209
 
210
  - `FND 01` status: completed on 2026-03-07
211
  - `FND 01` completed by: `Person B (Ayush)` while the assigned owner remains `Person C`
 
 
212
  - `FND 10` status: completed on 2026-03-07
213
  - `FND 10` completed by: `Person B (Ayush)` while the assigned owner remains `Person C`
214
  - Completed scope for `FND 01`: created the agreed repo scaffold for `replicalab/`, `server/`, `frontend/`, `notebooks/`, and `tests/`, including the initial `replicalab/*` and `frontend/src/*` subfolders from the planned layout
 
215
  - Completed scope for `FND 10`: created `replicalab/outputs/` with tracked `logs/`, `replays/`, and `plots/` subdirectories
216
- - Remaining work now unblocked by `FND 01`: `FND 02`, `FND 03`, `FND 04`, `FND 05`, `FND 06`, `FND 07`
217
- - Remaining Epic E01 work still gated by follow-on dependencies: `FND 08`, `FND 09`, `FND 11`, `FND 12`, `FND 13`
 
218
 
219
  ### User stories
220
 
@@ -231,7 +257,7 @@ As a team, we want agreed schemas and coding rules so integration risk stays low
231
  | FND 01 | E01.1 | Person C | repo root | Create repo structure and base folders from agreed layout | none | 0.5h | all top level folders exist and repo clones cleanly | ✅ Completed | Person B (Ayush) |
232
  | FND 02 | E01.1 | Person C | `pyproject.toml` | Add Python project config and dependencies placeholder | FND 01 | 0.5h | project installs locally without missing package errors for base modules | ⬜ Not started | — |
233
  | FND 03 | E01.1 | Person C | `frontend/package.json` | Initialize React plus Vite frontend shell | FND 01 | 0.5h | `npm install` and dev server run successfully | ⬜ Not started | — |
234
- | FND 04 | E01.2 | Person A | `replicalab/models.py` | Add empty Pydantic models and shared type names | FND 01 | 0.5h | import paths resolve for all placeholder models | Not started | |
235
  | FND 05 | E01.2 | Person C | `.gitignore` and `.dockerignore` | Add ignore rules for Python, Node, logs, notebooks, and build artifacts. `.dockerignore` must explicitly exclude `.git`, `node_modules`, `notebooks/`, `tests/`, `__pycache__`, `.venv`, and output files to keep the Docker image lean | FND 01 | 0.25h | repo status stays clean after local run and build, and Docker build excludes non-runtime files | ⬜ Not started | — |
236
  | FND 06 | E01.2 | Person D | `README.md` | Add temporary project stub with title, mission, team roles, and local setup placeholder | FND 01 | 0.5h | new contributor can understand repo purpose in under two minutes | ⬜ Not started | — |
237
  | FND 07 | E01.2 | Person C | repo settings | Define branch naming, PR template, and issue template | FND 01 | 0.5h | all future PRs auto show the template and issue fields | ⬜ Not started | — |
@@ -707,7 +733,7 @@ The MVP is complete when all of the following are true:
707
  | 2 | add judge plain English explanation panel | better judge readability |
708
  | 3 | add second and third difficulty levels to all templates | stronger world modeling story |
709
  | 4 | add curriculum training path | stronger self improvement story |
710
- | 5 | add optional LLM Lab Manager | stronger multi agent depth but higher risk |
711
  | 6 | add third agent such as ethics reviewer | potential partner fit extension |
712
  | 7 | add post episode self critique before retry | stronger self improvement story from Blueprint Section 14.2 |
713
  | 8 | add automatic scenario difficulty scaling | adaptive curriculum from Blueprint Section 14.2 |
@@ -725,6 +751,7 @@ The MVP is complete when all of the following are true:
725
  | reward too noisy or subjective | high | Person A | keep judge deterministic and rubric based |
726
  | final demo breaks live | high | all | keep replay logs and a pre tested demo seed ready |
727
  | too many scenarios | medium | Person A | ship one excellent scenario, then add more only if stable |
 
728
 
729
  ---
730
 
 
96
  | Storytelling | everyone contributes screenshots, gifs, examples |
97
  | Submission readiness | all four review final demo, notebook, README, repo visibility |
98
 
99
+ ## 4.1 Training compute and model selection
100
 
101
  1. The team has access to an H100 GPU for heavier Scientist training and evaluation runs.
102
  2. Person B is the primary owner of that compute for RL tasks, especially `TRN 04` to `TRN 10`, `TRN 13` to `TRN 15`, `OBS 06`, and `TST 09`.
103
  3. The judged artifact remains the Colab notebook, so any H100 run must still have a documented notebook path or reduced scale fallback that can be shown in Colab.
104
  4. Person C supports any environment URL, secret, or infra setup needed so the H100 training run can connect to the same backend contract as the notebook.
105
 
106
+ ### Trainable model
107
+
108
+ The primary trainable model for the Scientist policy is **Qwen3-4B**.
109
+
110
+ | Model | Role | Rationale |
111
+ | --- | --- | --- |
112
+ | Qwen3-4B | Primary Scientist policy | BF16 fits H100 (~14GB weights, ~42-56GB training). 4-bit fits Colab T4 (5.5GB). Strong structured output for JSON action schemas. Fast RL iteration speed. |
113
+ | Qwen3-8B | H100-only stretch | Better reasoning quality but 4-bit barely fits Colab T4 (6.5GB). Use only if Qwen3-4B quality is insufficient and Colab demo uses reduced-scale fallback. |
114
+
115
+ ### Evaluator layer
116
+
117
+ The training reward is always the **deterministic rubric engine** defined in E05. A hosted frontier evaluator may optionally be used for post-episode explanation and demo audit only. The frontier evaluator is never part of the training reward loop.
118
+
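A minimal sketch of this separation of concerns, with hypothetical function and field names (the real rubric lives in E05; the dimension names follow the rigor/feasibility/fidelity split used elsewhere in this plan):

```python
# Sketch only: hypothetical names. The point is the separation of concerns --
# the deterministic rubric is the sole training reward, and any hosted
# frontier-evaluator call stays outside the reward computation.

def rubric_reward(protocol, rubric):
    """Deterministic training reward: a fixed function of the final protocol
    and the rubric's weighted scoring functions. No LLM calls here."""
    return sum(weight * score_fn(protocol) for weight, score_fn in rubric.values())

def audit_episode(protocol, frontier_evaluator=None):
    """Optional post-episode explanation for demos and audits. Its output is
    never fed back into rubric_reward."""
    if frontier_evaluator is None:
        return None
    return frontier_evaluator(protocol)

# Illustrative rubric: weights and checks are made up for the sketch
rubric = {
    "rigor":       (0.4, lambda p: 1.0 if p.get("controls") else 0.0),
    "feasibility": (0.3, lambda p: 1.0 if p.get("within_budget") else 0.0),
    "fidelity":    (0.3, lambda p: 1.0 if p.get("matches_spec") else 0.0),
}
protocol = {"controls": True, "within_budget": True, "matches_spec": False}
print(rubric_reward(protocol, rubric))  # 0.7
print(audit_episode(protocol))          # None when no evaluator is configured
```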
119
+ ### MVP role implementations
120
+
121
+ | Role | MVP implementation | Future stretch |
122
+ | --- | --- | --- |
123
+ | Scientist | Trainable policy (Qwen3-4B) | Qwen3-8B if quality insufficient |
124
+ | Lab Manager | Rule-based deterministic policy | Model-backed policy using same base model with separate adapter |
125
+ | Judge (training reward) | Deterministic rubric engine | Unchanged |
126
+ | Judge (explanation layer) | Optional hosted frontier evaluator | Extended explanation panel in UI |
127
+
128
  ---
129
 
130
  ## 5. Module and function ownership map
 
231
 
232
  - `FND 01` status: completed on 2026-03-07
233
  - `FND 01` completed by: `Person B (Ayush)` while the assigned owner remains `Person C`
234
+ - `FND 04` status: completed on 2026-03-08
235
+ - `FND 04` completed by: `Person B (Ayush)` while the assigned owner remains `Person A`
236
  - `FND 10` status: completed on 2026-03-07
237
  - `FND 10` completed by: `Person B (Ayush)` while the assigned owner remains `Person C`
238
  - Completed scope for `FND 01`: created the agreed repo scaffold for `replicalab/`, `server/`, `frontend/`, `notebooks/`, and `tests/`, including the initial `replicalab/*` and `frontend/src/*` subfolders from the planned layout
239
+ - Completed scope for `FND 04`: added importable empty Pydantic model stubs in `replicalab/models.py` for the shared action, observation, step, state, and log contracts
240
  - Completed scope for `FND 10`: created `replicalab/outputs/` with tracked `logs/`, `replays/`, and `plots/` subdirectories
241
+ - Remaining work now unblocked by `FND 01`: `FND 02`, `FND 03`, `FND 05`, `FND 06`, `FND 07`
242
+ - Newly unblocked by `FND 04`: `FND 08`, `FND 09`
243
+ - Remaining Epic E01 work still gated by follow-on dependencies: `FND 11`, `FND 12`, `FND 13`
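The FND 04 acceptance criterion is only that import paths resolve for the placeholder models. A rough shape of what "importable empty stubs" means, shown with stdlib dataclasses as a stand-in for the actual Pydantic `BaseModel` stubs (class names beyond `ScientistAction` are illustrative, not the frozen contract):

```python
from dataclasses import dataclass

# Stand-in sketch: the real stubs in replicalab/models.py use Pydantic.
# FND 04 only requires that these names import cleanly so FND 08 can
# freeze the JSON contract on top of them.

@dataclass
class ScientistAction:
    """Placeholder for the shared action contract."""
    pass

@dataclass
class Observation:
    """Placeholder for the shared observation contract (illustrative name)."""
    pass

@dataclass
class EpisodeStep:
    """Placeholder linking one action to one observation (illustrative name)."""
    pass

# The acceptance check: all placeholder models resolve
for model in (ScientistAction, Observation, EpisodeStep):
    print(model.__name__)
```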
244
 
245
  ### User stories
246
 
 
257
  | FND 01 | E01.1 | Person C | repo root | Create repo structure and base folders from agreed layout | none | 0.5h | all top level folders exist and repo clones cleanly | ✅ Completed | Person B (Ayush) |
258
  | FND 02 | E01.1 | Person C | `pyproject.toml` | Add Python project config and dependencies placeholder | FND 01 | 0.5h | project installs locally without missing package errors for base modules | ⬜ Not started | — |
259
  | FND 03 | E01.1 | Person C | `frontend/package.json` | Initialize React plus Vite frontend shell | FND 01 | 0.5h | `npm install` and dev server run successfully | ⬜ Not started | — |
260
+ | FND 04 | E01.2 | Person A | `replicalab/models.py` | Add empty Pydantic models and shared type names | FND 01 | 0.5h | import paths resolve for all placeholder models | ✅ Completed | Person B (Ayush) |
261
  | FND 05 | E01.2 | Person C | `.gitignore` and `.dockerignore` | Add ignore rules for Python, Node, logs, notebooks, and build artifacts. `.dockerignore` must explicitly exclude `.git`, `node_modules`, `notebooks/`, `tests/`, `__pycache__`, `.venv`, and output files to keep the Docker image lean | FND 01 | 0.25h | repo status stays clean after local run and build, and Docker build excludes non-runtime files | ⬜ Not started | — |
262
  | FND 06 | E01.2 | Person D | `README.md` | Add temporary project stub with title, mission, team roles, and local setup placeholder | FND 01 | 0.5h | new contributor can understand repo purpose in under two minutes | ⬜ Not started | — |
263
  | FND 07 | E01.2 | Person C | repo settings | Define branch naming, PR template, and issue template | FND 01 | 0.5h | all future PRs auto show the template and issue fields | ⬜ Not started | — |
 
733
  | 2 | add judge plain English explanation panel | better judge readability |
734
  | 3 | add second and third difficulty levels to all templates | stronger world modeling story |
735
  | 4 | add curriculum training path | stronger self improvement story |
736
+ | 5 | add model-backed Lab Manager using same base model with a separate role adapter | stronger multi agent depth but higher risk; reward stays deterministic; Lab Manager affects trajectory variance, not reward definition |
737
  | 6 | add third agent such as ethics reviewer | potential partner fit extension |
738
  | 7 | add post episode self critique before retry | stronger self improvement story from Blueprint Section 14.2 |
739
  | 8 | add automatic scenario difficulty scaling | adaptive curriculum from Blueprint Section 14.2 |
 
751
  | reward too noisy or subjective | high | Person A | keep judge deterministic and rubric based |
752
  | final demo breaks live | high | all | keep replay logs and a pre tested demo seed ready |
753
  | too many scenarios | medium | Person A | ship one excellent scenario, then add more only if stable |
754
+ | future model-backed Lab Manager increases episode variance | medium | Person B | keep rule-based Lab Manager for MVP training, introduce model-backed version only after Scientist policy is stable, use same base model with separate adapter to limit infra complexity |
755
 
756
  ---
757
 
docs/ayush/task_breakdown.md CHANGED
@@ -9,8 +9,8 @@ No assumptions from other documents are used to reclassify blocked status.
9
 
10
  ## 1. Blocking Status
11
 
12
- Per the source of truth, every Person B task has at least one explicit dependency.
13
- There are zero unblocked Person B tasks at project start.
14
 
15
  ---
16
 
@@ -21,24 +21,22 @@ These tasks are first gated by upstream deliverables, primarily from Person A.
21
 
22
  | ID | Task | Depends On | Person A Deliverable | Est |
23
  |----|------|-----------|---------------------|-----|
24
- | FND 08 | Freeze JSON contract (shared A+B) | FND 04 | Empty Pydantic models | 0.75h |
25
  | MOD 09 | Build output parser for ScientistAction | MOD 01 | ScientistAction schema | 0.75h |
26
  | AGT 01 | Draft Scientist system prompt | MOD 01, SCN 11 | ScientistAction schema + generate_scenario | 0.75h |
27
  | AGT 05 | Implement feasibility checker (shared A+B) | SCN 07, MOD 05 | Constraint generator + validation | 1.25h |
28
  | SCN 11 | Create golden scenarios for prompt testing | SCN 09 | generate_scenario() | 0.75h |
29
  | JDG 10 | Expose component metrics for training plots | JDG 05, JDG 07 | Reward breakdown (A) + logging (C) | 0.5h |
30
 
31
- **Total: 6 tasks, 4.75h**
32
 
33
  ### What to ask Person A for first (priority order)
34
 
35
- 1. **FND 04** (empty Pydantic models) -- unblocks FND 08 contract freeze
36
- 2. **MOD 01** (ScientistAction schema) -- unblocks MOD 09 and, after SCN 11, AGT 01
37
- 3. **MOD 03** (Observation models) -- unblocks AGT 02
38
- 4. **SCN 09** (generate_scenario) -- unblocks SCN 11 golden scenarios
39
- 5. **SCN 07 + MOD 05** (constraints + validation) -- unblocks AGT 05, AGT 06, AGT 07
40
- 6. **JDG 05 + JDG 06** (reward breakdown + explanation) -- unblocks AGT 10 and is only part of the path for JDG 10
41
- 7. **SCN 08** (minimum viable replication spec) -- unblocks AGT 06 after AGT 05
42
 
43
  ---
44
 
@@ -118,9 +116,9 @@ are done.
118
 
119
  All phases are gated by the listed external dependency being delivered first.
120
 
121
- ### Phase 1: After Person A delivers FND 04
122
 
123
- 1. **FND 08** -- Freeze JSON contract (shared with Person A, needs FND 04)
124
 
125
  ### Phase 2: After Person A and B complete FND 08, and Person A delivers MOD 01 + SCN 09
126
 
@@ -174,7 +172,8 @@ All phases are gated by the listed external dependency being delivered first.
174
 
175
  | Category | Count | Hours |
176
  |----------|-------|-------|
177
- | Blocked by Person A (first-order) | 6 | 4.75h |
 
178
  | Blocked by Person A then Person B chain | 8 | 6.25h |
179
  | Blocked by Person C | 3 | 2.5h |
180
  | Deep training chain (internal) | 11 | 7.5h |
@@ -183,19 +182,59 @@ All phases are gated by the listed external dependency being delivered first.
183
 
184
  ---
185
 
186
- ## 9. Key Risks for Person B
 
187
 
188
  | Risk | Impact | Mitigation |
189
  |------|--------|------------|
190
  | Person A MOD 01-03 delayed | Blocks AGT 01, MOD 09, AGT 02-04 and all downstream | Communicate priority order to Person A early |
191
  | Person C API delayed | Blocks entire training pipeline (TRN 01-15) | Coordinate with Person C on API 06 timeline |
192
- | Base model too large for Colab | Training fails or is too slow | Pick 7B or smaller, verify Colab GPU memory first |
193
  | RL training produces flat rewards | No improvement to demo | Have baseline heuristic ready, tune reward weights with Person A |
194
  | Scientist produces invalid JSON | Rollout loop crashes | AGT 03 parse plus retry is critical, build it robust |
 
195
 
196
  ---
197
 
198
- ## 10. Files Person B Owns
199
 
200
  | File | Purpose |
201
  |------|---------|
 
9
 
10
  ## 1. Blocking Status
11
 
12
+ Per the source of truth, Person B now has one unblocked task.
13
+ The immediate next task is `FND 08` because `FND 04` is complete in `replicalab/models.py`.
14
 
15
  ---
16
 
 
21
 
22
  | ID | Task | Depends On | Person A Deliverable | Est |
23
  |----|------|-----------|---------------------|-----|
 
24
  | MOD 09 | Build output parser for ScientistAction | MOD 01 | ScientistAction schema | 0.75h |
25
  | AGT 01 | Draft Scientist system prompt | MOD 01, SCN 11 | ScientistAction schema + generate_scenario | 0.75h |
26
  | AGT 05 | Implement feasibility checker (shared A+B) | SCN 07, MOD 05 | Constraint generator + validation | 1.25h |
27
  | SCN 11 | Create golden scenarios for prompt testing | SCN 09 | generate_scenario() | 0.75h |
28
  | JDG 10 | Expose component metrics for training plots | JDG 05, JDG 07 | Reward breakdown (A) + logging (C) | 0.5h |
29
 
30
+ **Total: 5 tasks, 4.0h**
31
 
32
  ### What to ask Person A for first (priority order)
33
 
34
+ 1. **MOD 01** (ScientistAction schema) -- unblocks MOD 09 and, after SCN 11, AGT 01
35
+ 2. **MOD 03** (Observation models) -- unblocks AGT 02
36
+ 3. **SCN 09** (generate_scenario) -- unblocks SCN 11 golden scenarios
37
+ 4. **SCN 07 + MOD 05** (constraints + validation) -- unblocks AGT 05, AGT 06, AGT 07
38
+ 5. **JDG 05 + JDG 06** (reward breakdown + explanation) -- unblocks AGT 10 and is only part of the path for JDG 10
39
+ 6. **SCN 08** (minimum viable replication spec) -- unblocks AGT 06 after AGT 05
 
40
 
41
  ---
42
 
 
116
 
117
  All phases are gated by the listed external dependency being delivered first.
118
 
119
+ ### Phase 1: Available now
120
 
121
+ 1. **FND 08** -- Freeze JSON contract (shared with Person A; unblocked because `FND 04` is complete)
122
 
123
  ### Phase 2: After Person A and B complete FND 08, and Person A delivers MOD 01 + SCN 09
124
 
 
172
 
173
  | Category | Count | Hours |
174
  |----------|-------|-------|
175
+ | Currently unblocked | 1 | 0.75h |
176
+ | Blocked by Person A (first-order) | 5 | 4.0h |
177
  | Blocked by Person A then Person B chain | 8 | 6.25h |
178
  | Blocked by Person C | 3 | 2.5h |
179
  | Deep training chain (internal) | 11 | 7.5h |
 
182
 
183
  ---
184
 
185
+ ## 9. Base Model Assumptions
186
+
187
+ ### Trainable Scientist policy
188
+
189
+ Primary model: **Qwen3-4B**
190
+
191
+ | Constraint | Qwen3-4B | Qwen3-8B (stretch) |
192
+ |-----------|----------|-------------------|
193
+ | H100 training (BF16, ~3-4x inference mem) | ~14GB weights, ~42-56GB total. Fits 80GB easily | ~19GB weights, ~57-76GB total. Tight |
194
+ | Colab T4 (16GB, 4-bit QLoRA) | 5.5GB. Fits comfortably | 6.5GB. Fits but less headroom |
195
+ | Structured JSON output | Good | Better |
196
+ | RL iteration speed | Fast | Slower |
197
+
198
+ Qwen3-8B is an H100-only stretch. Use it only if Qwen3-4B quality is
199
+ insufficient and the Colab demo uses a reduced-scale fallback.
200
+
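The training-memory figures in the table follow a simple multiplier rule (training memory ≈ weight memory × 3-4, covering gradients, optimizer states, and activations). A throwaway sketch, using the table's own weight figures:

```python
def training_mem_range_gb(weights_gb, low_mult=3.0, high_mult=4.0):
    """Rough BF16 training-memory band: weight memory times a 3-4x
    overhead multiplier (gradients, optimizer states, activations)."""
    return (weights_gb * low_mult, weights_gb * high_mult)

# Reproduces the table rows above
print(training_mem_range_gb(14))  # Qwen3-4B: (42.0, 56.0) -> fits 80GB H100
print(training_mem_range_gb(19))  # Qwen3-8B: (57.0, 76.0) -> tight on 80GB
```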
201
+ ### Reward
202
+
203
+ The training reward is always the **deterministic rubric engine** (E05 in the
204
+ source of truth). A hosted frontier evaluator may optionally be used for
205
+ post-episode explanation and demo audit. The frontier evaluator is never part
206
+ of the training reward loop.
207
+
208
+ ### Future model-backed Lab Manager
209
+
210
+ If the Lab Manager later becomes model-backed:
211
+ - The reward formula does not change. The deterministic rubric scores the final
212
+ protocol against ground truth constraints regardless of how the Lab Manager
213
+ generates its responses.
214
+ - Episode variance increases because the same seed may produce different
215
+ negotiation paths, but the scoring dimensions (rigor, feasibility, fidelity)
216
+ remain deterministic.
217
+ - The pragmatic default is same base model (Qwen3-4B) with a separate
218
+ role-specific adapter. One base model in memory, swap adapters per turn.
219
+ - Reward does not split into separate Scientist vs Lab Manager objectives.
220
+ Both roles share the same cooperative reward signal.
221
+
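The "one base model in memory, swap adapters per turn" idea can be sketched without any ML dependency. Names here are hypothetical; in practice this would map onto something like PEFT's multi-adapter support, with the adapter swap replacing the callable dispatch below:

```python
class RoleAdapterRouter:
    """Sketch: one resident base model, one lightweight adapter per role,
    swapped per turn instead of loading a second full model."""

    def __init__(self, base_model):
        self.base_model = base_model
        self.adapters = {}       # role name -> adapter (stand-in: a callable)
        self.active_role = None

    def register(self, role, adapter):
        self.adapters[role] = adapter

    def generate(self, role, prompt):
        self.active_role = role  # adapter swap; base weights stay in memory
        return self.adapters[role](self.base_model, prompt)

router = RoleAdapterRouter(base_model="qwen3-4b")
router.register("scientist",   lambda base, p: f"[{base}/scientist] {p}")
router.register("lab_manager", lambda base, p: f"[{base}/lab_manager] {p}")
print(router.generate("scientist", "propose protocol"))
print(router.generate("lab_manager", "review budget"))
```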
222
+ ---
223
+
224
+ ## 10. Key Risks for Person B
225
 
226
  | Risk | Impact | Mitigation |
227
  |------|--------|------------|
228
  | Person A MOD 01-03 delayed | Blocks AGT 01, MOD 09, AGT 02-04 and all downstream | Communicate priority order to Person A early |
229
  | Person C API delayed | Blocks entire training pipeline (TRN 01-15) | Coordinate with Person C on API 06 timeline |
230
+ | Qwen3-4B underperforms on structured output | Scientist produces low quality protocols | Fall back to Qwen3-8B on H100, use reduced-scale Colab fallback |
231
  | RL training produces flat rewards | No improvement to demo | Have baseline heuristic ready, tune reward weights with Person A |
232
  | Scientist produces invalid JSON | Rollout loop crashes | AGT 03 parse plus retry is critical, build it robust |
233
+ | Future model-backed Lab Manager increases variance | Slower RL convergence | Keep rule-based for MVP training, introduce model-backed only after Scientist policy is stable |
234
 
235
  ---
236
 
237
+ ## 11. Files Person B Owns
238
 
239
  | File | Purpose |
240
  |------|---------|
docs/ayush/task_list.md CHANGED
@@ -4,6 +4,13 @@ Source of truth: `ReplicaLab_Comprehensive_Task_Division.md`
4
 
5
  ---
6
 
7
  ## Epic E02. Domain Models
8
 
9
  - [ ] **MOD 09** | Add output parser that maps model text to `ScientistAction` | 0.75h | Depends: MOD 01
@@ -50,7 +57,7 @@ Source of truth: `ReplicaLab_Comprehensive_Task_Division.md`
50
  - [ ] **TRN 09** | Add policy loading path for trained adapter | 0.5h | Depends: TRN 05
51
  - [ ] **TRN 10** | Export plot image and sample logs to outputs/plots | 0.25h | Depends: TRN 07
52
  - [ ] **TRN 13** | Create reusable environment client module (client.py) | 1h | Depends: API 06
53
- - [ ] **TRN 14** | Select and document base model (notebook side) | 0.5h | Depends: TRN 01
54
  - [ ] **TRN 15** | Add agreement rate and invalid action rate aggregation | 0.5h | Depends: TRN 06, TRN 08, OBS 09
55
 
56
  ---
@@ -69,7 +76,7 @@ Source of truth: `ReplicaLab_Comprehensive_Task_Division.md`
69
 
70
  ## Shared Tasks
71
 
72
- - [ ] **FND 08** | Freeze JSON contract for actions and observations (with Person A) | 0.75h | Depends: FND 04
73
 
74
  ---
75
 
 
4
 
5
  ---
6
 
7
+ ## Current status
8
+
9
+ - `FND 04` is complete in `replicalab/models.py`
10
+ - `FND 08` is now the next unblocked Ayush task
11
+
12
+ ---
13
+
14
  ## Epic E02. Domain Models
15
 
16
  - [ ] **MOD 09** | Add output parser that maps model text to `ScientistAction` | 0.75h | Depends: MOD 01
 
57
  - [ ] **TRN 09** | Add policy loading path for trained adapter | 0.5h | Depends: TRN 05
58
  - [ ] **TRN 10** | Export plot image and sample logs to outputs/plots | 0.25h | Depends: TRN 07
59
  - [ ] **TRN 13** | Create reusable environment client module (client.py) | 1h | Depends: API 06
60
+ - [ ] **TRN 14** | Select and document base model (notebook side) | 0.5h | Depends: TRN 01 | Assumption: Qwen3-4B primary, Qwen3-8B H100-only stretch
61
  - [ ] **TRN 15** | Add agreement rate and invalid action rate aggregation | 0.5h | Depends: TRN 06, TRN 08, OBS 09
62
 
63
  ---
 
76
 
77
  ## Shared Tasks
78
 
79
+ - [ ] **FND 08** | Freeze JSON contract for actions and observations (with Person A) | 0.75h | Depends: FND 04 (done) | Status: ready now
80
 
81
  ---
82