VibecoderMcSwaggins committed on
Commit 10e320d · 1 Parent(s): b1d094d

docs: fix spec half-measures (line refs, test counts, acceptance criteria)

SPEC_01:
- Line reference: 94 → 111 (actual location)
- Acceptance criteria: Mark all 4 items as done

SPEC_02:
- Test count: 140 → 141
- Test Matrix: Mark both modes as IMPLEMENTED
- Acceptance criteria: Mark 4/5 items as done

Previous agent claimed "updated" but left stale values.
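
A stale `file:line` reference like the `94` vs `111` case above can be caught mechanically. The following is a minimal sketch, not part of this commit: `find_line_refs` and `ref_is_current` are hypothetical helper names, and the snippet a reference is expected to point at (for example `.with_standard_manager(`) is an assumption the caller supplies.

```python
import re
from pathlib import Path

# Matches references such as "src/orchestrator_magentic.py:111".
REF_PATTERN = re.compile(r"(?P<path>[\w./-]+\.py):(?P<line>\d+)")


def find_line_refs(text):
    """Return (path, line_number) pairs for every file:line reference in text."""
    return [(m.group("path"), int(m.group("line"))) for m in REF_PATTERN.finditer(text)]


def ref_is_current(repo_root, path, line_number, expected_snippet):
    """True if 1-indexed line `line_number` of `path` contains `expected_snippet`."""
    source = Path(repo_root) / path
    if not source.is_file():
        return False
    lines = source.read_text(encoding="utf-8").splitlines()
    if not 1 <= line_number <= len(lines):
        return False
    return expected_snippet in lines[line_number - 1]
```

Running `find_line_refs` over a spec and then `ref_is_current` on each hit would flag a reference whose target line has moved, assuming the repository is checked out at `repo_root`.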

docs/specs/SPEC_01_DEMO_TERMINATION.md CHANGED
@@ -16,7 +16,7 @@ Advanced (Magentic) mode runs indefinitely from user perspective. The demo was m
 ### Question 1: Does max_round_count actually work?
 
 ```python
-# Current code (src/orchestrator_magentic.py:94)
+# Current code (src/orchestrator_magentic.py:111)
 .with_standard_manager(
 chat_client=manager_client,
 max_round_count=self._max_rounds, # Default: 10
@@ -99,10 +99,12 @@ if len(evidence) >= 20:
 
 ## Acceptance Criteria
 
-- [ ] Demo completes in <5 minutes with visible progress
-- [ ] User sees round count (e.g., "Round 3/5")
-- [ ] Always produces SOME output (even if partial)
-- [ ] Timeout prevents infinite running
+- [x] Demo completes in <5 minutes with visible progress
+- [x] User sees round count (e.g., "Round 3/5")
+- [x] Always produces SOME output (even if partial)
+- [x] Timeout prevents infinite running
+
+**Status: IMPLEMENTED** (commit b1d094d)
 
 ## Test Plan
 
docs/specs/SPEC_02_E2E_TESTING.md CHANGED
@@ -4,7 +4,7 @@
 
 ## Problem Statement
 
-We have 140 unit tests that verify individual components work, but **no test that proves the full pipeline produces useful research output**.
+We have 141 unit tests that verify individual components work, but **no test that proves the full pipeline produces useful research output**.
 
 We don't know if:
 1. Simple mode produces a valid report
@@ -115,14 +115,14 @@ async def test_real_pubmed_search():
 
 | Mode | Mock | Real API | Status |
 |------|------|----------|--------|
-| Simple (Free) | ✅ Need | ⏳ Optional | Not implemented |
-| Advanced (OpenAI) | ✅ Need | ⏳ Optional | Not implemented |
+| Simple (Free) | ✅ Done | ⏳ Optional | ✅ IMPLEMENTED |
+| Advanced (OpenAI) | ✅ Done | ⏳ Optional | ✅ IMPLEMENTED |
 
 ## Directory Structure
 
 ```
 tests/
-├── unit/ # Existing 140 tests
+├── unit/ # Existing 141 tests
 ├── integration/ # Real API tests (existing)
 └── e2e/ # NEW: Full pipeline tests
     ├── conftest.py # E2E fixtures
@@ -132,11 +132,13 @@ tests/
 
 ## Acceptance Criteria
 
-- [ ] E2E test for Simple mode (mocked)
-- [ ] E2E test for Advanced mode (mocked)
-- [ ] Tests validate output structure
-- [ ] Tests run in CI (<2 minutes)
-- [ ] At least one integration test with real API
+- [x] E2E test for Simple mode (mocked)
+- [x] E2E test for Advanced mode (mocked)
+- [x] Tests validate output structure
+- [x] Tests run in CI (<2 minutes)
+- [ ] At least one integration test with real API (existing in tests/integration/)
+
+**Status: IMPLEMENTED** (commit b1d094d)
 
 ## Why Before OpenAlex?
 
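
The "4/5 items" figure in the commit message can be reproduced by tallying the task-list checkboxes in the SPEC_02 acceptance criteria. This is a minimal sketch for illustration only; `tally_checkboxes` is a hypothetical helper, not something this repository is known to contain.

```python
import re

# A Markdown task-list item: "- [ ] ..." (open) or "- [x] ..." (done).
CHECKBOX = re.compile(r"^[ \t]*[-*] \[(?P<mark>[ xX])\] ", re.MULTILINE)


def tally_checkboxes(markdown):
    """Return (done, total) for the task-list items in a Markdown string."""
    marks = [m.group("mark") for m in CHECKBOX.finditer(markdown)]
    done = sum(1 for mark in marks if mark.lower() == "x")
    return done, len(marks)


criteria = """\
- [x] E2E test for Simple mode (mocked)
- [x] E2E test for Advanced mode (mocked)
- [x] Tests validate output structure
- [x] Tests run in CI (<2 minutes)
- [ ] At least one integration test with real API (existing in tests/integration/)
"""

done, total = tally_checkboxes(criteria)
print(f"{done}/{total} criteria done")  # prints "4/5 criteria done"
```

A check like this could run before a spec is marked as implemented, so that the status line and the checkbox count are compared rather than updated by hand.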