Commit 070d5c0 by mangubee · 1 Parent(s): bd73133

Clear PLAN.md after Stage 1 completion

Files changed (1)
  1. PLAN.md +5 -215
PLAN.md CHANGED
@@ -1,218 +1,8 @@
- # Implementation Plan - Stage 1: Foundation Setup
 
- **Date:** 2026-01-01
- **Dev Record:** [dev/dev_260101_10_implementation_process_design.md](dev/dev_260101_10_implementation_process_design.md)
- **Status:** Planning
 
- ## Objective
 
- Set up infrastructure foundation for GAIA benchmark agent implementation based on Level 6 (LangGraph framework) and Level 7 (HF Spaces hosting) architectural decisions. Establish working development environment with LangGraph, configure API keys, and validate basic agent execution.
-
- ## Steps
-
- ### Step 1: Project Dependencies Setup
-
- **1.1 Create requirements.txt**
-
- - Add LangGraph core dependencies
- - Add LLM SDK dependencies (Anthropic, Google Generative AI, HuggingFace Inference)
- - Add tool dependencies (Exa SDK, requests, file parsers)
- - Add existing dependencies (gradio, pandas)
-
- **1.2 Install dependencies locally**
-
- - Use `uv pip install -r requirements.txt` for local testing
- - Verify LangGraph installation with import test
-
- ### Step 2: Environment Configuration
-
- **2.1 Create .env.example template**
-
- - Document required API keys (ANTHROPIC_API_KEY, GOOGLE_API_KEY, EXA_API_KEY, etc.)
- - Add GAIA API configuration (DEFAULT_API_URL, SPACE_ID)
-
- **2.2 Configure HF Secrets (production)**
-
- - Set ANTHROPIC_API_KEY in HF Space settings
- - Set GOOGLE_API_KEY for Gemini Flash baseline
- - Set EXA_API_KEY for web search tool
- - Verify Space can access environment variables
-
- ### Step 3: Project Structure Creation
-
- **3.1 Create module directories**
-
- ```
- 16_HuggingFace/Final_Assignment_Template/
- ├── src/
- │   ├── agent/                # LangGraph agent core
- │   │   ├── __init__.py
- │   │   └── graph.py          # StateGraph definition
- │   ├── tools/                # MCP tool implementations
- │   │   ├── __init__.py
- │   │   ├── web_search.py
- │   │   ├── code_interpreter.py
- │   │   ├── file_reader.py
- │   │   └── multimodal.py
- │   ├── config/               # Configuration management
- │   │   ├── __init__.py
- │   │   └── settings.py
- │   └── __init__.py
- ├── tests/                    # Test files
- │   └── test_agent_basic.py
- ├── app.py                    # Gradio interface (existing)
- ├── requirements.txt          # Dependencies
- └── .env.example              # Environment template
- ```
-
- **3.2 Create __init__.py files**
-
- - Enable proper Python module imports
-
- ### Step 4: LangGraph Agent Skeleton
-
- **4.1 Create src/config/settings.py**
-
- - Load environment variables
- - Define configuration constants (API URLs, timeouts, retry settings)
- - LLM model selection logic (Gemini Flash as default, Claude as fallback)
-
- **4.2 Create src/agent/graph.py**
-
- - Define AgentState TypedDict (question, plan, tool_calls, answer, errors)
- - Create empty StateGraph with placeholder nodes:
-   - `plan_node`: Placeholder for planning logic
-   - `execute_node`: Placeholder for tool execution
-   - `answer_node`: Placeholder for answer synthesis
- - Define graph edges (plan → execute → answer)
- - Compile graph
-
- **4.3 Create basic agent wrapper**
-
- - GAIAAgent class that wraps compiled graph
- - `__call__(self, question: str) -> str` method
- - Invoke graph with question input
- - Return final answer from state
-
- ### Step 5: Integration with Existing app.py
-
- **5.1 Modify app.py**
-
- - Replace BasicAgent import with GAIAAgent
- - Update agent instantiation in `run_and_submit_all`
- - Keep existing Gradio UI and API integration unchanged
- - Add error handling for agent initialization
-
- **5.2 Add logging configuration**
-
- - Configure Python logging module
- - Log agent initialization, graph compilation, question processing
- - Maintain existing print statements for Gradio UI
-
- ### Step 6: Validation & Testing
-
- **6.1 Create tests/test_agent_basic.py**
-
- - Test LangGraph agent initialization
- - Test agent with dummy question (should return placeholder answer)
- - Verify StateGraph compilation succeeds
-
- **6.2 Local testing**
-
- - Run `uv run python tests/test_agent_basic.py`
- - Run Gradio app locally: `uv run python app.py`
- - Test question submission (expect placeholder answer, not error)
-
- **6.3 HF Space deployment validation**
-
- - Push changes to HF Space repository
- - Verify Space builds successfully
- - Test Gradio interface with OAuth login
- - Submit test question to API (expect placeholder answer)
-
- ## Files to Modify
-
- **New files to create:**
-
- - `requirements.txt` - Project dependencies
- - `.env.example` - Environment variable template
- - `src/__init__.py` - Package initialization
- - `src/config/__init__.py` - Config package
- - `src/config/settings.py` - Configuration management
- - `src/agent/__init__.py` - Agent package
- - `src/agent/graph.py` - LangGraph StateGraph definition
- - `src/tools/__init__.py` - Tools package (placeholder)
- - `tests/test_agent_basic.py` - Basic validation tests
-
- **Existing files to modify:**
-
- - `app.py` - Replace BasicAgent with GAIAAgent
-
- **Files NOT to modify yet:**
-
- - `README.md` - No changes until Stage 1 complete
- - Tool implementations - Defer to Stage 2
- - Planning/execution logic - Defer to Stage 3
-
- ## Success Criteria
-
- ### Functional Requirements
-
- - [ ] LangGraph agent compiles without errors
- - [ ] Agent accepts question input and returns answer (placeholder OK)
- - [ ] Gradio UI works with new agent integration
- - [ ] HF Space deploys successfully with new dependencies
- - [ ] Environment variables load correctly (API keys accessible)
-
- ### Technical Requirements
-
- - [ ] All dependencies install without conflicts
- - [ ] Python module imports work correctly
- - [ ] StateGraph structure defined with 3 nodes (plan, execute, answer)
- - [ ] No runtime errors during agent initialization
- - [ ] Test suite passes locally
-
- ### Validation Checkpoints
-
- - [ ] **Checkpoint 1:** requirements.txt created and dependencies install locally
- - [ ] **Checkpoint 2:** Project structure created, all __init__.py files present
- - [ ] **Checkpoint 3:** LangGraph StateGraph compiles successfully
- - [ ] **Checkpoint 4:** GAIAAgent returns placeholder answer for test question
- - [ ] **Checkpoint 5:** Gradio UI works locally with new agent
- - [ ] **Checkpoint 6:** HF Space deploys and runs without errors
-
- ### Non-Goals for Stage 1
-
- - ❌ Implementing actual planning logic (Stage 3)
- - ❌ Implementing tool integrations (Stage 2)
- - ❌ Implementing error handling/retry logic (Stage 4)
- - ❌ Performance optimization (Stage 5)
- - ❌ Achieving any GAIA accuracy targets (Stage 5)
-
- ## Dependencies & Risks
-
- **Dependencies:**
-
- - HuggingFace Space deployment access
- - API keys for external services (Anthropic, Google, Exa)
- - LangGraph package availability
-
- **Risks:**
-
- - **Risk:** LangGraph version conflicts with existing dependencies
- - **Mitigation:** Test locally first, pin versions in requirements.txt
- - **Risk:** HF Space build fails with new dependencies
- - **Mitigation:** Incremental deployment, test each dependency addition
- - **Risk:** API key configuration issues in HF Secrets
- - **Mitigation:** Create .env.example with clear documentation
-
- **Estimated Time:** 1-2 days
-
- ## Next Steps After Stage 1
-
- Once Stage 1 Success Criteria met:
-
- 1. Create Stage 2 plan (Tool Development)
- 2. Implement 4 core tools as MCP servers
- 3. Test each tool independently
- 4. Proceed to Stage 3 (Agent Core)
 
+ # Implementation Plan
 
+ **Status:** Ready for next stage
+ **Last Updated:** 2026-01-02
 
+ ---
 
+ Stage 1 completed. Planning for next stage will be documented here.
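
Note on the removed plan: steps 4.2 and 4.3 describe an `AgentState` TypedDict, three placeholder nodes wired plan → execute → answer, and a `GAIAAgent` wrapper exposing `__call__(self, question: str) -> str`. The actual `src/agent/graph.py` is not part of this commit, so the following is only a minimal, dependency-free sketch of that shape. It assumes a plain Python list of functions in place of LangGraph's `StateGraph`, and the placeholder strings are invented for illustration.

```python
from typing import Callable, List, TypedDict


class AgentState(TypedDict):
    """State fields named in step 4.2 of the removed plan."""
    question: str
    plan: str
    tool_calls: List[str]
    answer: str
    errors: List[str]


def plan_node(state: AgentState) -> AgentState:
    # Placeholder: the plan defers real planning logic to Stage 3.
    return {**state, "plan": "placeholder plan"}


def execute_node(state: AgentState) -> AgentState:
    # Placeholder: the plan defers tool execution to Stage 2.
    return {**state, "tool_calls": []}


def answer_node(state: AgentState) -> AgentState:
    # Placeholder answer so callers receive a string rather than an error.
    return {**state, "answer": "placeholder answer"}


# Assumption: a linear list stands in for the plan -> execute -> answer edges;
# this is not LangGraph's API.
PIPELINE: List[Callable[[AgentState], AgentState]] = [
    plan_node,
    execute_node,
    answer_node,
]


class GAIAAgent:
    """Hypothetical wrapper with the call signature given in step 4.3."""

    def __call__(self, question: str) -> str:
        state: AgentState = {
            "question": question,
            "plan": "",
            "tool_calls": [],
            "answer": "",
            "errors": [],
        }
        # Run each placeholder node in order, threading the state through.
        for node in PIPELINE:
            state = node(state)
        return state["answer"]
```

Calling `GAIAAgent()("some question")` returns the placeholder string, which is the behavior Checkpoint 4 of the removed plan asks for; swapping the list for a compiled LangGraph graph would be the step the plan actually describes.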