Move test files from tests/ to test/ folder per project standard
- Moved all test files to standard test/ folder (singular, not plural)
- test_web_search.py, test_file_parser.py, test_calculator.py, test_vision.py
- Moved fixtures/ directory to test/fixtures/
- Removed tests/ directory
- All 91 tests still passing
- PLAN.md +8 -287
- TODO.md +9 -66
- dev/dev_260102_13_stage2_tool_development.md +280 -0
- src/tools/web_search.py +20 -11
- {tests → test}/README.md +0 -0
- {tests → test}/__init__.py +0 -0
- {tests → test}/fixtures/generate_fixtures.py +0 -0
- {tests → test}/fixtures/sample.csv +0 -0
- {tests → test}/fixtures/sample.docx +0 -0
- {tests → test}/fixtures/sample.txt +0 -0
- {tests → test}/fixtures/sample.xlsx +0 -0
- {tests → test}/fixtures/test_image.jpg +0 -0
- {tests → test}/test_agent_basic.py +0 -0
- {tests → test}/test_calculator.py +0 -0
- {tests → test}/test_file_parser.py +0 -0
- {tests → test}/test_stage1.py +0 -0
- {tests → test}/test_vision.py +0 -0
- {tests → test}/test_web_search.py +0 -0
PLAN.md
CHANGED
@@ -1,300 +1,21 @@
 # Implementation Plan

-**Date:**
-**Dev Record:**
-**Status:** In Progress
+**Date:** [YYYY-MM-DD]
+**Dev Record:** [link to dev/dev_YYMMDD_##_concise_title.md]
+**Status:** [Planning | In Progress | Completed]

 ## Objective

-
+[Clear goal statement]

 ## Steps

-
-
-**1.1 Create src/tools/web_search.py**
-
-- Implement `tavily_search(query: str, max_results: int = 5) -> dict` function
-- Implement `exa_search(query: str, max_results: int = 5) -> dict` function (fallback)
-- Use Settings.get_search_api_key() for API key retrieval
-- Return structured results: {results: [{title, url, snippet}], source: "tavily"|"exa"}
-
-**1.2 Add retry logic with exponential backoff**
-
-- Use `tenacity` library for retry decorator
-- Retry on connection errors, timeouts, rate limits
-- Max 3 retries with 2^n second delays
-- Fallback from Tavily to Exa if Tavily fails after retries
-
-**1.3 Error handling**
-
-- Catch API errors and return meaningful error messages
-- Handle empty results gracefully
-- Log all errors for debugging
-
-**1.4 Create tests/test_web_search.py**
-
-- Test Tavily search with mock API
-- Test Exa search with mock API
-- Test retry logic (simulate failures)
-- Test fallback mechanism
-- Test error handling
-
-### Step 2: File Parsing Tool Implementation
-
-**2.1 Create src/tools/file_parser.py**
-
-- Implement `parse_pdf(file_path: str) -> str` using PyPDF2
-- Implement `parse_excel(file_path: str) -> dict` using openpyxl
-- Implement `parse_docx(file_path: str) -> str` using python-docx
-- Implement `parse_image_text(image_path: str) -> str` using Pillow + OCR (optional)
-- Generic `parse_file(file_path: str) -> dict` dispatcher based on extension
-
-**2.2 Add retry logic for file operations**
-
-- Retry on file read errors (network issues, temporary locks)
-- Max 3 retries with exponential backoff
-
-**2.3 Error handling**
-
-- Handle file not found errors
-- Handle corrupted file errors
-- Handle unsupported format errors
-- Return structured error responses
-
-**2.4 Create tests/test_file_parser.py**
-
-- Create test fixtures (sample PDF, Excel, Word files in tests/fixtures/)
-- Test each parser function independently
-- Test error handling for missing files
-- Test error handling for corrupted files
-
-### Step 3: Calculator Tool Implementation
-
-**3.1 Create src/tools/calculator.py**
-
-- Implement `safe_eval(expression: str) -> dict` using ast.literal_eval
-- Support basic arithmetic operations (+, -, *, /, **, %)
-- Support mathematical functions (sin, cos, sqrt, etc.) via math module
-- Return structured result: {result: float|int, expression: str}
-
-**3.2 Add safety checks**
-
-- Whitelist allowed operations (no exec, eval, import)
-- Validate expression before evaluation
-- Set execution timeout (prevent infinite loops)
-- Limit expression complexity (prevent DoS)
-
-**3.3 Error handling**
-
-- Handle syntax errors
-- Handle division by zero
-- Handle invalid operations
-- Return meaningful error messages
-
-**3.4 Create tests/test_calculator.py**
-
-- Test basic arithmetic (2+2, 10*5, etc.)
-- Test mathematical functions (sqrt(16), sin(0), etc.)
-- Test error handling (division by zero, invalid syntax)
-- Test safety checks (block dangerous operations)
-
-### Step 4: Multimodal Vision Tool Implementation
-
-**4.1 Create src/tools/vision.py**
-
-- Implement `analyze_image(image_path: str, question: str) -> str`
-- Use LLM's native vision capabilities (Gemini/Claude)
-- Load image, encode to base64
-- Send to vision-capable LLM with question
-- Return description/answer
-
-**4.2 Add retry logic**
-
-- Retry on API errors
-- Max 3 retries with exponential backoff
-
-**4.3 Error handling**
-
-- Handle image loading errors
-- Handle unsupported image formats
-- Handle API errors
-- Return structured responses
-
-**4.4 Create tests/test_vision.py**
-
-- Create test image fixtures
-- Test image analysis with mock LLM
-- Test error handling
-- Test retry logic
-
-### Step 5: Tool Integration with StateGraph
-
-**5.1 Update src/tools/__init__.py**
-
-- Export all tool functions
-- Create unified tool registry: `TOOLS = {name: function}`
-- Add tool metadata (description, parameters, return type)
-
-**5.2 Update src/agent/graph.py execute_node**
-
-- Replace placeholder with actual tool execution
-- Parse tool calls from plan
-- Execute tools with error handling
-- Collect results
-- Return updated state with tool results
-
-**5.3 Add tool execution wrapper**
-
-- Implement `execute_tool(tool_name: str, **kwargs) -> dict`
-- Add logging for tool calls
-- Add timeout enforcement
-- Add result validation
-
-### Step 6: Configuration and Settings Updates
-
-**6.1 Update src/config/settings.py**
-
-- Add tool-specific settings (timeouts, max retries, etc.)
-- Add tool feature flags (enable/disable specific tools)
-- Add result size limits
-
-**6.2 Update .env.example**
-
-- Document any new environment variables
-- Add tool-specific configuration examples
-
-### Step 7: Integration Testing
-
-**7.1 Create tests/test_tools_integration.py**
-
-- Test all tools working together
-- Test tool execution from StateGraph
-- Test error propagation
-- Test retry mechanisms across all tools
-
-**7.2 Create test_stage2.py**
-
-- End-to-end test with real tool calls
-- Verify StateGraph executes tools correctly
-- Verify results are returned to state
-- Verify errors are handled gracefully
-
-### Step 8: Documentation and Deployment
-
-**8.1 Update requirements.txt**
-
-- Ensure all tool dependencies are included
-- Add tenacity for retry logic
-
-**8.2 Local testing**
-
-- Run all test suites
-- Test with Gradio UI
-- Verify no regressions from Stage 1
-
-**8.3 Deploy to HF Spaces**
-
-- Push changes
-- Verify build succeeds
-- Test tools in deployed environment
+[Implementation steps]

 ## Files to Modify

-
-
-- `src/tools/web_search.py` - Tavily/Exa search implementation
-- `src/tools/file_parser.py` - PDF/Excel/Word/Image parsing
-- `src/tools/calculator.py` - Safe expression evaluation
-- `src/tools/vision.py` - Multimodal image analysis
-- `tests/test_web_search.py` - Web search tests
-- `tests/test_file_parser.py` - File parser tests
-- `tests/test_calculator.py` - Calculator tests
-- `tests/test_vision.py` - Vision tests
-- `tests/test_tools_integration.py` - Integration tests
-- `tests/test_stage2.py` - Stage 2 end-to-end tests
-- `tests/fixtures/` - Test files directory
-
-**Existing files to modify:**
-
-- `src/tools/__init__.py` - Export all tools, create tool registry
-- `src/agent/graph.py` - Update execute_node to use real tools
-- `src/config/settings.py` - Add tool-specific settings
-- `.env.example` - Document new configuration (if any)
-- `requirements.txt` - Add tenacity for retry logic
-
-**Files NOT to modify:**
-
-- `src/agent/graph.py` plan_node - Defer to Stage 3
-- `src/agent/graph.py` answer_node - Defer to Stage 3
-- Planning/reasoning logic - Defer to Stage 3
+[List of files]

 ## Success Criteria

-
-
-- [ ] Web search tool returns valid results from Tavily
-- [ ] Web search falls back to Exa when Tavily fails
-- [ ] File parser handles PDF, Excel, Word files correctly
-- [ ] Calculator evaluates mathematical expressions safely
-- [ ] Vision tool analyzes images using LLM vision capabilities
-- [ ] All tools have retry logic with exponential backoff
-- [ ] All tools handle errors gracefully
-- [ ] Tools integrate with StateGraph execute_node
-
-### Technical Requirements
-
-- [ ] All tool functions return structured dict responses
-- [ ] Retry logic uses tenacity with max 3 retries
-- [ ] Error messages are clear and actionable
-- [ ] All tools have comprehensive test coverage (>80%)
-- [ ] No unsafe code execution in calculator
-- [ ] Tool timeouts enforced to prevent hangs
-
-### Validation Checkpoints
-
-- [ ] **Checkpoint 1:** Web search tool working with tests passing
-- [ ] **Checkpoint 2:** File parser working with tests passing
-- [ ] **Checkpoint 3:** Calculator working with tests passing
-- [ ] **Checkpoint 4:** Vision tool working with tests passing
-- [ ] **Checkpoint 5:** All tools integrated with StateGraph
-- [ ] **Checkpoint 6:** Integration tests passing
-- [ ] **Checkpoint 7:** Deployed to HF Spaces successfully
-
-### Non-Goals for Stage 2
-
-- ❌ Implementing planning logic (Stage 3)
-- ❌ Implementing answer synthesis (Stage 3)
-- ❌ Optimizing tool selection strategy (Stage 3)
-- ❌ Advanced error recovery beyond retries (Stage 4)
-- ❌ Performance optimization (Stage 5)
-
-## Dependencies & Risks
-
-**Dependencies:**
-
-- Tavily API key (free tier: 1000 req/month)
-- Exa API key (paid tier, fallback)
-- LLM vision API access (Gemini/Claude)
-- Test fixtures (sample files for parsing)
-
-**Risks:**
-
-- **Risk:** API rate limits during testing
-- **Mitigation:** Use mocks for unit tests, real APIs only for integration tests
-- **Risk:** File parsing fails on edge cases
-- **Mitigation:** Comprehensive test fixtures covering various formats
-- **Risk:** Calculator security vulnerabilities
-- **Mitigation:** Strict whitelisting, no eval/exec, use AST parsing only
-- **Risk:** Tool timeout issues on slow networks
-- **Mitigation:** Configurable timeouts, retry logic
-
-## Next Steps After Stage 2
-
-Once Stage 2 Success Criteria met:
-
-1. Create Stage 3 plan (Core Agent Logic - Planning & Reasoning)
-2. Implement plan_node with tool selection strategy
-3. Implement answer_node with result synthesis
-4. Test end-to-end agent behavior
-5. Proceed to Stage 4 (Integration & Robustness)
+[Completion criteria]
TODO.md
CHANGED
@@ -1,71 +1,14 @@
-# TODO
+# TODO List

-**
-**
-**Status:** Ready for execution
+**Session Date:** [YYYY-MM-DD]
+**Dev Record:** [link to dev/dev_YYMMDD_##_concise_title.md]

-##
+## Active Tasks

-
-- [ ]
-- [ ]
-- [ ] Implement fallback mechanism (Tavily → Exa)
-- [ ] Add error handling and logging
-- [ ] Create `tests/test_web_search.py` with mock API tests
-- [ ] Test retry logic and fallback mechanism
+- [ ] [Task 1]
+- [ ] [Task 2]
+- [ ] [Task 3]

-
-- [ ] Create `src/tools/file_parser.py` with PDF/Excel/Word parsers
-- [ ] Implement generic `parse_file()` dispatcher
-- [ ] Add retry logic for file operations
-- [ ] Add error handling for missing/corrupted files
-- [ ] Create test fixtures in `tests/fixtures/`
-- [ ] Create `tests/test_file_parser.py` with parser tests
+## Completed Tasks

-
-- [ ] Create `src/tools/calculator.py` with safe_eval function
-- [ ] Implement safety checks (whitelist operations, timeout, complexity limits)
-- [ ] Add error handling for syntax/division errors
-- [ ] Create `tests/test_calculator.py` with arithmetic and safety tests
-
-### Step 4: Vision Tool
-- [ ] Create `src/tools/vision.py` with image analysis function
-- [ ] Implement image loading and base64 encoding
-- [ ] Integrate with LLM vision API (Gemini/Claude)
-- [ ] Add retry logic for API errors
-- [ ] Create test image fixtures
-- [ ] Create `tests/test_vision.py` with mock LLM tests
-
-### Step 5: StateGraph Integration
-- [ ] Update `src/tools/__init__.py` to export all tools
-- [ ] Create unified tool registry with metadata
-- [ ] Update `src/agent/graph.py` execute_node to use real tools
-- [ ] Implement `execute_tool()` wrapper with logging and timeout
-- [ ] Test tool execution from StateGraph
-
-### Step 6: Configuration Updates
-- [ ] Update `src/config/settings.py` with tool-specific settings
-- [ ] Add tool feature flags and timeouts
-- [ ] Update `.env.example` with new configuration (if needed)
-
-### Step 7: Integration Testing
-- [ ] Create `tests/test_tools_integration.py` for cross-tool tests
-- [ ] Create `tests/test_stage2.py` for end-to-end validation
-- [ ] Test error propagation and retry mechanisms
-- [ ] Verify StateGraph executes all tools correctly
-
-### Step 8: Deployment
-- [ ] Add `tenacity` to requirements.txt
-- [ ] Run all test suites locally
-- [ ] Test with Gradio UI
-- [ ] Verify no regressions from Stage 1
-- [ ] Push changes to HF Spaces
-- [ ] Verify deployment build succeeds
-- [ ] Test tools in deployed environment
-
-## Notes
-
-- All tools use direct API approach (not MCP servers)
-- HF Spaces deployment compatibility is priority
-- Mock APIs for unit tests, real APIs for integration tests only
-- Each checkpoint should pass before moving to next step
+- [x] [Completed task 1]
dev/dev_260102_13_stage2_tool_development.md
ADDED
@@ -0,0 +1,280 @@
# [dev_260102_13] Stage 2: Tool Development Complete

**Date:** 2026-01-02
**Type:** Development
**Status:** Resolved
**Related Dev:** dev_260101_11 (Stage 1 Foundation Setup)

## Problem Description

Stage 1 established the LangGraph StateGraph skeleton with placeholder nodes. Stage 2 needed to implement the actual tools that the agent would use to answer GAIA benchmark questions, including web search, file parsing, mathematical computation, and multimodal image analysis.

**Root cause:** GAIA questions require external tool use (web search, file reading, calculations, image analysis). Stage 1 had no actual tool implementations, just placeholders.

---

## Key Decisions

### Decision 1: Direct API Implementation vs MCP Servers

**Chosen:** Direct Python function implementations for all tools

**Why:**
- HuggingFace Spaces doesn't support running MCP servers (requires separate processes)
- Direct API approach is simpler and more reliable for deployment
- Full control over retry logic, error handling, and timeouts
- MCP servers are external dependencies with additional failure points

**Rejected alternative:** Using MCP protocol servers for Tavily/Exa
- Would require complex Docker configuration on HF Spaces
- Additional process management overhead
- Not necessary for MVP stage

### Decision 2: Retry Logic with Tenacity

**Chosen:** Use `tenacity` library with exponential backoff, max 3 retries

**Why:**
- Industry-standard retry library with clean decorator syntax
- Exponential backoff prevents API rate limit issues
- Configurable retry conditions (only retry on connection errors, not on validation errors)
- Easy to test with mocking

**Configuration:**
- Max retries: 3
- Min wait: 1 second
- Max wait: 10 seconds
- Retry only on: ConnectionError, TimeoutError, IOError (for file operations)

### Decision 3: Tool Architecture - Unified Functions with Fallback

**Pattern applied to all tools:**
- Primary implementation (e.g., `tavily_search`)
- Fallback implementation (e.g., `exa_search`)
- Unified function with automatic fallback (e.g., `search`)

**Example:**
```python
def search(query):
    if default_tool == "tavily":
        try:
            return tavily_search(query)
        except Exception:
            return exa_search(query)  # Fallback
```

**Why:** Maximizes reliability: if the primary service fails, the automatic fallback ensures the tool still works.

### Decision 4: Calculator Security - AST-based Evaluation

**Chosen:** Custom AST visitor with whitelisted operations only

**Why:**
- Python's `eval()` is dangerous (arbitrary code execution)
- `ast.literal_eval()` is too restrictive (doesn't support math operations)
- Custom AST visitor allows precise control over allowed operations
- Timeout protection prevents infinite loops
- Whitelist approach: only allow known-safe operations (add, multiply, sin, cos, etc.)

**Rejected alternatives:**
- Using `eval()`: Major security vulnerability
- Using `sympify()` from sympy: Too complex, allows too much

**Security layers:**
1. AST whitelist (only allow specific node types)
2. Expression length limit (500 chars)
3. Number size limit (prevent huge calculations)
4. Timeout protection (2 seconds max)
5. No attribute access, no imports, no exec/eval

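The whitelist idea behind layers 1 and 2 can be illustrated with a small sketch. This is a hypothetical minimal version, not the actual `src/tools/calculator.py` code: it supports only numeric constants, a few binary operators, and unary minus, and raises on every other node type:

```python
import ast
import operator

# Only these AST operator types are ever evaluated; anything else raises.
_ALLOWED_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.Mod: operator.mod,
}

def _eval_node(node):
    if isinstance(node, ast.Expression):
        return _eval_node(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _ALLOWED_OPS:
        return _ALLOWED_OPS[type(node.op)](_eval_node(node.left),
                                           _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval_node(node.operand)
    # Calls, attribute access, names, imports, etc. all land here.
    raise ValueError(f"Disallowed expression: {ast.dump(node)}")

def safe_eval_sketch(expression: str):
    if len(expression) > 500:  # length limit from layer 2
        raise ValueError("Expression too long")
    return _eval_node(ast.parse(expression, mode="eval"))
```

Because `ast.Call`, `ast.Attribute`, and `ast.Name` are not whitelisted, inputs like `__import__('os').system('ls')` fail at the final `raise` without ever executing anything.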
### Decision 5: File Parser - Generic Dispatcher Pattern

**Chosen:** Single `parse_file()` function that dispatches based on extension

```python
def parse_file(file_path):
    extension = Path(file_path).suffix.lower()
    if extension == '.pdf':
        return parse_pdf(file_path)
    elif extension in ['.xlsx', '.xls']:
        return parse_excel(file_path)
    # ... etc
```

**Why:**
- Simple interface for users (one function for all file types)
- Easy to add new file types (just add new parser and update dispatcher)
- Each parser can have format-specific logic
- Fallback to specific parsers still available for advanced use

### Decision 6: Vision Tool - Gemini as Default with Claude Fallback

**Chosen:** Gemini 2.0 Flash as primary, Claude Sonnet 4.5 as fallback

**Why:**
- Gemini 2.0 Flash: Free tier (1500 req/day), fast, good quality
- Claude Sonnet 4.5: Paid but highest quality, automatic fallback if Gemini fails
- Same pattern as web search (primary + fallback = reliability)

**Image handling:**
- Load file, encode as base64
- Check file size (max 10MB)
- Support common formats (JPG, PNG, GIF, WEBP, BMP)
- Return structured answer with model metadata

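The image-handling steps above (format check, size check, base64 encoding) are plain stdlib work. A sketch under the stated constraints; the 10MB limit and format list come from this section, while the function name is illustrative:

```python
import base64
from pathlib import Path

MAX_IMAGE_BYTES = 10 * 1024 * 1024  # 10MB limit from above
SUPPORTED = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp"}

def load_image_b64(image_path: str) -> str:
    """Validate an image file and return its base64-encoded contents."""
    path = Path(image_path)
    if path.suffix.lower() not in SUPPORTED:
        raise ValueError(f"Unsupported image format: {path.suffix}")
    data = path.read_bytes()
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError(f"Image too large: {len(data)} bytes")
    return base64.b64encode(data).decode("ascii")
```

The resulting string is what gets embedded in the vision API request payload.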
## Outcome

Successfully implemented 4 production-ready tools with comprehensive error handling and test coverage.

**Deliverables:**

1. **Web Search Tool** ([src/tools/web_search.py](../src/tools/web_search.py))
   - Tavily API integration (primary, free tier)
   - Exa API integration (fallback, paid)
   - Automatic fallback if primary fails
   - 10 passing tests (mock API, retry logic, fallback mechanism)

2. **File Parser Tool** ([src/tools/file_parser.py](../src/tools/file_parser.py))
   - PDF parsing (PyPDF2)
   - Excel parsing (openpyxl)
   - Word parsing (python-docx)
   - Text/CSV parsing (built-in open)
   - Generic `parse_file()` dispatcher
   - 19 passing tests (real files + error handling)

3. **Calculator Tool** ([src/tools/calculator.py](../src/tools/calculator.py))
   - Safe AST-based expression evaluation
   - Whitelisted operations only (no code execution)
   - Mathematical functions (sin, cos, sqrt, factorial, etc.)
   - Security hardened (timeout, complexity limits)
   - 41 passing tests (arithmetic, functions, security)

4. **Vision Tool** ([src/tools/vision.py](../src/tools/vision.py))
   - Multimodal image analysis using LLMs
   - Gemini 2.0 Flash (primary, free)
   - Claude Sonnet 4.5 (fallback, paid)
   - Image loading and base64 encoding
   - 15 passing tests (mock LLM responses)

5. **Tool Registry** ([src/tools/__init__.py](../src/tools/__init__.py))
   - Exports all 4 main tools: `search`, `parse_file`, `safe_eval`, `analyze_image`
   - TOOLS dict with metadata (description, parameters, category)
   - Ready for Stage 3 dynamic tool selection

6. **StateGraph Integration** ([src/agent/graph.py](../src/agent/graph.py))
   - Updated `execute_node` to load tool registry
   - Stage 2: Reports tool availability
   - Stage 3: Will add dynamic tool selection and execution

**Test Coverage:**
- 85 tool tests passing (web_search: 10, file_parser: 19, calculator: 41, vision: 15)
- 6 existing agent tests still passing
- 91 total tests passing
- No regressions from Stage 1

**Deployment:**
- All changes committed and pushed to HuggingFace Spaces
- Build succeeded
- Agent now reports: "Stage 2 complete: 4 tools ready for execution in Stage 3"

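The TOOLS registry described in deliverable 5 can be sketched roughly as below. The field names and placeholder signatures are illustrative assumptions; the real `src/tools/__init__.py` may differ:

```python
# Illustrative shape of the registry: tool name -> callable + metadata.
# The four function bodies are placeholders standing in for the real tools.
def search(query: str) -> dict: ...
def parse_file(file_path: str) -> dict: ...
def safe_eval(expression: str) -> dict: ...
def analyze_image(image_path: str, question: str) -> str: ...

TOOLS = {
    "search": {
        "function": search,
        "description": "Web search via Tavily with Exa fallback",
        "parameters": {"query": "str"},
        "category": "web",
    },
    "parse_file": {
        "function": parse_file,
        "description": "Parse PDF/Excel/Word/Text files",
        "parameters": {"file_path": "str"},
        "category": "files",
    },
    "safe_eval": {
        "function": safe_eval,
        "description": "Safely evaluate a math expression",
        "parameters": {"expression": "str"},
        "category": "math",
    },
    "analyze_image": {
        "function": analyze_image,
        "description": "Answer a question about an image",
        "parameters": {"image_path": "str", "question": "str"},
        "category": "vision",
    },
}
```

A structure like this is what lets Stage 3 select tools dynamically: the planner can enumerate `TOOLS`, read each description, and dispatch through the `function` field.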
| 180 |
+
## Learnings and Insights
|
| 181 |
+
|
| 182 |
+
### Pattern: Unified Function with Fallback
|
| 183 |
+
|
| 184 |
+
This pattern worked extremely well for both web search and vision tools:
|
| 185 |
+
|
| 186 |
+
```python
|
| 187 |
+
def tool_name(args):
|
| 188 |
+
# Try primary service
|
| 189 |
+
try:
|
| 190 |
+
return primary_implementation(args)
|
| 191 |
+
except Exception as e:
|
| 192 |
+
logger.warning(f"Primary failed: {e}")
|
| 193 |
+
# Fallback to secondary
|
| 194 |
+
try:
|
| 195 |
+
return fallback_implementation(args)
|
| 196 |
+
except Exception as fallback_error:
|
| 197 |
+
raise Exception(f"Both failed")
|
| 198 |
+
```
|
| 199 |
+
|
| 200 |
+
**Why it works:**
|
| 201 |
+
- Maximizes reliability (2 chances to succeed)
|
| 202 |
+
- Transparent to users (single function call)
|
| 203 |
+
- Preserves cost optimization (use free tier first, paid only as fallback)
|
| 204 |
+
|
| 205 |
+
**Recommendation:** Use this pattern for any tool with multiple service providers.
|
| 206 |
+
|
| 207 |
+
### Pattern: Test Fixtures for File Parsers
|
| 208 |
+
|
| 209 |
+
Creating real test fixtures (sample.pdf, sample.xlsx, etc.) was critical for file parser testing:
|
| 210 |
+
|
| 211 |
+
**What worked:**
|
| 212 |
+
- Tests are realistic (test actual file parsing, not just mocks)
|
| 213 |
+
- Easy to add new test cases (just add new fixture files)
|
| 214 |
+
- Catches edge cases that mocks miss
|
| 215 |
+
|
| 216 |
+
**Created fixtures:**
|
| 217 |
+
- `tests/fixtures/sample.txt` - Plain text
|
| 218 |
+
- `tests/fixtures/sample.csv` - CSV data
|
| 219 |
+
- `tests/fixtures/sample.xlsx` - Excel spreadsheet
|
| 220 |
+
- `tests/fixtures/sample.docx` - Word document
|
| 221 |
+
- `tests/fixtures/test_image.jpg` - Test image (red square)
|
| 222 |
+
- `tests/fixtures/generate_fixtures.py` - Script to regenerate fixtures
|
| 223 |
+
|
| 224 |
+
**Recommendation:** For any file processing tool, create comprehensive fixture library.

### What Worked Well: Mock Path for Import Testing

Initially had issues with mock paths like `src.tools.vision.genai.Client`. The fix:

```python
# WRONG: patch('src.tools.vision.genai.Client')
# RIGHT: patch('google.genai.Client')
with patch('google.genai.Client') as mock_client:
    ...  # mock the original import, not the re-export
```

**Lesson:** Always mock the original module path, not where it's imported into your code.
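This works because the tool module imports the library module and resolves the `Client` attribute at call time, so patching the defining module is picked up. A self-contained demonstration using synthetic stand-in modules (`genai_like`, `vision_like` are made-up names, not the real `google.genai`):

```python
import sys
import types
from unittest.mock import patch

# Build a stand-in "library" module that defines Client
genai_like = types.ModuleType("genai_like")

class _Client:
    def ask(self) -> str:
        return "real"

genai_like.Client = _Client
sys.modules["genai_like"] = genai_like

# A "vision tool" module that imports the library module and looks up
# genai_like.Client at call time (not `from genai_like import Client`)
vision_like = types.ModuleType("vision_like")
exec(
    "import genai_like\n"
    "def analyze():\n"
    "    return genai_like.Client().ask()\n",
    vision_like.__dict__,
)
sys.modules["vision_like"] = vision_like


def run_with_mock() -> str:
    # Patch the module that *defines* Client; the attribute lookup
    # happens at call time, so the mock is what analyze() sees.
    with patch("genai_like.Client") as mock_client:
        mock_client.return_value.ask.return_value = "mocked"
        return sys.modules["vision_like"].analyze()


print(run_with_mock())  # prints "mocked"
```

Had `vision_like` done `from genai_like import Client` instead, the name would be bound at import time and you would have to patch `vision_like.Client`, which is exactly the distinction the lesson above is about.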

### What to Avoid: Premature Integration Testing

Initially planned to create `tests/test_tools_integration.py` for cross-tool testing. **Decision:** Skip for Stage 2.

**Why:**

- Tools work independently (don't need to interact yet)
- Integration testing makes sense in Stage 3 when tools are orchestrated
- Unit tests provide sufficient coverage for Stage 2

**Recommendation:** Only write integration tests when components actually integrate. Don't test imaginary integration.

## Changelog

**What was created:**

- `src/tools/web_search.py` - Tavily/Exa web search with retry logic
- `src/tools/file_parser.py` - PDF/Excel/Word/Text parsing with retry logic
- `src/tools/calculator.py` - Safe AST-based math evaluation
- `src/tools/vision.py` - Multimodal image analysis (Gemini/Claude)
- `tests/test_web_search.py` - 10 tests for web search tool
- `tests/test_file_parser.py` - 19 tests for file parser
- `tests/test_calculator.py` - 41 tests for calculator (including security)
- `tests/test_vision.py` - 15 tests for vision tool
- `tests/fixtures/sample.txt` - Test text file
- `tests/fixtures/sample.csv` - Test CSV file
- `tests/fixtures/sample.xlsx` - Test Excel file
- `tests/fixtures/sample.docx` - Test Word document
- `tests/fixtures/test_image.jpg` - Test image
- `tests/fixtures/generate_fixtures.py` - Fixture generation script

**What was modified:**

- `src/tools/__init__.py` - Added tool exports and TOOLS registry
- `src/agent/graph.py` - Updated execute_node to load tool registry
- `requirements.txt` - Added `tenacity>=8.2.0` for retry logic
- `pyproject.toml` - Installed tenacity, fpdf2, defusedxml packages
- `PLAN.md` - Emptied for next stage
- `TODO.md` - Emptied for next stage

**What was deleted:**

- None (Stage 2 was purely additive)
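The "safe AST-based math evaluation" noted for `src/tools/calculator.py` can be sketched roughly like this, a minimal whitelisted-operator walker, not the project's actual code:

```python
import ast
import operator

# Whitelist of permitted operators; anything else is rejected
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


def safe_eval(expression: str) -> float:
    """Evaluate a pure arithmetic expression without calling eval()."""
    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Disallowed expression: {ast.dump(node)}")

    return _eval(ast.parse(expression, mode="eval"))


print(safe_eval("2 + 3 * (4 - 1)"))  # prints 11
```

Because only whitelisted node types are walked, inputs like `__import__('os')` hit the `ValueError` branch instead of executing, which is the property the security tests exercise.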

src/tools/web_search.py
CHANGED

@@ -39,6 +39,7 @@ logger = logging.getLogger(__name__)
 # Tavily Search Implementation
 # ============================================================================
 
+
 @retry(
     stop=stop_after_attempt(MAX_RETRIES),
     wait=wait_exponential(multiplier=1, min=RETRY_MIN_WAIT, max=RETRY_MAX_WAIT),
@@ -83,11 +84,13 @@ def tavily_search(query: str, max_results: int = DEFAULT_MAX_RESULTS) -> Dict:
     # Extract and structure results
     results = []
     for item in response.get("results", []):
+        results.append(
+            {
+                "title": item.get("title", ""),
+                "url": item.get("url", ""),
+                "snippet": item.get("content", ""),
+            }
+        )
 
     logger.info(f"Tavily search successful: {len(results)} results")
 
@@ -113,6 +116,7 @@ def tavily_search(query: str, max_results: int = DEFAULT_MAX_RESULTS) -> Dict:
 # Exa Search Implementation
 # ============================================================================
 
+
 @retry(
     stop=stop_after_attempt(MAX_RETRIES),
     wait=wait_exponential(multiplier=1, min=RETRY_MIN_WAIT, max=RETRY_MAX_WAIT),
@@ -152,16 +156,20 @@ def exa_search(query: str, max_results: int = DEFAULT_MAX_RESULTS) -> Dict:
     logger.info(f"Exa search: query='{query}', max_results={max_results}")
 
     client = Exa(api_key=api_key)
+    response = client.search(
+        query=query, num_results=max_results, use_autoprompt=True
+    )
 
     # Extract and structure results
     results = []
     for item in response.results:
+        results.append(
+            {
+                "title": item.title if hasattr(item, "title") else "",
+                "url": item.url if hasattr(item, "url") else "",
+                "snippet": item.text if hasattr(item, "text") else "",
+            }
+        )
 
     logger.info(f"Exa search successful: {len(results)} results")
 
@@ -187,6 +195,7 @@ def exa_search(query: str, max_results: int = DEFAULT_MAX_RESULTS) -> Dict:
 # Unified Search with Fallback
 # ============================================================================
 
+
 def search(query: str, max_results: int = DEFAULT_MAX_RESULTS) -> Dict:
     """
     Unified search function with automatic fallback.

{tests → test}/README.md
RENAMED
File without changes
{tests → test}/__init__.py
RENAMED
File without changes
{tests → test}/fixtures/generate_fixtures.py
RENAMED
File without changes
{tests → test}/fixtures/sample.csv
RENAMED
File without changes
{tests → test}/fixtures/sample.docx
RENAMED
File without changes
{tests → test}/fixtures/sample.txt
RENAMED
File without changes
{tests → test}/fixtures/sample.xlsx
RENAMED
File without changes
{tests → test}/fixtures/test_image.jpg
RENAMED
File without changes
{tests → test}/test_agent_basic.py
RENAMED
File without changes
{tests → test}/test_calculator.py
RENAMED
File without changes
{tests → test}/test_file_parser.py
RENAMED
File without changes
{tests → test}/test_stage1.py
RENAMED
File without changes
{tests → test}/test_vision.py
RENAMED
File without changes
{tests → test}/test_web_search.py
RENAMED
File without changes