# Testing (still under development)

The AI Imaging Agent uses pytest for testing. This guide covers running tests and writing new ones.

**Note:** Tests for the agent are still under development, so this section is not yet applicable.

## Running Tests

### Basic Usage

```bash
# Run all tests
pytest

# Run specific test file
pytest tests/test_retrieval_pipeline.py

# Run specific test
pytest tests/test_retrieval_pipeline.py::test_basic_retrieval

# Run with verbose output
pytest -v

# Run with coverage
pytest --cov=ai_agent --cov-report=html
```

### Test Categories

Tests are marked by category:

```bash
# Run only unit tests
pytest -m unit

# Run only integration tests
pytest -m integration

# Skip slow tests
pytest -m "not slow"
```
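
For `-m` selection to work without unknown-marker warnings, the markers have to be registered somewhere. One possible way, sketched here for a `conftest.py`, is the `pytest_configure` hook with `config.addinivalue_line`. The marker descriptions are assumptions made for illustration, not the project's actual wording.

```python
# conftest.py (sketch): register the markers used with `pytest -m ...`.
# The descriptions below are illustrative assumptions.
MARKERS = {
    "unit": "fast tests with no external dependencies",
    "integration": "tests that use real indexes, files, or VLM calls",
    "slow": "tests that take a long time to run",
}


def pytest_configure(config):
    """pytest hook: add each marker to the `markers` ini option."""
    for name, description in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```

The same registration can instead live under the `markers` option in the pytest configuration file. Running with `--strict-markers` then turns a misspelled marker into an error.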

## Test Organization

### Directory Structure

```
tests/
├── data/
│   ├── test_data.json         # Test cases
│   └── 0002.DCM               # Sample DICOM file
├── test_retrieval_pipeline.py # Retrieval tests
├── test_deepwiki_repo_info.py # Repo info tests
├── test_gpt4o_vision.py       # VLM tests (integration)
└── __pycache__/
```

### Test File Naming

- `test_*.py`: Test files
- `*_test.py`: Alternative naming (less common)
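
These two patterns are pytest's default `python_files` globs. As a rough illustration of which file names would be collected, the check can be sketched with the standard library. The helper name `is_test_file` is hypothetical and is not part of pytest.

```python
from fnmatch import fnmatchcase

# pytest's default `python_files` patterns, matched against the file name
DEFAULT_PATTERNS = ("test_*.py", "*_test.py")


def is_test_file(filename: str) -> bool:
    """Return True if `filename` matches one of the default patterns."""
    return any(fnmatchcase(filename, pattern) for pattern in DEFAULT_PATTERNS)


if __name__ == "__main__":
    for name in ("test_retrieval_pipeline.py", "retrieval_test.py", "helpers.py"):
        print(f"{name}: {is_test_file(name)}")
```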

### Test Function Naming

```python
def test_basic_retrieval():
    """Test basic retrieval functionality."""
    pass


def test_edge_case_empty_query():
    """Test handling of empty query."""
    pass


def test_integration_full_pipeline():
    """Integration test for complete pipeline."""
    pass
```

## Writing Tests

### Unit Test Example

```python
import pytest
from ai_agent.retriever.vector_index import VectorIndex


def test_vector_index_search():
    """Test FAISS vector search."""
    # Arrange
    index = VectorIndex()
    index.load("artifacts/rag_index")

    query = "segment lungs CT"

    # Act
    results = index.search(query, k=5)

    # Assert
    assert len(results) == 5
    assert all(r['score'] > 0 for r in results)
    assert 'TotalSegmentator' in [r['name'] for r in results]
```

### Integration Test Example

```python
import pytest
from ai_agent.api.pipeline import RAGImagingPipeline


@pytest.mark.integration
def test_full_pipeline_with_image():
    """Integration test with real image and VLM call."""
    # Arrange
    pipeline = RAGImagingPipeline(
        catalog_path="dataset/catalog.jsonl",
        index_dir="artifacts/rag_index"
    )

    # Act
    result = pipeline.recommend(
        query="segment lungs",
        files=["tests/data/chest_ct.dcm"]
    )

    # Assert
    assert result.status == "complete"
    assert len(result.recommendations) > 0
    assert result.recommendations[0].accuracy_score > 70
```
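
The integration test above makes a real VLM call, which costs money and time. A cheaper variant could replace the VLM with a `unittest.mock.Mock`. The sketch below is self-contained and does not use the real pipeline: `recommend`, its `vlm_client` parameter, and the dictionary response shape are hypothetical stand-ins chosen for illustration.

```python
from unittest.mock import Mock


def recommend(query: str, vlm_client) -> dict:
    """Hypothetical stand-in for the pipeline: delegates ranking to a VLM client."""
    if not query.strip():
        return {"status": "error", "recommendations": []}
    return vlm_client.run(query)


def test_recommend_with_mocked_vlm():
    """The VLM is replaced by a Mock, so no network request is made."""
    fake_vlm = Mock()
    fake_vlm.run.return_value = {
        "status": "complete",
        "recommendations": [
            {"rank": 1, "name": "TotalSegmentator", "accuracy_score": 95}
        ],
    }

    result = recommend("segment lungs", fake_vlm)

    assert result["status"] == "complete"
    assert result["recommendations"][0]["name"] == "TotalSegmentator"
    fake_vlm.run.assert_called_once_with("segment lungs")


def test_recommend_rejects_empty_query_without_calling_vlm():
    """The failure path is checked too, including that the VLM is never called."""
    fake_vlm = Mock()

    result = recommend("   ", fake_vlm)

    assert result["status"] == "error"
    fake_vlm.run.assert_not_called()
```

Passing the client in as an argument keeps the test free of patching. With the real pipeline, `unittest.mock.patch` on the method that performs the VLM call would play the same role.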

### Parametrized Tests

```python
import pytest
from ai_agent.retriever.vector_index import VectorIndex


@pytest.mark.parametrize("query,expected_tool", [
    ("segment brain MRI", "FreeSurfer"),
    ("segment lungs CT", "TotalSegmentator"),
    ("classify chest X-ray", "CheXNet"),
])
def test_retrieval_for_queries(query, expected_tool):
    """Test retrieval returns expected tools for various queries."""
    index = VectorIndex()
    index.load("artifacts/rag_index")

    results = index.search(query, k=10)
    tool_names = [r['name'] for r in results]

    assert expected_tool in tool_names
```

### Fixtures

```python
import pytest
from ai_agent.api.pipeline import RAGImagingPipeline


@pytest.fixture
def pipeline():
    """Provide initialized pipeline for tests."""
    return RAGImagingPipeline(
        catalog_path="dataset/catalog.jsonl",
        index_dir="artifacts/rag_index"
    )


@pytest.fixture
def sample_dicom():
    """Provide path to sample DICOM file."""
    return "tests/data/0002.DCM"


def test_with_fixtures(pipeline, sample_dicom):
    """Test using fixtures."""
    result = pipeline.recommend(
        query="analyze DICOM",
        files=[sample_dicom]
    )
    assert result is not None
```

<!-- ## Mocking

### Mocking VLM Calls

To avoid API costs during testing:

```python
from unittest.mock import Mock, patch
import pytest


@pytest.fixture
def mock_vlm_response():
    """Mock VLM response."""
    return {
        "status": "complete",
        "recommendations": [
            {
                "rank": 1,
                "name": "TotalSegmentator",
                "accuracy_score": 95,
                "explanation": "Test explanation",
                "reason": "task_match"
            }
        ]
    }


def test_with_mocked_vlm(mock_vlm_response):
    """Test pipeline with mocked VLM."""
    with patch('ai_agent.agent.agent.Agent.run') as mock_run:
        mock_run.return_value = mock_vlm_response

        # Test code here
        result = pipeline.recommend(query="test", files=[])

        assert result["status"] == "complete"
```

### Mocking File Operations

```python
def test_file_validation():
    """Test file validation without real files."""
    with patch('os.path.getsize') as mock_size:
        mock_size.return_value = 1024 * 1024  # 1 MB

        from ai_agent.utils.file_validator import validate_file
        is_valid = validate_file("fake.dcm")

        assert is_valid
``` -->



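## Test Data

### Using Test Cases

Load test cases from JSON:

```python
import json

import pytest
from ai_agent.retriever.vector_index import VectorIndex


def load_test_cases():
    """Load test cases from data file."""
    with open("tests/data/test_data.json") as f:
        return json.load(f)


@pytest.mark.parametrize("test_case", load_test_cases())
def test_from_json(test_case):
    """Test using cases from JSON file."""
    query = test_case["query"]
    expected = test_case["expected_tool"]

    # Retrieve candidates the same way as in the parametrized example
    index = VectorIndex()
    index.load("artifacts/rag_index")
    results = [r['name'] for r in index.search(query, k=10)]

    assert expected in results
```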
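
Because every case is indexed by `"query"` and `"expected_tool"`, a malformed entry in the JSON file would otherwise only surface as a `KeyError` inside a test. A small validation helper can report such entries up front. This is a minimal sketch: it assumes the file holds a list of objects with those two keys, the helper name `validate_cases` is hypothetical, and the embedded JSON is illustrative rather than the real contents of `tests/data/test_data.json`.

```python
import json

# Keys that test_from_json reads from each case
REQUIRED_KEYS = {"query", "expected_tool"}


def validate_cases(cases):
    """Return a list of problems found in the loaded test cases (empty if none)."""
    if not isinstance(cases, list):
        return ["top-level JSON value must be a list of test cases"]
    problems = []
    for i, case in enumerate(cases):
        if not isinstance(case, dict):
            problems.append(f"case {i}: expected an object")
            continue
        missing = REQUIRED_KEYS - case.keys()
        if missing:
            problems.append(f"case {i}: missing keys {sorted(missing)}")
    return problems


# Illustrative content only; the real test data file may differ.
EXAMPLE_JSON = """
[
  {"query": "segment lungs CT", "expected_tool": "TotalSegmentator"},
  {"query": "segment brain MRI"}
]
"""

if __name__ == "__main__":
    for problem in validate_cases(json.loads(EXAMPLE_JSON)):
        print(problem)
```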

### Sample Data Files

Keep sample files small:

- **DICOM**: Single slice, low resolution
- **NIfTI**: Small volume (e.g., 64×64×64)
- **Images**: PNG/JPG under 1 MB
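
A guard test could enforce these size guidelines automatically. The sketch below uses only the standard library. Applying the 1 MB image limit to every file under `tests/data/` is an assumption made for simplicity, and `oversized_files` is a hypothetical helper name.

```python
from pathlib import Path

MAX_BYTES = 1024 * 1024  # 1 MB; assumed here to apply to all sample files


def oversized_files(directory, max_bytes=MAX_BYTES):
    """Return sorted (path, size) pairs for files larger than `max_bytes`."""
    root = Path(directory)
    return sorted(
        (str(path), path.stat().st_size)
        for path in root.rglob("*")
        if path.is_file() and path.stat().st_size > max_bytes
    )


def test_sample_data_is_small():
    """Hypothetical guard test; assumes sample data lives in tests/data/."""
    assert oversized_files("tests/data") == []
```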

<!-- ## Coverage

### Measuring Coverage

```bash
# Run with coverage
pytest --cov=ai_agent

# Generate HTML report
pytest --cov=ai_agent --cov-report=html

# Open report
open htmlcov/index.html  # macOS
# or
xdg-open htmlcov/index.html  # Linux
```

### Coverage Goals

Aim for:

- **Overall**: >80%
- **Critical paths**: >90% (retrieval, agent, pipeline)
- **Utilities**: >70%

### Coverage Configuration

In `pyproject.toml`:

```toml
[tool.coverage.run]
source = ["src/ai_agent"]
omit = ["tests/*", "*/migrations/*"]

[tool.coverage.report]
precision = 2
show_missing = true
skip_covered = false
``` -->



## Continuous Integration

### GitHub Actions

Tests run automatically on:

- Pull requests
- Pushes to main

### CI Configuration



```yaml
# .github/workflows/test.yml
name: Tests

# Run on pull requests and on pushes to main
on:
  push:
    branches: [main]
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.10'
      - run: pip install -e ".[dev]"
      - run: pytest --cov=ai_agent
```

## Best Practices

### Do's

✅ **Test edge cases**: Empty inputs, invalid data, etc.  
✅ **Test error handling**: Verify exceptions are caught  
✅ **Use descriptive names**: `test_retrieval_with_empty_query` not `test1`  
✅ **Keep tests isolated**: Each test should be independent  
✅ **Use fixtures**: Avoid repeating setup code  
✅ **Mock expensive operations**: VLM calls, network requests

### Don'ts

❌ **Don't test implementation details**: Test behavior, not internal state  
❌ **Don't make tests depend on each other**: Each should run independently  
❌ **Don't commit large test files**: Keep test data small  
❌ **Don't skip error checking**: Test both success and failure paths  

## Performance Testing

### Benchmarking

Use pytest-benchmark:

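```python
from ai_agent.retriever.vector_index import VectorIndex


def test_retrieval_performance(benchmark):
    """Benchmark retrieval speed."""
    index = VectorIndex()
    index.load("artifacts/rag_index")

    # The benchmark fixture calls index.search repeatedly and returns its result
    result = benchmark(index.search, "segment lungs", k=10)

    assert len(result) == 10
```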
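
Where the pytest-benchmark plugin is not installed, a rough timing check can be sketched with the standard library instead. This swaps in `time.perf_counter` for the `benchmark` fixture and is far less rigorous. `time_call`, `fake_search`, and the one-second budget are hypothetical stand-ins; in a real test `fake_search` would be replaced by `index.search`.

```python
import statistics
import time


def time_call(func, *args, repeats=5, **kwargs):
    """Call `func` `repeats` times and return (last_result, median_seconds)."""
    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        durations.append(time.perf_counter() - start)
    return result, statistics.median(durations)


def fake_search(query, k=10):
    """Stand-in for index.search; returns k dummy hits."""
    return [{"name": f"tool-{i}", "score": 1.0 / (i + 1)} for i in range(k)]


def test_search_is_fast_enough():
    result, median_seconds = time_call(fake_search, "segment lungs", k=10)
    assert len(result) == 10
    assert median_seconds < 1.0  # hypothetical budget; tune per project
```

The median is used rather than the mean so that a single slow run, for example a cold cache, does not dominate the measurement.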

### Profiling

```bash
# Profile tests (these flags are provided by the pytest-profiling plugin)
pytest --profile

# Generate SVG profile
pytest --profile-svg
```

## Debugging Tests

### Running in Debug Mode

```python
# Add this line inside the test where execution should pause
import pdb; pdb.set_trace()

# Then run pytest from the shell as usual:
#   pytest tests/test_file.py
```

### Verbose Output

```bash
# Show print statements
pytest -s

# Very verbose
pytest -vv

# Show local variables on failure
pytest -l
```

### Running a Single Test

```bash
# Run one test function
pytest tests/test_file.py::test_function_name -v
```

## Next Steps

- Review [Project Structure](structure.md)
- Read [Contributing Guide](contributing.md)
- Explore [Architecture](../architecture/overview.md)