# Testing (still under development)

The AI Imaging Agent uses pytest for testing. This guide covers running tests and writing new ones.

**Note:** The agent's test suite is still under development, so this guide is not yet fully applicable.

## Running Tests

### Basic Usage

```bash
# Run all tests
pytest

# Run a specific test file
pytest tests/test_retrieval_pipeline.py

# Run a specific test
pytest tests/test_retrieval_pipeline.py::test_basic_retrieval

# Run with verbose output
pytest -v

# Run with coverage (requires the pytest-cov plugin)
pytest --cov=ai_agent --cov-report=html
```

### Test Categories

Tests are marked by category and can be selected with `-m`:

```bash
# Run only unit tests
pytest -m unit

# Run only integration tests
pytest -m integration

# Skip slow tests
pytest -m "not slow"
```

## Test Organization

### Directory Structure

```
tests/
├── data/
│   ├── test_data.json           # Test cases
│   └── 0002.DCM                 # Sample DICOM file
├── test_retrieval_pipeline.py   # Retrieval tests
├── test_deepwiki_repo_info.py   # Repo info tests
├── test_gpt4o_vision.py         # VLM tests (integration)
└── __pycache__/
```

### Test File Naming

- `test_*.py`: Test files
- `*_test.py`: Alternative naming (less common)

### Test Function Naming

```python
def test_basic_retrieval():
    """Test basic retrieval functionality."""
    pass


def test_edge_case_empty_query():
    """Test handling of empty query."""
    pass


def test_integration_full_pipeline():
    """Integration test for complete pipeline."""
    pass
```

## Writing Tests

### Unit Test Example

```python
import pytest
from ai_agent.retriever.vector_index import VectorIndex


def test_vector_index_search():
    """Test FAISS vector search."""
    # Arrange
    index = VectorIndex()
    index.load("artifacts/rag_index")
    query = "segment lungs CT"

    # Act
    results = index.search(query, k=5)

    # Assert
    assert len(results) == 5
    assert all(r['score'] > 0 for r in results)
    assert 'TotalSegmentator' in [r['name'] for r in results]
```

### Integration Test Example

```python
import pytest
from ai_agent.api.pipeline import RAGImagingPipeline


@pytest.mark.integration
def test_full_pipeline_with_image():
    """Integration test with real image and VLM call."""
    # Arrange
    pipeline = RAGImagingPipeline(
        catalog_path="dataset/catalog.jsonl",
        index_dir="artifacts/rag_index"
    )

    # Act
    result = pipeline.recommend(
        query="segment lungs",
        files=["tests/data/chest_ct.dcm"]
    )

    # Assert
    assert result.status == "complete"
    assert len(result.recommendations) > 0
    assert result.recommendations[0].accuracy_score > 70
```

### Parametrized Tests

```python
import pytest
from ai_agent.retriever.vector_index import VectorIndex


@pytest.mark.parametrize("query,expected_tool", [
    ("segment brain MRI", "FreeSurfer"),
    ("segment lungs CT", "TotalSegmentator"),
    ("classify chest X-ray", "CheXNet"),
])
def test_retrieval_for_queries(query, expected_tool):
    """Test retrieval returns expected tools for various queries."""
    index = VectorIndex()
    index.load("artifacts/rag_index")

    results = index.search(query, k=10)
    tool_names = [r['name'] for r in results]

    assert expected_tool in tool_names
```

### Fixtures

```python
import pytest
from ai_agent.api.pipeline import RAGImagingPipeline


@pytest.fixture
def pipeline():
    """Provide initialized pipeline for tests."""
    return RAGImagingPipeline(
        catalog_path="dataset/catalog.jsonl",
        index_dir="artifacts/rag_index"
    )


@pytest.fixture
def sample_dicom():
    """Provide path to sample DICOM file."""
    return "tests/data/0002.DCM"


def test_with_fixtures(pipeline, sample_dicom):
    """Test using fixtures."""
    result = pipeline.recommend(
        query="analyze DICOM",
        files=[sample_dicom]
    )
    assert result is not None
```

## Test Data

### Using Test Cases

Load test cases from JSON:

```python
import json

import pytest
from ai_agent.retriever.vector_index import VectorIndex


def load_test_cases():
    """Load test cases from data file."""
    with open("tests/data/test_data.json") as f:
        return json.load(f)


@pytest.mark.parametrize("test_case", load_test_cases())
def test_from_json(test_case):
    """Test using cases from JSON file."""
    query = test_case["query"]
    expected = test_case["expected_tool"]

    # Test logic: here, retrieval as in the parametrized example above
    index = VectorIndex()
    index.load("artifacts/rag_index")
    results = [r['name'] for r in index.search(query, k=10)]

    assert expected in results
```

### Sample Data Files

Keep sample files small:

- **DICOM**: Single slice, low resolution
- **NIfTI**: Small volume (e.g., 64×64×64)
- **Images**: PNG/JPG under 1 MB

## Continuous Integration

### GitHub Actions

Tests run automatically on:

- Pull requests
- Pushes to main

### CI Configuration

```yaml
# .github/workflows/test.yml
name: Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.10'
      - run: pip install -e ".[dev]"
      - run: pytest --cov=ai_agent
```

## Best Practices

### Do's

- ✅ **Test edge cases**: Empty inputs, invalid data, etc.
- ✅ **Test error handling**: Verify that exceptions are raised and handled as expected
- ✅ **Use descriptive names**: `test_retrieval_with_empty_query`, not `test1`
- ✅ **Keep tests isolated**: Each test should be independent
- ✅ **Use fixtures**: Avoid repeating setup code
- ✅ **Mock expensive operations**: VLM calls, network requests

### Don'ts

- ❌ **Don't test implementation details**: Test behavior, not internal state
- ❌ **Don't make tests depend on each other**: Each should run independently
- ❌ **Don't commit large test files**: Keep test data small
- ❌ **Don't skip error checking**: Test both success and failure paths

## Performance Testing

### Benchmarking

Use pytest-benchmark, which provides the `benchmark` fixture:

```python
from ai_agent.retriever.vector_index import VectorIndex


def test_retrieval_performance(benchmark):
    """Benchmark retrieval speed."""
    index = VectorIndex()
    index.load("artifacts/rag_index")

    result = benchmark(index.search, "segment lungs", k=10)

    assert len(result) == 10
```

### Profiling

```bash
# Profile tests (requires the pytest-profiling plugin)
pytest --profile

# Generate SVG profile
pytest --profile-svg
```

## Debugging Tests

### Running in Debug Mode

```python
# Add a breakpoint inside the test
import pdb; pdb.set_trace()

# Then run pytest as usual from the shell:
#   pytest tests/test_file.py
```

### Verbose Output

```bash
# Show print statements
pytest -s

# Very verbose
pytest -vv

# Show local variables on failure
pytest -l
```

### Running Single Test

```bash
# Run one test function
pytest tests/test_file.py::test_function_name -v
```

## Next Steps

- Review [Project Structure](structure.md)
- Read [Contributing Guide](contributing.md)
- Explore [Architecture](../architecture/overview.md)
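
## Example: Mocking an Expensive Call

The "Mock expensive operations" practice from the Best Practices section can be sketched with the standard library's `unittest.mock`. This is only an illustration: `summarize_scan` and the client's `describe` method are hypothetical names invented for this example, not part of the `ai_agent` package.

```python
from unittest.mock import MagicMock


def summarize_scan(vlm_client, image_path):
    """Hypothetical helper: ask a VLM client to describe an image."""
    response = vlm_client.describe(image_path)
    return response["text"].strip().lower()


def test_summarize_scan_uses_mocked_vlm():
    """The test never touches a real VLM or the network."""
    # Arrange: stand in for the expensive client with a canned response
    fake_client = MagicMock()
    fake_client.describe.return_value = {"text": "  Chest CT, axial slice  "}

    # Act
    summary = summarize_scan(fake_client, "tests/data/0002.DCM")

    # Assert: check the behavior and that the client was called exactly once
    assert summary == "chest ct, axial slice"
    fake_client.describe.assert_called_once_with("tests/data/0002.DCM")
```

Passing the client in as an argument keeps the test free of patching; when the dependency is created inside the code under test, `unittest.mock.patch` or pytest's `monkeypatch` fixture can serve the same purpose.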