Development Guide
Overview
MediGuard AI is a medical biomarker analysis system that uses agentic RAG (Retrieval-Augmented Generation) and multi-agent workflows to provide clinical insights.
Project Structure
Agentic-RagBot/
├── src/
│   ├── agents/      # Agent implementations (biomarker_analyzer, disease_explainer, etc.)
│   ├── services/    # Core services (retrieval, embeddings, opensearch, etc.)
│   ├── routers/     # FastAPI route handlers
│   ├── models/      # Data models
│   ├── schemas/     # Pydantic schemas
│   ├── state.py     # State management
│   ├── workflow.py  # Workflow orchestration
│   ├── main.py      # FastAPI application factory
│   └── settings.py  # Configuration management
├── tests/           # Test suite
├── data/            # Data files (vector stores, etc.)
└── docs/            # Documentation
Development Setup
Install dependencies:
pip install -r requirements.txt
Environment variables:
- Copy .env.example to .env and configure
- Key variables:
  - API__HOST: Server host (default: 127.0.0.1)
  - API__PORT: Server port (default: 8000)
  - GRADIO_SERVER_NAME: Gradio host (default: 127.0.0.1)
  - GRADIO_PORT: Gradio port (default: 7860)
Running the application:
# FastAPI server
python -m src.main
# Gradio interface
python -m src.gradio_app
Code Quality
Linting
# Check code quality
ruff check src/
# Auto-fix issues
ruff check src/ --fix
Security
# Run security scan
bandit -r src/
Testing
# Run all tests
pytest tests/
# Run with coverage
pytest tests/ --cov=src --cov-report=term-missing
# Run specific test file
pytest tests/test_agents.py -v
Testing Guidelines
Test structure:
- Unit tests for individual components
- Integration tests for workflows
- Mock external dependencies (LLMs, databases)
Test coverage:
- Current coverage: 58%
- Target: 70%+
- Focus on critical paths and business logic
Best practices:
- Use descriptive test names
- Mock external services
- Test both success and failure cases
- Keep tests isolated and independent
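The guidelines above can be illustrated with a small pytest-style sketch. The function `explain_disease` and the `invoke` method on the LLM client are hypothetical stand-ins, not names taken from this codebase; the point is only to show a mocked external service with one success case and one failure case.

```python
from unittest.mock import MagicMock


def explain_disease(llm, disease: str) -> str:
    """Hypothetical component under test: asks an LLM client for an explanation."""
    try:
        return llm.invoke(f"Explain the pathophysiology of {disease}.")
    except ConnectionError:
        # Fall back to a fixed message when the LLM backend is unreachable.
        return "Explanation unavailable."


def test_explain_disease_returns_llm_answer():
    # The mock replaces the real LLM, so no network call is made.
    llm = MagicMock()
    llm.invoke.return_value = "Mocked explanation"

    assert explain_disease(llm, "diabetes") == "Mocked explanation"
    llm.invoke.assert_called_once()


def test_explain_disease_falls_back_when_llm_is_down():
    # side_effect makes the mocked call raise, simulating an outage.
    llm = MagicMock()
    llm.invoke.side_effect = ConnectionError("LLM offline")

    assert explain_disease(llm, "diabetes") == "Explanation unavailable."
```

Each test builds its own mock, so the tests stay independent of each other and of any running LLM.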
Architecture
Multi-Agent Workflow
The system uses a multi-agent architecture with the following agents:
- BiomarkerAnalyzer: Validates and analyzes biomarker values
- DiseaseExplainer: Provides disease pathophysiology explanations
- BiomarkerLinker: Connects biomarkers to disease predictions
- ClinicalGuidelines: Provides evidence-based recommendations
- ConfidenceAssessor: Evaluates prediction reliability
- ResponseSynthesizer: Compiles final response
State Management
- GuildState: Shared state between agents
- PatientInput: Input data structure
- ExplanationSOP: Standard operating procedures
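As a rough illustration of agents cooperating through shared state, the sketch below runs two placeholder agents in sequence over one state object. `SharedState`, its fields, and the agent bodies are assumptions made for the example; the actual `GuildState` fields and the orchestration in `workflow.py` (which may not be strictly sequential) are not shown in this guide.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SharedState:
    """Hypothetical stand-in for the shared state passed between agents."""
    biomarkers: dict[str, float]
    findings: dict[str, str] = field(default_factory=dict)


# An agent is modelled here as a function that takes and returns the state.
Agent = Callable[[SharedState], SharedState]


def biomarker_analyzer(state: SharedState) -> SharedState:
    # Placeholder logic: record how many values were received.
    state.findings["analysis"] = f"{len(state.biomarkers)} biomarker(s) received"
    return state


def response_synthesizer(state: SharedState) -> SharedState:
    # Compile the findings written by earlier agents into one response.
    parts = [v for k, v in state.findings.items() if k != "response"]
    state.findings["response"] = "; ".join(parts)
    return state


def run_pipeline(state: SharedState, agents: list[Agent]) -> SharedState:
    """Run each agent in order; every agent reads and updates the same state."""
    for agent in agents:
        state = agent(state)
    return state


result = run_pipeline(
    SharedState(biomarkers={"glucose": 110.0}),
    [biomarker_analyzer, response_synthesizer],
)
print(result.findings["response"])
```

Keeping every agent behind the same call signature means a new agent can be added to the list without changing the runner.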
Configuration
Settings are managed via Pydantic with environment variable support:
from src.settings import get_settings
settings = get_settings()
print(settings.api.host)
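To make the mapping between environment variables such as `API__HOST` and nested attributes such as `settings.api.host` concrete, here is a standard-library stand-in. It is not the project's Pydantic implementation: `load_settings` and the dataclasses are hypothetical, and only the variable names and defaults are taken from the setup section of this guide.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ApiSettings:
    host: str
    port: int


@dataclass(frozen=True)
class Settings:
    api: ApiSettings


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build settings from an environment mapping, using the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        api=ApiSettings(
            # The double underscore separates the section ("API") from the field.
            host=env.get("API__HOST", "127.0.0.1"),
            port=int(env.get("API__PORT", "8000")),
        )
    )


# Passing an explicit mapping keeps the example independent of the real environment.
print(load_settings({"API__PORT": "9000"}).api)
```

Accepting the environment as a parameter also makes the loader easy to test without patching `os.environ`.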
Deployment
Production Considerations
Security:
- Bind to specific interfaces (not 0.0.0.0)
- Use HTTPS in production
- Configure proper CORS origins
Performance:
- Use multiple workers
- Configure connection pooling
- Monitor memory usage
Monitoring:
- Enable health checks
- Configure logging
- Set up metrics collection
Contributing
- Fork the repository
- Create a feature branch
- Write tests for new functionality
- Ensure all tests pass
- Submit a pull request
Troubleshooting
Common Issues
Tests failing with import errors:
- Check that PYTHONPATH includes the project root
- Ensure all dependencies are installed
Vector store errors:
- Check data/vector_stores directory exists
- Verify embedding model is accessible
LLM connection issues:
- Check Ollama is running
- Verify model is downloaded
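The vector store and Ollama checks above can be scripted as a quick preflight. This is a sketch under assumptions: it uses Ollama's default local address (`http://localhost:11434`) and its model-listing endpoint (`/api/tags`), which may differ from how this project is configured, and the helper names are hypothetical.

```python
import urllib.request
from pathlib import Path


def vector_store_dir_exists(path: str = "data/vector_stores") -> bool:
    """Return True if the vector store directory is present."""
    return Path(path).is_dir()


def ollama_is_reachable(
    base_url: str = "http://localhost:11434", timeout: float = 2.0
) -> bool:
    """Return True if an HTTP GET to Ollama's model-listing endpoint succeeds."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts are all subclasses of OSError.
        return False


if __name__ == "__main__":
    print("vector store directory:", vector_store_dir_exists())
    print("Ollama reachable:", ollama_is_reachable())
```

A reachable server does not guarantee that the required model is downloaded; the response body from `/api/tags` lists the local models and can be inspected for that.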
Performance Optimization
- Caching: Redis for frequently accessed data
- Async: Use async/await for I/O operations
- Batching: Process multiple items when possible
- Lazy loading: Load resources only when needed
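Three of these techniques (async I/O, batching, and lazy loading) fit in one short sketch. The `embed` coroutine and the placeholder model are hypothetical and stand in for a real I/O-bound call; Redis caching is left out because it needs a running server.

```python
import asyncio
from functools import lru_cache


@lru_cache(maxsize=1)
def get_embedding_model() -> dict:
    """Lazy loading: the placeholder model is built on first use, then reused."""
    return {"name": "placeholder-model"}


async def embed(text: str) -> int:
    """Stand-in for an I/O-bound embedding call; returns the text length."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    get_embedding_model()
    return len(text)


async def embed_batch(texts: list[str]) -> list[int]:
    """Batching: run the calls concurrently; results keep the input order."""
    return await asyncio.gather(*(embed(t) for t in texts))


print(asyncio.run(embed_batch(["glucose", "hba1c"])))
```

`lru_cache` on a zero-argument function is a simple way to get a process-wide lazy singleton without a global variable.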
Security Best Practices
- Never commit secrets or API keys
- Use environment variables for configuration
- Validate all inputs
- Implement proper error handling
- Run regular security scans with Bandit
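Input validation and explicit error handling might look like the following sketch. In this project the Pydantic schemas under `src/schemas/` would normally carry such checks; `validate_biomarkers` is a hypothetical standard-library version shown only to make the rules concrete.

```python
import math


def validate_biomarkers(raw: dict) -> dict[str, float]:
    """Return a cleaned copy of the input, or raise ValueError on bad data."""
    if not isinstance(raw, dict) or not raw:
        raise ValueError("biomarkers must be a non-empty mapping")

    cleaned: dict[str, float] = {}
    for name, value in raw.items():
        if not isinstance(name, str) or not name.strip():
            raise ValueError(f"invalid biomarker name: {name!r}")
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{name}: value must be a number")
        if not math.isfinite(value):
            raise ValueError(f"{name}: value must be finite")
        cleaned[name.strip()] = float(value)
    return cleaned


try:
    validate_biomarkers({"glucose": float("nan")})
except ValueError as exc:
    # Report a specific, safe message instead of letting bad data propagate.
    print(f"rejected input: {exc}")
```

Raising a single exception type with a descriptive message lets a route handler translate failures into a clear client error.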