# Project Structure

The AI Imaging Agent is organized into modular components with clear separation of concerns.

## Directory Layout
```
ai-agent/
├── .github/
│   └── workflows/              # CI/CD workflows
│       └── deploy_docs.yml     # Documentation deployment
├── artifacts/
│   └── rag_index/              # FAISS index and embeddings
├── dataset/
│   └── catalog.jsonl           # Software catalog
├── docs/                       # MkDocs documentation
├── logs/                       # Application logs
├── src/
│   └── ai_agent/               # Main package
│       ├── agent/              # PydanticAI agent
│       ├── api/                # Pipeline orchestration
│       ├── catalog/            # Catalog management
│       ├── generator/          # VLM selection (schemas)
│       ├── retriever/          # Text retrieval
│       ├── ui/                 # Gradio interface
│       └── utils/              # Shared utilities
├── tests/                      # Test suite
├── config.yaml                 # Model configuration
├── mkdocs.yml                  # Documentation config
├── pyproject.toml              # Package metadata
└── README.md                   # Project readme
```
## Core Modules

### src/ai_agent/

Main package containing all application code.

#### agent/

PydanticAI conversational agent implementation.
```
agent/
├── __init__.py
├── agent.py              # Agent definition, tool adapters
├── models.py             # Agent output/log models
├── utils.py              # Agent state and tool quota helpers
└── tools/                # Tool implementations (search, repo_info, mcp)
```
**Key components**:

- `agent.py`: Agent instance, system prompt, tool definitions
- `models.py`: Agent output and tool usage schemas
- `utils.py`: `AgentState` plus call caps/prepare hooks
- `tools/`: Tool implementations (search, alternatives, repo info, mcp tools)

**Dependencies**: `api/`, `utils/`
#### api/

Pipeline orchestration and core logic.

```
api/
├── __init__.py
└── pipeline.py           # RAGImagingPipeline main class
```

**Responsibilities**:

- File validation and metadata extraction
- Retrieval + VLM selection orchestration
- Error handling and logging
- Index management

**Dependencies**: `retriever/`, `generator/`, `utils/`
#### catalog/

Software catalog synchronization.

```
catalog/
├── __init__.py
└── sync.py               # Catalog sync logic
```

**Functions**:

- Load catalog from JSONL
- Check for changes (SHA1)
- Trigger index rebuild

**Dependencies**: `retriever/`
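
The change check can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `sync.py`: the function names `catalog_fingerprint` and `needs_rebuild` are hypothetical, and so is the idea that the digest of the last indexed catalog is stored somewhere and passed back in.

```python
import hashlib
from pathlib import Path
from typing import Optional


def catalog_fingerprint(path: Path) -> str:
    """Return the SHA-1 hex digest of the catalog file's bytes."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        # Read in chunks so a large catalog is never held in memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_rebuild(path: Path, stored_digest: Optional[str]) -> bool:
    """Hypothetical helper: True when the catalog differs from the indexed version.

    `stored_digest` is the fingerprint recorded at the last index build,
    or None if no index has been built yet (an assumption of this sketch).
    """
    return stored_digest != catalog_fingerprint(path)
```

If the digests differ, the sync step would go on to trigger the index rebuild; SHA-1 is used here only to detect changes, not for security.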
#### generator/

VLM selection schemas and types.

```
generator/
├── __init__.py
└── schema.py             # Pydantic models for responses
```

**Models**:

- `ToolRecommendation`: Individual tool recommendation
- `AgentResponse`: Complete response with status
- `ConversationStatus`: Enum for conversation states
- `ToolReason`: Enum for recommendation reasons

**Dependencies**: None (pure schemas)
#### retriever/

Text-based retrieval pipeline.

```
retriever/
├── __init__.py
├── text_embedder.py      # BGE-M3 embedding model
├── vector_index.py       # FAISS index management
├── reranker.py           # CrossEncoder reranking
└── software_doc.py       # Catalog schema and loading
```

**Pipeline flow**:

1. `text_embedder.py`: Embed query
2. `vector_index.py`: FAISS search
3. `reranker.py`: CrossEncoder reranking
4. Output: Top-K candidates

**Dependencies**: None (pure retrieval)
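
The two-stage shape of this flow (a cheap vector search over the whole catalog, then a more careful rerank of the survivors) can be sketched with standard-library stand-ins. Everything here is an assumption made for illustration: the bag-of-words `embed` replaces BGE-M3, the brute-force cosine search replaces FAISS, the term-coverage score replaces the CrossEncoder, and none of the function names come from the project.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for the embedding model: bag-of-words term counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def vector_search(query: str, docs: list[str], n: int) -> list[str]:
    """Steps 1-2: embed the query and keep the n most similar documents."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:n]


def rerank(query: str, candidates: list[str], k: int) -> list[str]:
    """Step 3: toy stand-in for the cross-encoder (fraction of query terms covered)."""
    terms = set(query.lower().split())

    def score(doc: str) -> float:
        return len(terms & set(doc.lower().split())) / max(len(terms), 1)

    return sorted(candidates, key=score, reverse=True)[:k]


def retrieve(query: str, docs: list[str], n: int = 3, k: int = 2) -> list[str]:
    """Step 4: return the top-k candidates after both stages."""
    return rerank(query, vector_search(query, docs, n), k)
```

The design point the sketch preserves is that the reranker only ever sees the `n` candidates returned by the first stage, which is what keeps an expensive scoring model affordable.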
#### ui/

Gradio web interface.

```
ui/
├── __init__.py
├── app.py                # Gradio app definition
├── components.py         # Reusable UI components
├── formatters.py         # Response formatting
├── handlers.py           # Message handlers
├── state.py              # UI state management
└── visualizations.py     # Preview and trace rendering
```

**Key files**:

- `app.py`: Main Gradio interface
- `handlers.py`: `respond()` function - core interaction logic
- `formatters.py`: Format recommendations as markdown/cards
- `components.py`: Reusable Gradio components

**Dependencies**: `agent/`, `api/`
#### utils/

Shared utilities.

```
utils/
├── __init__.py
├── config.py             # Configuration loading
├── file_validator.py     # File validation
├── image_meta.py         # Metadata extraction (DICOM, NIfTI, TIFF)
├── previews.py           # Image preview generation
└── tags.py               # Control tag parsing
```

**Common utilities**:

- `config.py`: Load `config.yaml` with Pydantic validation
- `file_validator.py`: Size limits, format checks
- `image_meta.py`: Extract DICOM/NIfTI/TIFF metadata
- `previews.py`: Convert medical images to PNG
- `tags.py`: Parse exclusion tags and strip control tags from queries

**Dependencies**: None (pure utilities)
#### cli.py

Command-line interface entry point.

```python
def main() -> None:
    # Parse arguments
    # Route to chat or sync
    ...
```

**Commands**:

- `ai_agent chat`: Launch UI
- `ai_agent sync`: Sync catalog
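
One way to route these two commands is with `argparse` subcommands. The sketch below is an assumption about how such an entry point could look, not the contents of `cli.py`: `build_parser`, `run_chat`, and `run_sync` are hypothetical names, and the handlers return a string instead of launching the UI or syncing the catalog.

```python
import argparse
from typing import Optional, Sequence


def run_chat() -> str:
    # Placeholder: the real handler would launch the Gradio UI.
    return "launching UI"


def run_sync() -> str:
    # Placeholder: the real handler would sync the catalog.
    return "syncing catalog"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per documented command."""
    parser = argparse.ArgumentParser(prog="ai_agent")
    subparsers = parser.add_subparsers(dest="command", required=True)
    subparsers.add_parser("chat", help="Launch UI")
    subparsers.add_parser("sync", help="Sync catalog")
    return parser


def main(argv: Optional[Sequence[str]] = None) -> str:
    """Parse arguments and dispatch to the matching handler."""
    args = build_parser().parse_args(argv)
    handlers = {"chat": run_chat, "sync": run_sync}
    return handlers[args.command]()
```

Passing `argv` explicitly, for example `main(["sync"])`, makes the routing testable without touching `sys.argv`.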
### tests/

Test suite.

```
tests/
├── data/
│   └── test_data.json    # Test cases
├── test_retrieval_pipeline.py
├── test_deepwiki_repo_info.py
└── ...
```

**Test categories**:

- Unit tests: Individual components
- Integration tests: Full pipeline
- End-to-end tests: Real API calls (optional)
## Configuration Files

### pyproject.toml

Python package metadata and dependencies.

```toml
[project]
name = "ai_agent"
version = "1.0.0"
dependencies = [...]

[project.scripts]
ai_agent = "ai_agent.cli:main"
```
### config.yaml

Model configuration.

```yaml
agent_model:
  name: "gpt-4o-mini"
  base_url: null
  api_key_env: "OPENAI_API_KEY"
available_models:
  - display_name: "gpt-4o-mini"
    name: "gpt-4o-mini"
    # ...
```
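
The `api_key_env` field suggests that the configuration stores the name of an environment variable and not the key itself. A minimal sketch of that indirection follows, assuming this reading is correct. The project validates its configuration with Pydantic; a standard-library dataclass is used here only to keep the example dependency-free, and `ModelConfig` and `resolve_api_key` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class ModelConfig:
    """Stand-in for one model entry; field names mirror the YAML keys above."""

    name: str
    api_key_env: str
    base_url: Optional[str] = None


def resolve_api_key(model: ModelConfig, environ: Mapping[str, str] = os.environ) -> str:
    """Look up the API key in the environment variable named by the config."""
    key = environ.get(model.api_key_env)
    if not key:
        raise RuntimeError(f"Environment variable {model.api_key_env} is not set")
    return key
```

Keeping only the variable name in `config.yaml` means the file can be committed while the secret itself stays in `.env` or the shell environment.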
### mkdocs.yml

Documentation configuration.

```yaml
site_name: AI Imaging Agent
theme:
  name: material
nav: [...]
```

### .env

Environment variables (not committed).

```dotenv
OPENAI_API_KEY=sk-xxxx
SOFTWARE_CATALOG=dataset/catalog.jsonl
```
## Data Files

### dataset/catalog.jsonl

Software catalog in JSON Lines format.

Each line is a complete JSON object following schema.org SoftwareSourceCode.
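
Reading such a file amounts to parsing one JSON object per line. The sketch below is illustrative only: `iter_catalog` is a hypothetical name, skipping blank lines is an assumption, and the sample records in the usage note are made up (the `name` and `codeRepository` keys are schema.org properties, but the catalog's actual fields are not shown in this document).

```python
import json
from typing import Iterable, Iterator


def iter_catalog(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line, reporting the line number on bad JSON."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # Assumption: blank lines are tolerated and ignored.
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"Invalid JSON on line {number}: {exc.msg}") from exc
        yield record
```

For example, `list(iter_catalog(open("dataset/catalog.jsonl", encoding="utf-8")))` would load every record, and a made-up line such as `{"@type": "SoftwareSourceCode", "name": "example-tool", "codeRepository": "https://example.org/repo"}` parses into a plain dictionary.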
### artifacts/rag_index/

Pre-built FAISS index and metadata.

```
artifacts/rag_index/
├── index.faiss           # FAISS binary index
└── meta.json             # Tool IDs, config, timestamps
```
## Module Boundaries

Clear separation prevents circular dependencies:

```
ui/ → agent/ → api/ → retriever/
                    → generator/
                    → utils/
```

**Rules**:

- `utils/`: No dependencies on other modules
- `retriever/`: Pure retrieval, no generation
- `generator/`: Pure schemas, no retrieval
- `api/`: Orchestrates retriever + generator
- `agent/`: Uses api for tool calls
- `ui/`: Top-level, depends on agent + api
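
These rules can be checked mechanically by treating the declared dependencies as a graph and confirming it has no cycles. The sketch below transcribes the **Dependencies** lines from the module descriptions in this document into a dictionary and orders them with the standard-library `graphlib`; `DEPENDENCIES` and `build_order` are names invented for this example, and the project is not known to ship such a check.

```python
from graphlib import CycleError, TopologicalSorter

# Each module maps to the modules it depends on, as listed in this document.
DEPENDENCIES = {
    "ui": {"agent", "api"},
    "agent": {"api", "utils"},
    "api": {"retriever", "generator", "utils"},
    "catalog": {"retriever"},
    "retriever": set(),
    "generator": set(),
    "utils": set(),
}


def build_order(dependencies: dict) -> list:
    """Return modules so that each one appears after everything it depends on.

    Raises graphlib.CycleError if the graph contains a circular dependency.
    """
    return list(TopologicalSorter(dependencies).static_order())
```

Calling `build_order(DEPENDENCIES)` succeeds for the layout described here, while adding an edge such as `"utils": {"ui"}` would raise `CycleError`.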
## Import Patterns

All imports use absolute paths from `ai_agent`:

```python
from ai_agent.retriever.vector_index import VectorIndex
from ai_agent.utils.config import load_config
from ai_agent.agent.utils import AgentState
```

**Never use** relative imports like `from ..utils import ...`.
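
A convention like this is easy to enforce with a small check. The following sketch uses the standard-library `ast` module to list relative imports in a source string; `find_relative_imports` is a hypothetical helper written for this example, not part of the project.

```python
import ast


def find_relative_imports(source: str) -> list[int]:
    """Return the sorted line numbers of relative imports in `source`.

    A relative import is an `ImportFrom` node with level > 0,
    e.g. `from . import x` or `from ..utils import y`.
    """
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ImportFrom) and node.level > 0
    )
```

Running it over each file under `src/ai_agent/` and failing when the result is non-empty would turn the rule into an automated check.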
## Extension Points

### Adding New Tools

Add tool adapters to `agent/agent.py` and implement logic in `agent/tools/`:

```python
# Assumes `agent`, `RunContext`, and `AgentState` are already in scope in agent/agent.py.
@agent.tool
async def new_tool(ctx: RunContext[AgentState], param: str) -> str:
    """Tool description."""
    # Implementation: delegate to a function in agent/tools/
    result = f"new_tool received: {param}"  # placeholder result
    return result
```
### Adding New Metadata Extractors

Add to `utils/image_meta.py`:

```python
def extract_custom_format(file_path: str) -> dict:
    """Extract metadata from custom format."""
    metadata: dict = {}
    # Implementation: populate `metadata` from the file at `file_path`
    return metadata
```
### Adding New Retrieval Models

Replace the model name in `retriever/text_embedder.py`:

```python
from sentence_transformers import SentenceTransformer


class TextEmbedder:
    def __init__(self, model_name: str = "new-embedding-model"):
        self.model = SentenceTransformer(model_name)
```
## Next Steps

- Learn about [Contributing](contributing.md)
- Explore [Testing](testing.md)
- Return to [Architecture Overview](../architecture/overview.md)