# P0 - Systemic Provider Mismatch Across All Modes

**Status:** RESOLVED
**Priority:** P0 (Blocker for Free Tier/Demo)
**Found:** 2025-11-30 (during Audit)
**Resolved:** 2025-11-30
**Component:** Multiple files across orchestrators, agents, services

## Resolution Summary

The critical provider mismatch bug has been fixed by implementing auto-detection in `src/agent_factory/judges.py`. The `get_model()` function now checks for actual API key availability (`has_openai_key`, `has_anthropic_key`, `has_huggingface_key`) instead of relying on the static `settings.llm_provider` configuration.

### Fix Details

- **Auto-Detection Implemented**: `get_model()` prioritizes OpenAI > Anthropic > HuggingFace based on *available keys*.
- **Fail-Fast on No Keys**: If no API keys are configured, `get_model()` raises `ConfigurationError` with a clear message.
- **HuggingFace Requires Token**: Free Tier synthesis via `HuggingFaceModel` requires `HF_TOKEN` (a PydanticAI requirement).
- **Synthesis Fallback**: When `get_model()` fails, synthesis gracefully falls back to the template summary.
- **Audit Fixes Applied**:
  - Replaced manual `os.getenv` checks with centralized `settings` properties in `src/app.py`.
  - Added logging to `src/services/statistical_analyzer.py` (fixed a silent `pass`).
  - Narrowed exception handling in `src/tools/pubmed.py`.
  - Optimized string search in `src/tools/code_execution.py`.
### Key Clarification

The **Free Tier** in Simple Mode uses `HFInferenceJudgeHandler` (which uses `huggingface_hub.InferenceClient`) for judging - this does NOT require `HF_TOKEN`. However, synthesis via `get_model()` uses PydanticAI's `HuggingFaceModel`, which DOES require `HF_TOKEN`. When no tokens are configured, synthesis falls back to the template-based summary (which is still useful).
### Verification

- **Unit Tests**: 5 new TDD tests in `tests/unit/agent_factory/test_get_model_auto_detect.py` pass.
- **All Tests**: 309 tests pass (`make check` succeeds).
- **Regression Tests**: Fixed and verified `tests/unit/agent_factory/test_judges_factory.py`.

---

## Symptom (Archive)

When running in "Simple Mode" (Free Tier / No API Key), the synthesis step fails to generate a narrative and falls back to a structured summary template. The user sees:

```text
> ⚠️ Note: AI narrative synthesis unavailable. Showing structured summary.
> _Error: OpenAIError_
```
## Affected Files (COMPREHENSIVE AUDIT)

### Files Calling `get_model()` Directly (8 call sites)

| File | Line | Context | Impact |
|------|------|---------|--------|
| `simple.py` | 547 | Synthesis step | Free Tier broken |
| `statistical_analyzer.py` | 75 | Analysis agent | Free Tier broken |
| `judge_agent_llm.py` | 18 | LLM Judge | Free Tier broken |
| `graph/nodes.py` | 177 | LangGraph hypothesis | Free Tier broken |
| `graph/nodes.py` | 249 | LangGraph synthesis | Free Tier broken |
| `report_agent.py` | 45 | Report generation | Free Tier broken |
| `hypothesis_agent.py` | 44 | Hypothesis generation | Free Tier broken |
| `judges.py` | 100 | JudgeHandler default | OK (accepts param) |
### Files Hardcoding `OpenAIChatClient` (Architecturally OpenAI-Only)

| File | Lines | Context |
|------|-------|---------|
| `advanced.py` | 100, 121 | Manager client |
| `magentic_agents.py` | 29, 70, 129, 173 | All 4 agents |
| `retrieval_agent.py` | 62 | Retrieval agent |
| `code_executor_agent.py` | 52 | Code executor |
| `llm_factory.py` | 42 | Factory default |

**Note:** Advanced mode is architecturally locked to OpenAI via `agent_framework.openai.OpenAIChatClient`. This is by design - see `app.py:188-194`, which falls back to Simple mode if no OpenAI key is present. However, users are not clearly informed of this limitation.
## Root Cause

**Settings/Runtime Sync Gap - Two Separate Backend Selection Systems.**

The codebase has **two independent** systems for selecting the LLM backend:

1. `settings.llm_provider` (config.py default: `"openai"`)
2. `app.py` runtime detection via `os.getenv()` checks

These are **never synchronized**, causing the Judge and Synthesis steps to use different backends.
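The sync gap can be demonstrated in a few lines. A simplified stand-in - the `Settings` class and backend labels are illustrative, not the real `src/utils/config` module:

```python
import os

class Settings:
    """Stand-in for src.utils.config.settings (simplified)."""
    llm_provider = "openai"  # config.py default; never updated at runtime

settings = Settings()

def runtime_backend() -> str:
    """app.py-style detection: reads the environment, not settings."""
    if os.getenv("OPENAI_API_KEY"):
        return "openai"
    if os.getenv("ANTHROPIC_API_KEY"):
        return "anthropic"
    return "hf-free-tier"

# With no keys set, the two systems disagree:
# runtime_backend() -> "hf-free-tier", settings.llm_provider -> "openai"
```

Any component that reads `settings.llm_provider` after `runtime_backend()` has picked the free tier is operating on stale information - which is exactly what the synthesis step did.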
### Detailed Call Chain

1. **`src/app.py:115-126`** (runtime detection):

   ```python
   # app.py bypasses settings entirely for JudgeHandler selection
   elif os.getenv("OPENAI_API_KEY"):
       judge_handler = JudgeHandler(model=None, domain=domain)
   elif os.getenv("ANTHROPIC_API_KEY"):
       judge_handler = JudgeHandler(model=None, domain=domain)
   else:
       judge_handler = HFInferenceJudgeHandler(domain=domain)  # Free Tier
   ```

   **Note:** This creates the correct handler but does NOT update `settings.llm_provider`.

2. **`src/orchestrators/simple.py:546-552`** (synthesis step):

   ```python
   from src.agent_factory.judges import get_model

   agent: Agent[None, str] = Agent(model=get_model(), ...)  # <-- BUG!
   ```

   Synthesis calls `get_model()` directly instead of using the injected judge's model.

3. **`src/agent_factory/judges.py:56-78`** (`get_model()`):

   ```python
   def get_model() -> Any:
       llm_provider = settings.llm_provider  # <-- Reads from settings (still "openai")
       # ...
       openai_provider = OpenAIProvider(api_key=settings.openai_api_key)  # <-- None!
       return OpenAIChatModel(settings.openai_model, provider=openai_provider)
   ```

   **Result:** Creates an OpenAI model with `api_key=None` → `OpenAIError`.
### Why Free Tier Fails

| Step | System Used | Backend Selected |
|------|-------------|------------------|
| JudgeHandler | `app.py` runtime | HFInferenceJudgeHandler ✅ |
| Synthesis | `settings.llm_provider` | OpenAI (default) ❌ |

The Judge works because `app.py` explicitly creates `HFInferenceJudgeHandler`. Synthesis fails because it calls `get_model()`, which reads `settings.llm_provider = "openai"` (unchanged from the default).
## Impact

- **User Experience:** Free tier users (Demo users) never see the high-quality narrative synthesis, only the fallback.
- **System Integrity:** The orchestrator ignores the runtime backend selection.
## Implemented Fix

**Strategy: Fix `get_model()` to Auto-Detect the Available Provider**

### Actual Implementation (Merged)

**File:** `src/agent_factory/judges.py`

This is the **single point of fix** that resolves all 7 broken `get_model()` call sites.

```python
def get_model() -> Any:
    """Get the LLM model based on available API keys.

    Priority order:
    1. OpenAI (if OPENAI_API_KEY set)
    2. Anthropic (if ANTHROPIC_API_KEY set)
    3. HuggingFace (if HF_TOKEN set)

    Raises:
        ConfigurationError: If no API keys are configured.

    Note: settings.llm_provider is ignored in favor of actual key availability.
    This ensures the model matches what app.py selected for the JudgeHandler.
    """
    from src.utils.exceptions import ConfigurationError

    # Priority 1: OpenAI (most common, best tool calling)
    if settings.has_openai_key:
        openai_provider = OpenAIProvider(api_key=settings.openai_api_key)
        return OpenAIChatModel(settings.openai_model, provider=openai_provider)

    # Priority 2: Anthropic
    if settings.has_anthropic_key:
        provider = AnthropicProvider(api_key=settings.anthropic_api_key)
        return AnthropicModel(settings.anthropic_model, provider=provider)

    # Priority 3: HuggingFace (requires HF_TOKEN)
    if settings.has_huggingface_key:
        model_name = settings.huggingface_model or "meta-llama/Llama-3.1-70B-Instruct"
        hf_provider = HuggingFaceProvider(api_key=settings.hf_token)
        return HuggingFaceModel(model_name, provider=hf_provider)

    # No keys configured - fail fast with a clear error
    raise ConfigurationError(
        "No LLM API key configured. Set one of: OPENAI_API_KEY, ANTHROPIC_API_KEY, or HF_TOKEN"
    )
```

**Why this works:**

- A single fix location updates all 7 broken call sites.
- It matches app.py's detection logic (key availability, not `settings.llm_provider`).
- HuggingFace works when `HF_TOKEN` is available.
- It raises a clear error when no keys are configured (callers can catch it and fall back).
- No changes are needed to orchestrators, agents, or services.
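The caller-side catch-and-fallback can be sketched as follows. The stand-in classes and the template wording here are hypothetical; the real fallback lives in the synthesis step:

```python
class ConfigurationError(Exception):
    """Stand-in for src.utils.exceptions.ConfigurationError."""

def get_model():
    """Stand-in that simulates the no-keys case."""
    raise ConfigurationError("No LLM API key configured. ...")

def synthesize(findings: list[str]) -> str:
    """Try LLM narrative synthesis; degrade to the structured template."""
    try:
        model = get_model()
        return f"narrative synthesis via {model}"  # not reached with no keys
    except ConfigurationError:
        # Graceful degradation: the structured summary template
        return "Structured summary:\n- " + "\n- ".join(findings)
```

Because `get_model()` now fails fast with a typed exception rather than deep inside an OpenAI client call, the fallback decision stays local and explicit.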
### What This Does NOT Fix (By Design)

**Advanced Mode remains OpenAI-only.** The following files use `agent_framework.openai.OpenAIChatClient`, which only supports OpenAI:

- `advanced.py` (Manager + agents)
- `magentic_agents.py` (SearchAgent, JudgeAgent, HypothesisAgent, ReportAgent)
- `retrieval_agent.py`, `code_executor_agent.py`

This is **by design** - the Microsoft Agent Framework library (`agent-framework-core`) only provides `OpenAIChatClient`. Supporting other providers in Advanced mode would require either:

1. Waiting for `agent-framework` to add Anthropic/HuggingFace clients, OR
2. Writing our own `ChatClient` implementations (significant effort).

**The current app.py behavior is correct:** it falls back to Simple mode when no OpenAI key is present (lines 188-194). The UI message could be clearer about why.
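The fallback behavior reads roughly like the sketch below (the function name and mode strings are illustrative, not the actual `app.py` code):

```python
import os

def select_mode(requested: str) -> str:
    """Illustrative sketch of the app.py:188-194 behavior described above."""
    if requested == "advanced" and not os.getenv("OPENAI_API_KEY"):
        # Advanced mode is OpenAI-only (OpenAIChatClient), so degrade gracefully.
        # A clearer UI message here would explain *why* the mode changed.
        return "simple"
    return requested
```

A one-line UI notice at this branch point ("Advanced mode requires an OpenAI key; running in Simple mode") would close the remaining user-communication gap.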
## Test Plan (Implemented)

### Unit Tests (Verified Passing)

```python
# tests/unit/agent_factory/test_get_model_auto_detect.py
import pytest
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.huggingface import HuggingFaceModel
from pydantic_ai.models.openai import OpenAIChatModel

from src.agent_factory.judges import get_model
from src.utils.config import settings
from src.utils.exceptions import ConfigurationError


class TestGetModelAutoDetect:
    """Test that get_model() auto-detects available providers."""

    def test_returns_openai_when_key_present(self, monkeypatch):
        """OpenAI key present -> OpenAI model."""
        monkeypatch.setattr(settings, "openai_api_key", "sk-test")
        monkeypatch.setattr(settings, "anthropic_api_key", None)
        monkeypatch.setattr(settings, "hf_token", None)
        model = get_model()
        assert isinstance(model, OpenAIChatModel)

    def test_returns_anthropic_when_only_anthropic_key(self, monkeypatch):
        """Only Anthropic key -> Anthropic model."""
        monkeypatch.setattr(settings, "openai_api_key", None)
        monkeypatch.setattr(settings, "anthropic_api_key", "sk-ant-test")
        monkeypatch.setattr(settings, "hf_token", None)
        model = get_model()
        assert isinstance(model, AnthropicModel)

    def test_returns_huggingface_when_hf_token_present(self, monkeypatch):
        """HF_TOKEN present (no paid keys) -> HuggingFace model."""
        monkeypatch.setattr(settings, "openai_api_key", None)
        monkeypatch.setattr(settings, "anthropic_api_key", None)
        monkeypatch.setattr(settings, "hf_token", "hf_test_token")
        model = get_model()
        assert isinstance(model, HuggingFaceModel)

    def test_raises_error_when_no_keys(self, monkeypatch):
        """No keys at all -> ConfigurationError."""
        monkeypatch.setattr(settings, "openai_api_key", None)
        monkeypatch.setattr(settings, "anthropic_api_key", None)
        monkeypatch.setattr(settings, "hf_token", None)
        with pytest.raises(ConfigurationError) as exc_info:
            get_model()
        assert "No LLM API key configured" in str(exc_info.value)

    def test_openai_takes_priority_over_anthropic(self, monkeypatch):
        """Both keys present -> OpenAI wins."""
        monkeypatch.setattr(settings, "openai_api_key", "sk-test")
        monkeypatch.setattr(settings, "anthropic_api_key", "sk-ant-test")
        model = get_model()
        assert isinstance(model, OpenAIChatModel)
```

(The model-class imports above assume the standard PydanticAI module paths; adjust them to match the versions pinned in this repo.)
### Full Test Suite

```bash
$ make check
# 309 passed in 238.16s (0:03:58)
# All checks passed!
```
### Manual Verification

1. **Unset all API keys**: `unset OPENAI_API_KEY ANTHROPIC_API_KEY HF_TOKEN`
2. **Run the app**: `uv run python -m src.app`
3. **Submit a query**: "What drugs improve female libido?"
4. **Verify**: Synthesis falls back to the template (a `ConfigurationError` appears in the logs, but the user sees the structured summary).