SPEC_16: Unified Chat Client Architecture
Status: Proposed Priority: P1 (Architectural Simplification) Issue: Updates #105, #109 Created: 2025-12-01 Last Verified: 2025-12-01 (line counts and imports verified against codebase)
Summary
Eliminate the Simple Mode / Advanced Mode parallel universe by implementing a pluggable ChatClient architecture. This moves the system away from a hardcoded OpenAIChatClient namespace to a neutral BaseChatClient protocol, allowing the multi-agent framework to work with ANY LLM provider through a unified codebase.
Strategic Goals
- Namespace Neutrality: Decouple the core orchestrator from the
OpenAInamespace. The system should speakChatClient, notOpenAIChatClient. - Full-Stack Provider Chain: Prioritize providers that offer both LLM and Embeddings (OpenAI, Gemini, HuggingFace+Local) to ensure a unified environment.
- Fragmentation Reduction: Remove "LLM-only" providers (Anthropic) that force complex hybrid dependency chains (e.g., Anthropic LLM + OpenAI Embeddings).
Problem Statement
Current Architecture: Two Parallel Universes
User Query
β
βββ Has API Key? ββYesβββ Advanced Mode (488 lines)
β βββ Microsoft Agent Framework
β βββ OpenAIChatClient (hardcoded dependency)
β
βββ No API Key? βββββββββββ Simple Mode (778 lines)
βββ While-loop orchestration
βββ Pydantic AI + HuggingFace
Problems:
- Double Maintenance: 1,266 lines across two orchestrator systems.
- Namespace Lock-in: The Advanced Orchestrator is tightly coupled to
OpenAIChatClient(25 references across 5 files). - Fragmented Chains: Using Anthropic requires a "Frankenstein" chain (Anthropic LLM + OpenAI Embeddings).
- Testing Burden: Two test suites, two CI paths.
Proposed Solution: ChatClientFactory
Architecture After Implementation
User Query
β
ββββ Advanced Mode (unified)
βββ Microsoft Agent Framework
βββ ChatClientFactory (Namespace Neutral):
βββ OpenAIChatClient (Paid Tier: Best Performance)
βββ GeminiChatClient (Alternative Tier: LLM + Embeddings)
βββ HuggingFaceChatClient (Free Tier: LLM + Local Embeddings)
New Files
src/
βββ clients/
β βββ __init__.py
β βββ base.py # Re-export BaseChatClient (The neutral protocol)
β βββ factory.py # ChatClientFactory
β βββ huggingface.py # HuggingFaceChatClient
β βββ gemini.py # GeminiChatClient [Future]
ChatClientFactory Implementation
# src/clients/factory.py
from agent_framework import BaseChatClient
from agent_framework.openai import OpenAIChatClient
from src.utils.config import settings
def get_chat_client(
provider: str | None = None,
api_key: str | None = None,
) -> BaseChatClient:
"""
Factory for creating chat clients.
Auto-detection priority:
1. Explicit provider parameter
2. OpenAI key (Best Function Calling)
3. Gemini key (Best Context/Cost)
4. HuggingFace (Free Fallback)
Args:
provider: Force specific provider ("openai", "gemini", "huggingface")
api_key: Override API key for the provider
Returns:
Configured BaseChatClient instance (Neutral Namespace)
"""
# OpenAI (Standard)
if provider == "openai" or (provider is None and settings.has_openai_key):
return OpenAIChatClient(
model_id=settings.openai_model,
api_key=api_key or settings.openai_api_key,
)
# Gemini (High Performance Alternative) - REQUIRES config.py update first
if provider == "gemini" or (provider is None and settings.has_gemini_key):
from src.clients.gemini import GeminiChatClient
return GeminiChatClient(
model_id="gemini-2.0-flash",
api_key=api_key or settings.gemini_api_key,
)
# Free Fallback (HuggingFace)
from src.clients.huggingface import HuggingFaceChatClient
return HuggingFaceChatClient(
model_id="meta-llama/Llama-3.1-70B-Instruct",
)
Changes to Advanced Orchestrator
# src/orchestrators/advanced.py
# BEFORE (hardcoded namespace):
from agent_framework.openai import OpenAIChatClient
class AdvancedOrchestrator:
def __init__(self, ...):
self._chat_client = OpenAIChatClient(...)
# AFTER (neutral namespace):
from src.clients.factory import get_chat_client
class AdvancedOrchestrator:
def __init__(self, chat_client=None, provider=None, api_key=None, ...):
# The orchestrator no longer knows about OpenAI
self._chat_client = chat_client or get_chat_client(
provider=provider,
api_key=api_key,
)
Technical Requirements
BaseChatClient Protocol (Verified)
The agent_framework.BaseChatClient requires implementing 2 abstract methods:
class HuggingFaceChatClient(BaseChatClient):
"""Adapter for HuggingFace Inference API."""
async def _inner_get_response(
self,
messages: list[ChatMessage],
**kwargs
) -> ChatResponse:
"""Synchronous response generation."""
...
async def _inner_get_streaming_response(
self,
messages: list[ChatMessage],
**kwargs
) -> AsyncIterator[ChatResponseUpdate]:
"""Streaming response generation."""
...
Required Config Changes
BEFORE implementation, add to src/utils/config.py:
# Settings class additions:
gemini_api_key: str | None = Field(default=None, description="Google Gemini API key")
@property
def has_gemini_key(self) -> bool:
"""Check if Gemini API key is available."""
return bool(self.gemini_api_key)
Files to Modify (Complete List)
Category 1: OpenAIChatClient References (25 total)
| File | Lines | Changes Required |
|---|---|---|
src/orchestrators/advanced.py |
31, 70, 95, 101, 122 | Replace with get_chat_client() |
src/agents/magentic_agents.py |
4, 17, 29, 58, 70, 117, 129, 161, 173 | Change type hints to BaseChatClient |
src/agents/retrieval_agent.py |
5, 53, 62 | Change type hints to BaseChatClient |
src/agents/code_executor_agent.py |
7, 43, 52 | Change type hints to BaseChatClient |
src/utils/llm_factory.py |
19, 22, 35, 38, 42 | Merge into clients/factory.py |
Category 2: Anthropic References (46 total - Issue #110)
| File | Refs | Changes Required |
|---|---|---|
src/agent_factory/judges.py |
10 | Remove Anthropic imports and fallback |
src/utils/config.py |
10 | Remove anthropic_api_key, anthropic_model, has_anthropic_key |
src/utils/llm_factory.py |
10 | Remove Anthropic model creation |
src/app.py |
12 | Remove Anthropic key detection and UI |
src/orchestrators/simple.py |
2 | Remove Anthropic mentions |
src/agents/hypothesis_agent.py |
1 | Update comment |
Category 3: Files to Delete (Phase 3)
| File | Lines | Reason |
|---|---|---|
src/orchestrators/simple.py |
778 | Replaced by unified Advanced Mode |
src/tools/search_handler.py |
219 | Manager agent handles orchestration |
Total deletion: ~997 lines Total addition: ~400 lines (new clients) Net: ~600 fewer lines, single architecture
Migration Plan
Phase 1: Neutralize Namespace & Add HuggingFace
- Add
gemini_api_keyandhas_gemini_keytosrc/utils/config.py - Create
src/clients/package - Implement
HuggingFaceChatClientadapter (~150 lines) - Implement
ChatClientFactory(~50 lines) - Refactor
AdvancedOrchestratorto useget_chat_client() - Update type hints in
magentic_agents.py,retrieval_agent.py,code_executor_agent.py - Merge
llm_factory.pyfunctionality intoclients/factory.py
Phase 2: Simplify Provider Chain (Issue #110)
- Remove Anthropic from
judges.py(10 refs) - Remove Anthropic from
config.py(10 refs) - Remove Anthropic from
llm_factory.py(10 refs) - Remove Anthropic from
app.py(12 refs) - Update user-facing strings mentioning Anthropic
- (Future) Implement
GeminiChatClient(~200 lines)
Phase 3: Deprecate Simple Mode (Issue #105)
- Update
src/orchestrators/factory.pyto use unifiedAdvancedOrchestrator - Delete
src/orchestrators/simple.py(778 lines) - Delete
src/tools/search_handler.py(219 lines) - Update tests to only test Advanced Mode
- Archive deleted files to
docs/archive/for reference
Why This is "Elegant"
- One System: We stop maintaining two parallel universes.
- Dependency Injection: The specific LLM provider is injected, not hardcoded.
- Full-Stack Alignment: We prioritize providers (OpenAI, Gemini) that own the whole vertical (LLM + Embeddings), reducing environment complexity.
Verification Checklist (For Implementer)
Before starting implementation, verify:
-
agent_framework.BaseChatClientexists (verified:agent_framework._clients.BaseChatClient) - Abstract methods:
_inner_get_response,_inner_get_streaming_response -
agent_framework.ChatResponse,ChatResponseUpdate,ChatMessageimportable -
settings.has_openai_keyexists (line 118) -
settings.has_gemini_keyMUST BE ADDED (does not exist) -
settings.gemini_api_keyMUST BE ADDED (does not exist)
References
- Microsoft Agent Framework:
agent_framework.BaseChatClient - Gemini API: Embeddings + LLM
- HuggingFace Inference:
huggingface_hub.InferenceClient - Issue #105: Deprecate Simple Mode
- Issue #109: Simplify Provider Architecture
- Issue #110: Remove Anthropic Provider Support