Agent LLM Integration — Real Inference via Adapters
What Changed
All reasoning agents in Codette now use real LLM inference via trained LoRA adapters instead of template substitution.
Before
# Template-based (generic)
def analyze(self, concept: str) -> str:
template = self.select_template(concept)
return template.replace("{concept}", concept)
Problem: Agents generated the same generic text for ANY concept, just with the concept name substituted. This produced non-specific, often contradictory reasoning that actually reduced correctness in debate.
After
# LLM-based (specific)
def analyze(self, concept: str) -> str:
if self.orchestrator and self.adapter_name:
# Call LLM with this agent's specific adapter
return self._analyze_with_llm(concept)
# Fallback to templates if LLM unavailable
return self._analyze_with_template(concept)
Benefit: Agents now reason using the actual concept content, generating domain-specific insights that strengthen debate quality.
Files Modified
Core Agent Files
reasoning_forge/agents/base_agent.py- Added
orchestratorparameter to__init__ - Implemented
_analyze_with_llm()for real inference - Kept
_analyze_with_template()as fallback analyze()now tries LLM first, falls back to templates
- Added
All agent subclasses: Added
adapter_nameattributenewton_agent.py:adapter_name = "newton"quantum_agent.py:adapter_name = "quantum"davinci_agent.py:adapter_name = "davinci"philosophy_agent.py:adapter_name = "philosophy"empathy_agent.py:adapter_name = "empathy"ethics_agent.py:adapter_name = "philosophy"(shared)critic_agent.py:adapter_name = "multi_perspective"+ newevaluate_ensemble_with_llm()method
Orchestrator Integration
reasoning_forge/forge_engine.py- Added
orchestratorparameter to__init__ - Lazy-loads
CodetteOrchestratorif not provided - Passes orchestrator to all agent constructors
- Graceful fallback to template mode if LLM unavailable
- Added
How It Works
Startup Flow
ForgeEngine.__init__()
→ Lazy-load CodetteOrchestrator (first call ~60s)
→ Instantiate agents with orchestrator
→ forge_with_debate(query)
→ For each agent: agent.analyze(concept)
→ If orchestrator available: Call LLM with adapter
→ Else: Use templates (backward compatible)
LLM Inference Flow
agent.analyze(concept)
1. Check: do we have orchestrator + adapter_name?
2. If yes: orchestrator.generate(
query=concept,
adapter_name="newton", # Newton-specific reasoning
system_prompt=template, # Guides the reasoning
enable_tools=False
)
3. If no: Fall back to template substitution
4. Return domain-specific analysis
Adapter Mapping
| Agent | Adapter | Purpose |
|---|---|---|
| Newton | newton |
Physics, mathematics, causal reasoning |
| Quantum | quantum |
Probabilistic, uncertainty, superposition |
| DaVinci | davinci |
Creative invention, cross-domain synthesis |
| Philosophy | philosophy |
Epistemology, ontology, conceptual foundations |
| Empathy | empathy |
Emotional intelligence, human impact |
| Ethics | philosophy |
Moral reasoning, consequences (shared adapter) |
| Critic | multi_perspective |
Meta-evaluation, ensemble critique |
Testing
Run the integration test:
python test_agent_llm_integration.py
This verifies:
- ForgeEngine loads with orchestrator
- Agents receive orchestrator instance
- Single agent generates real LLM response
- Multi-agent ensemble works
- Debate mode produces coherent synthesis
Performance Impact
- First debate: ~60s (orchestrator initialization)
- Subsequent debates: ~30-60s (LLM inference time)
- Agent initialization: <1ms (orchestrator already loaded)
Backward Compatibility
If the LLM/orchestrator is unavailable:
- ForgeEngine logs a warning
- Agents automatically fall back to templates
- System continues to work (with lower quality)
This allows:
- Testing without the LLM loaded
- Fast template-based iteration
- Graceful degradation
Expected Quality Improvements
With real LLM-based agents:
- Correctness: Should increase (domain-specific reasoning)
- Depth: Should increase (richer debate fuel)
- Synthesis: Should improve (agents actually understand concepts)
- Contradictions: Should decrease (coherent reasoning per adapter)
Next Steps
- Run
test_agent_llm_integration.pyto verify setup - Run evaluation:
python evaluation/run_evaluation_sprint.py --questions 5 - Compare results to previous template-based baseline
- Iterate on Phase 6 control mechanisms with real agents
Files Available
- Test:
test_agent_llm_integration.py— Integration validation - Models:
- Base:
bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf - Adapters:
adapters/*.gguf(8 LoRA adapters, ~27 MB each) - Alternative:
hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF/llama-3.2-1b-instruct-q8_0.gguf
- Base: