Add multi-adapter registry with CompositeAdapter and Together AI adapter 82035cc 3v324v23 Claude Opus 4.6 committed on Feb 9
Fix #149: MCP protocol server — FastMCP over stdio for agent access f4026ca 3v324v23 Claude Opus 4.6 committed on Feb 8
Fix #148: Code review fixes for ADR-013 implementation 431cfcc 3v324v23 Claude Opus 4.6 committed on Feb 8
Fix #148: Semantic-chunker MCP primitives (geometry, trajectory) 94b5a0c 3v324v23 Claude Opus 4.6 committed on Feb 8
Fix #145: Replace ReactRunner with execute_test_case() dispatch 25b2fb7 3v324v23 Claude Opus 4.6 committed on Feb 8
Fix #144: Extract stateless react_step() MCP primitive from react_execute() 0a70295 3v324v23 Claude Opus 4.6 committed on Feb 7
Fix #143: Add ReactRunner orchestration and handler routing 0126e0a 3v324v23 Claude Opus 4.6 committed on Feb 7
Fix #142: Extend BenchmarkCase with mode, mock_tools, max_iterations f8dac5d 3v324v23 Claude Opus 4.6 committed on Feb 7
Fix #141: Implement react_execute() MCP tool for ReAct loop evaluation 8722590 3v324v23 Claude Opus 4.6 committed on Feb 7
Fix #138: Cache drift availability check and add sibling repo fallback 857e48b 3v324v23 Claude Opus 4.6 committed on Feb 7
Fix #137: Port cycle detection and ReAct schemas from LAS 05a6bc3 3v324v23 Claude Opus 4.6 committed on Feb 7
Fix #136: Surface structured tool calls via __TOOL_CALLS__ sentinel 700c0b3 3v324v23 Claude Opus 4.6 committed on Feb 7
Fix #135: Support pre-defined multi-turn conversation history in BenchmarkCase eb60610 3v324v23 Claude Opus 4.6 committed on Feb 7
Pipelined judge scheduling (#131) and embedding-based drift validation (#132) 3997332 3v324v23 Claude Opus 4.6 committed on Feb 7
Prevent JIT model swap mid-stream on multi-GPU setups c78c0ca 3v324v23 Claude Opus 4.6 committed on Feb 7
Document semantic validation and promptfoo handling d6dece8 3v324v23 Claude Opus 4.5 committed on Feb 1
Fix #123: Map expected_verdict and category from promptfoo vars c6c76c3 3v324v23 Claude Opus 4.5 committed on Jan 31
Fix #121: Remove queue timeout to prevent cascade failures 7cbd42b 3v324v23 Claude Opus 4.5 committed on Jan 31
Support promptfoo prompt objects (id, label, raw) 4b29c33 3v324v23 Claude Opus 4.5 committed on Jan 30
Fix #120: Judge dropdown ignores Only Loaded filter 2606078 3v324v23 Claude Opus 4.5 committed on Jan 28
Fix #119: Only Loaded filter returns all models when none loaded 11ffe3d 3v324v23 Claude Opus 4.5 committed on Jan 28
Fix #118: Return inference latency from adapter layer 92fac74 3v324v23 Claude Opus 4.5 committed on Jan 28
Fix #110: Update tests for server affinity removal 81342af 3v324v23 Claude Opus 4.5 committed on Jan 24
Fix #106: Update test_server_affinity to use InferenceTask 3e9d894 3v324v23 Claude Opus 4.5 committed on Jan 24
Fix #104: Remove battery semaphore to enable parallel GPU execution 92445e3 3v324v23 Claude Opus 4.5 committed on Jan 24
fix: Remove per-request manifest refresh serialization 07e54a6 3v324v23 Claude Opus 4.5 committed on Jan 24
fix: Remove adapter-level lock for parallel multi-GPU dispatch (#104) 232637b 3v324v23 Claude Opus 4.5 committed on Jan 24
feat: Wire judge() MCP primitive into BatteryRunner (#102) f91d1c9 3v324v23 Claude Opus 4.5 committed on Jan 24
test: Server affinity parsing + LMStudioAdapter boundary tests 233847b 3v324v23 Claude Opus 4.5 committed on Jan 23
refactor: Rename Test* classes to avoid pytest collection warnings e977556 3v324v23 Claude Opus 4.5 committed on Jan 23
feat: Complete multi-GPU parallel execution with server affinity 609b6be 3v324v23 Claude Opus 4.5 committed on Jan 23
wip: Multi-GPU parallel execution with server affinity c3c1305 3v324v23 Claude Opus 4.5 committed on Jan 23
test: ADR-006 test suite alignment - mock at layer boundaries 6010b71 3v324v23 Claude Opus 4.5 committed on Jan 23
refactor: ADR-006 Battery tab - MCP registry pattern b79ff2c 3v324v23 Claude Opus 4.5 committed on Jan 23
refactor: Remove Gemini CLI, update tests to mock MCP layer 884bd62 3v324v23 Claude Opus 4.5 committed on Jan 23
refactor: Adapter pattern - LMStudioAdapter owns httpx, ComparisonSession uses MCP f36c1bf 3v324v23 Claude Opus 4.5 committed on Jan 23
refactor(ui): UI Surgery - shared header, remove Stability tab cf52362 3v324v23 Claude Opus 4.5 committed on Jan 22
refactor: BatteryRunner owns pool directly, uses list_models MCP eef452b 3v324v23 Claude Opus 4.5 committed on Jan 20
feat(mcp): Add complete and judge primitives, rename dispatcher d68a3c8 3v324v23 Claude Opus 4.5 committed on Jan 20
feat(mcp): Add list_models MCP tool - first primitive 416626b 3v324v23 Claude Opus 4.5 committed on Jan 19