prompt-prix / tests

Commit History

Add multi-adapter registry with CompositeAdapter and Together AI adapter
82035cc

3v324v23 Claude Opus 4.6 committed on

Fix #149: MCP protocol server — FastMCP over stdio for agent access
f4026ca

3v324v23 Claude Opus 4.6 committed on

Fix #148: Code review fixes for ADR-013 implementation
431cfcc

3v324v23 Claude Opus 4.6 committed on

Fix #148: Semantic-chunker MCP primitives (geometry, trajectory)
94b5a0c

3v324v23 Claude Opus 4.6 committed on

Fix #145: Replace ReactRunner with execute_test_case() dispatch
25b2fb7

3v324v23 Claude Opus 4.6 committed on

Fix #144: Extract stateless react_step() MCP primitive from react_execute()
0a70295

3v324v23 Claude Opus 4.6 committed on

Fix #143: Add ReactRunner orchestration and handler routing
0126e0a

3v324v23 Claude Opus 4.6 committed on

Fix #142: Extend BenchmarkCase with mode, mock_tools, max_iterations
f8dac5d

3v324v23 Claude Opus 4.6 committed on

Fix #141: Implement react_execute() MCP tool for ReAct loop evaluation
8722590

3v324v23 Claude Opus 4.6 committed on

Fix #138: Cache drift availability check and add sibling repo fallback
857e48b

3v324v23 Claude Opus 4.6 committed on

Fix #137: Port cycle detection and ReAct schemas from LAS
05a6bc3

3v324v23 Claude Opus 4.6 committed on

Fix #136: Surface structured tool calls via __TOOL_CALLS__ sentinel
700c0b3

3v324v23 Claude Opus 4.6 committed on

Fix #135: Support pre-defined multi-turn conversation history in BenchmarkCase
eb60610

3v324v23 Claude Opus 4.6 committed on

Pipelined judge scheduling (#131) and embedding-based drift validation (#132)
3997332

3v324v23 Claude Opus 4.6 committed on

Prevent JIT model swap mid-stream on multi-GPU setups
c78c0ca

3v324v23 Claude Opus 4.6 committed on

Allow concurrent requests per server (#129)
2311766

3v324v23 Claude Opus 4.6 committed on

Add HuggingFace adapter and Spaces deployment
d22fc48

3v324v23 Claude Opus 4.5 committed on

Document semantic validation and promptfoo handling
d6dece8

3v324v23 Claude Opus 4.5 committed on

Add verdict validation to semantic validator
24d06cd

3v324v23 Claude Opus 4.5 committed on

Fix #123: Map expected_verdict and category from promptfoo vars
c6c76c3

3v324v23 Claude Opus 4.5 committed on

Fix #121: Remove queue timeout to prevent cascade failures
7cbd42b

3v324v23 Claude Opus 4.5 committed on

Support promptfoo prompt objects (id, label, raw)
4b29c33

3v324v23 Claude Opus 4.5 committed on

Fix #120: Judge dropdown ignores Only Loaded filter
2606078

3v324v23 Claude Opus 4.5 committed on

Fix #119: Only Loaded filter returns all models when none loaded
11ffe3d

3v324v23 Claude Opus 4.5 committed on

Fix #118: Return inference latency from adapter layer
92fac74

3v324v23 Claude Opus 4.5 committed on

Fix #110: Update tests for server affinity removal
81342af

3v324v23 Claude Opus 4.5 committed on

Fix #110: Remove server affinity prefix system
1742886

3v324v23 Claude Opus 4.5 committed on

Fix #106: Update test_server_affinity to use InferenceTask
3e9d894

3v324v23 Claude Opus 4.5 committed on

Add focused GPU1 routing tests
f80d685

3v324v23 Claude Opus 4.5 committed on

Fix #104: Remove battery semaphore to enable parallel GPU execution
92445e3

3v324v23 Claude Opus 4.5 committed on

fix: Remove per-request manifest refresh serialization
07e54a6

3v324v23 Claude Opus 4.5 committed on

Fix dispatcher race condition and refactor for type safety
7e92be5

3v324v23 committed on

fix: Remove adapter-level lock for parallel multi-GPU dispatch (#104)
232637b

3v324v23 Claude Opus 4.5 committed on

feat: Add PromptfooLoader for YAML test files
9c6a747

3v324v23 Claude Opus 4.5 committed on

feat: Wire judge() MCP primitive into BatteryRunner (#102)
f91d1c9

3v324v23 Claude Opus 4.5 committed on

test: Server affinity parsing + LMStudioAdapter boundary tests
233847b

3v324v23 Claude Opus 4.5 committed on

refactor: Rename Test* classes to avoid pytest collection warnings
e977556

3v324v23 Claude Opus 4.5 committed on

feat: Complete multi-GPU parallel execution with server affinity
609b6be

3v324v23 Claude Opus 4.5 committed on

wip: Multi-GPU parallel execution with server affinity
c3c1305

3v324v23 Claude Opus 4.5 committed on

fix: Battery loop order + adapter race condition
903dbc7

3v324v23 Claude Opus 4.5 committed on

feat: Column-per-GPU model selector
faea261

3v324v23 Claude Opus 4.5 committed on

test: ADR-006 test suite alignment - mock at layer boundaries
6010b71

3v324v23 Claude Opus 4.5 committed on

refactor: ADR-006 Battery tab - MCP registry pattern
b79ff2c

3v324v23 Claude Opus 4.5 committed on

refactor: Remove Gemini CLI, update tests to mock MCP layer
884bd62

3v324v23 Claude Opus 4.5 committed on

chore: Remove obsolete Gemini/FARA adapter code
f9f70a0

3v324v23 Claude Opus 4.5 committed on

refactor: Adapter pattern - LMStudioAdapter owns httpx, ComparisonSession uses MCP
f36c1bf

3v324v23 Claude Opus 4.5 committed on

refactor(ui): UI Surgery - shared header, remove Stability tab
cf52362

3v324v23 Claude Opus 4.5 committed on

refactor: BatteryRunner owns pool directly, uses list_models MCP
eef452b

3v324v23 Claude Opus 4.5 committed on

feat(mcp): Add complete and judge primitives, rename dispatcher
d68a3c8

3v324v23 Claude Opus 4.5 committed on

feat(mcp): Add list_models MCP tool - first primitive
416626b

3v324v23 Claude Opus 4.5 committed on