Fix f-string syntax error in streaming endpoint c53e66f Andrew McCracken Claude committed on Oct 14, 2025
Configure uvicorn for concurrent request handling 6b0a701 Andrew McCracken Claude committed on Oct 14, 2025
Add concurrent request handling with model pool efd4459 Andrew McCracken Claude committed on Oct 14, 2025
Revert to simpler configuration - optimizations caused slowdown 457c9e1 Andrew McCracken Claude committed on Oct 13, 2025
Optimize model parameters for faster CPU inference 6e83384 Andrew McCracken Claude committed on Oct 13, 2025
Fix: Force use of pre-built llama-cpp-python wheels a922ca8 Andrew McCracken committed on Oct 13, 2025
Fix: Install llama-cpp-python at runtime (HF Spaces workaround) 3fad655 Andrew McCracken committed on Oct 13, 2025