ULTRA SPEED: 8-bit quantization, greedy decoding, 40 tokens, inference_mode c46fe44 Ilke Ileri committed 26 days ago
Optimize for speed: reduce to 80 tokens, lower sampling params, add timing logs 190133f Ilke Ileri committed 26 days ago
Improve conversation quality: use full history, increase tokens to 150, better sampling 39bb917 Ilke Ileri committed 26 days ago
Optimize for speed: max_tokens=50, greedy decoding for real-time voice b9e9889 Ilke Ileri committed 26 days ago
Add debug endpoint and improve error handling for Vapi troubleshooting 88b53d1 Ilke Ileri committed 26 days ago
Reduce max_tokens to 100 for faster response to prevent Vapi timeout d54cae5 Ilke Ileri committed 26 days ago
Fix: properly extract user messages from Vapi conversation history 0385f33 Ilke Ileri committed 26 days ago
Add keyword-based sales filter to redirect off-topic questions 7242003 Ilke Ileri committed 26 days ago
Fix: revert system prompt, add full OpenAI-compatible response format a1e3c35 Ilke Ileri committed 26 days ago
Fix CORS issues and add request logging for Vapi integration e48c956 Ilke Ileri committed 26 days ago