Commit History

ULTRA SPEED: 8-bit quantization, greedy decoding, 40 tokens, inference_mode
c46fe44

Ilke Ileri committed on

Optimize for speed: reduce to 80 tokens, lower sampling params, add timing logs
190133f

Ilke Ileri committed on

Improve conversation quality: use full history, increase tokens to 150, better sampling
39bb917

Ilke Ileri committed on

Add /v1/chat/completions route for Vapi compatibility
2311595

Ilke Ileri committed on

Optimize for speed: max_tokens=50, greedy decoding for real-time voice
b9e9889

Ilke Ileri committed on

Add debug endpoint and improve error handling for Vapi troubleshooting
88b53d1

Ilke Ileri committed on

Reduce max_tokens to 100 for faster response to prevent Vapi timeout
d54cae5

Ilke Ileri committed on

Add streaming support for Vapi compatibility
0ccd1fa

Ilke Ileri committed on

Fix: remove global counter causing Space crash
c6b30d3

Ilke Ileri committed on

Add request counter to track incoming Vapi requests
dcb9815

Ilke Ileri committed on

Fix: properly extract user messages from Vapi conversation history
0385f33

Ilke Ileri committed on

Enhance request logging for better Vapi debugging
260002e

Ilke Ileri committed on

Add keyword-based sales filter to redirect off-topic questions
7242003

Ilke Ileri committed on

Fix: revert system prompt, add full OpenAI-compatible response format
a1e3c35

Ilke Ileri committed on

Add system prompt guard to enforce sales-only responses
a73c020

Ilke Ileri committed on

Fix CORS issues and add request logging for Vapi integration
e48c956

Ilke Ileri committed on

Improve response formatting and generation parameters
e683a4c

Ilke Ileri committed on

Add HF_TOKEN support for gated model access
25839d0

Ilke Ileri committed on

Add Vapi Gemma API application
692ef6b

Ilke Ileri committed on

initial commit
3ea09f6

ilkeileri committed on