backend / main.py

Commit History

Update main.py
f0edfa1
verified

Rofati committed on

Update main.py
ce81f75
verified

Rofati committed on

Update main.py
c8a9818
verified

Rofati committed on

Update main.py
ce782c7
verified

Rofati committed on

Update main.py
53f615a
verified

Rofati committed on

Update main.py
468632b
verified

Rofati committed on

Update main.py
d3accc5
verified

Rofati committed on

Revert to Llama-3.2-1B Q8_0 (proven 10-24s responses): fastest working config
89f8315
verified

Rofati committed on

Switch to SmolLM2-360M Q8_0 (386MB = 3.4x smaller = 3x faster inference)
f2f20cb
verified

Rofati committed on

Revert main.py to exact original working code + n_batch=512 speedup
65eb6d3
verified

Rofati committed on

main.py: use Q4_K_M path, optimized for OpenBLAS build
5214ad9
verified

Rofati committed on

Revert main.py to working config with speed tweaks (n_batch=512, max_tokens=100, only last 2 msgs)
3300482
verified

Rofati committed on

Ultra-minimal: max_tokens=40, n_ctx=256 for fastest possible inference
13ac85e
verified

Rofati committed on

Add non-streaming /api/chat/sync endpoint + keep SSE with heartbeat
63473f8
verified

Rofati committed on

Fix: send immediate SSE heartbeat to prevent proxy timeout
d0cd401
verified

Rofati committed on

main.py: Llama-3.2-1B Q4_K_M, optimized for speed (n_batch=512, max_tokens=100, streaming)
6be260f
verified

Rofati committed on

Fix: pre-fill think block in prompt so model starts answering immediately
add811b
verified

Rofati committed on

Fix timeout: max_tokens=80, ensure response completes within proxy timeout
7bb0a62
verified

Rofati committed on

Fix: handle Qwen3 think mode (skip think tokens, emit only real content)
5376cea
verified

Rofati committed on

Fix: remove think block, use direct prompt without thinking mode
ffefba6
verified

Rofati committed on

Optimized main.py: Qwen3-0.6B, /no_think, n_batch=512, max_tokens=100
6ed7384
verified

Rofati committed on

Update main.py
86cfd84
verified

Rofati committed on

Update main.py
c613ab4
verified

Rofati committed on

Update main.py
010322d
verified

Rofati committed on

Update main.py
dee3ef1
verified

Rofati committed on

Update main.py
2f3ca05
verified

Rofati committed on

Update main.py
e9fc6d5
verified

Rofati committed on

Update main.py
fbcf15c
verified

Rofati committed on

Update main.py
5968bfd
verified

Rofati committed on

Update main.py
bf8daad
verified

Rofati committed on

Update main.py
d64a731
verified

Rofati committed on

Update main.py
49782c6
verified

Rofati committed on

Update main.py
317c1b6
verified

Rofati committed on

Create main.py
4e80d7e
verified

Rofati committed on

Delete main.py
b893a8e
verified

Rofati committed on

Update main.py
5f454e4
verified

Rofati committed on

Create main.py
9ab5822
verified

Rofati committed on