Commit History

Cache causal mask for faster inference
89ad9ef
verified

LisaMegaWatts committed on

Fix completion_tokens: count tokens not decoded characters
80c1b86
verified

LisaMegaWatts committed on

Switch to HTTP.serve (non-streaming handler) for proxy compatibility
d353df1
verified

LisaMegaWatts committed on

Fix: add Content-Length for non-streaming responses, X-Accel-Buffering for SSE
6c7eeb4
verified

LisaMegaWatts committed on

Upload server.jl with huggingface_hub
10ad6b5
verified

LisaMegaWatts committed on

Upload folder using huggingface_hub
18db312
verified

LisaMegaWatts committed on