MonarchSLM / server.jl

Commit History

Cache Monarch matrices + causal mask for faster inference
76b7110
verified

LisaMegaWatts committed on

Fix completion_tokens: count tokens not decoded characters
f0aedd4
verified

LisaMegaWatts committed on

Upload server.jl with huggingface_hub
91c86b7
verified

LisaMegaWatts committed on