Cache Monarch matrices + causal mask for faster inference 76b7110 verified LisaMegaWatts committed about 1 month ago
Fix completion_tokens: count tokens not decoded characters f0aedd4 verified LisaMegaWatts committed on Feb 26