Speed: 4-bit default on Spaces, SDPA option, lower token limits; CUDA greedy fix b904a07 SebAustin committed on Feb 24