Deeper is Not Always Better: Mitigating the Alignment Tax via Confident Layer Decoding • Paper • 2606.21906 • Published 15 days ago • 24
DFlash • Collection • Block Diffusion for Flash Speculative Decoding • 23 items • Updated 6 days ago • 142
Magic Quant • Collection • MagicQuant is a benchmark-driven GGUF evaluation and hybrid-discovery system. https://github.com/magiccodingman/MagicQuant-Wiki • 5 items • Updated May 26 • 33
Unsloth Dynamic 2.0 Quants • Collection • New 2.0 version of our Dynamic GGUF + Quants. Dynamic 2.0 achieves superior accuracy & SOTA quantization performance. • 107 items • Updated 6 days ago • 746