Rethinking Shrinkage Bias in LLM FP4 Pretraining: Geometric Origin, Systemic Impact, and UFP4 Recipe • Paper • 2606.20381 • Published 6 days ago • 9
DFlash • Collection • Block Diffusion for Flash Speculative Decoding • 22 items • Updated 9 days ago • 132