From Bytes to Ideas: Language Modeling with Autoregressive U-Nets Paper • 2506.14761 • Published Jun 17
Dynamic Chunking for End-to-End Hierarchical Sequence Modeling Paper • 2507.07955 • Published Jul 10