Efficient Context Scaling with LongCat ZigZag Attention Paper • 2512.23966 • Published 30 days ago • 7
MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives Paper • 2512.14699 • Published Dec 16, 2025 • 28
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published Dec 1, 2025 • 102
Huxley-Gödel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine Paper • 2510.21614 • Published Oct 24, 2025 • 22
MemMamba: Rethinking Memory Patterns in State Space Model Paper • 2510.03279 • Published Sep 28, 2025 • 72
Artificial Hippocampus Networks for Efficient Long-Context Modeling Paper • 2510.07318 • Published Oct 8, 2025 • 31
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 506
EmbeddingGemma: Powerful and Lightweight Text Representations Paper • 2509.20354 • Published Sep 24, 2025 • 44
Tiny Language Model Datasets Collection Collection of Synthetic Datasets that can be used in pretraining of any the Tiny Language Model • 14 items • Updated 2 days ago • 29
Energy-Based Transformers are Scalable Learners and Thinkers Paper • 2507.02092 • Published Jul 2, 2025 • 69
view article Article KV Caching Explained: Optimizing Transformer Inference Efficiency Jan 30, 2025 • 223
Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning Paper • 2507.14137 • Published Jul 18, 2025 • 35