QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13 • 176
KORMo pretraining datasets Collection The pretraining datasets for KORMo-10B were collected from diverse, publicly available sources. • 14 items • Updated Oct 13 • 19
Tri Series Collection Introducing our new series of models: Tri-7B, Tri-21B, and Tri-70B-preview-SFT • 10 items • Updated Sep 10 • 8
Article Efficient LLM Pretraining: Packed Sequences and Masked Attention • Published Oct 7, 2024 • 61
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published Apr 17 • 93
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28 • 123
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code Paper • 2410.08196 • Published Oct 10, 2024 • 47
VLR-Bench: Multilingual Benchmark Dataset for Vision-Language Retrieval Augmented Generation Paper • 2412.10151 • Published Dec 13, 2024 • 7
Transformer Explainer: Interactive Learning of Text-Generative Models Paper • 2408.04619 • Published Aug 8, 2024 • 172
Understanding Reference Policies in Direct Preference Optimization Paper • 2407.13709 • Published Jul 18, 2024 • 17
X-LLaVA: Optimizing Bilingual Large Vision-Language Alignment Paper • 2403.11399 • Published Mar 18, 2024 • 6
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding Paper • 2404.16710 • Published Apr 25, 2024 • 80