Article CyberSecQwen-4B: Why Defensive Cyber Needs Small, Specialized, Locally-Runnable Models • about 19 hours ago • 4
Article EMO: Pretraining mixture of experts for emergent modularity • about 20 hours ago • 17
ÜberWeb: Insights from Multilingual Curation for a 20-Trillion-Token Dataset Paper • 2602.15210 • Published Feb 25 • 1
Kakugo: Distillation of Low-Resource Languages into Small Language Models Paper • 2601.14051 • Published Jan 20 • 1
Make-it-Real: Unleashing Large Multimodal Model's Ability for Painting 3D Objects with Realistic Materials Paper • 2404.16829 • Published Apr 25, 2024 • 5
Gamayun's Path to Multilingual Mastery: Cost-Efficient Training of a 1.5B-Parameter LLM Paper • 2512.21580 • Published Dec 25, 2025 • 9
LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans Paper • 2507.02861 • Published Jul 3, 2025 • 3
SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time Paper • 2512.25075 • Published Dec 31, 2025 • 16
Draft-Thinking: Learning Efficient Reasoning in Long Chain-of-Thought LLMs Paper • 2603.00578 • Published Feb 28 • 2
DataDecide: How to Predict Best Pretraining Data with Small Experiments Paper • 2504.11393 • Published Apr 15, 2025 • 20
QuaDMix: Quality-Diversity Balanced Data Selection for Efficient LLM Pretraining Paper • 2504.16511 • Published Apr 23, 2025 • 23
BARE: Combining Base and Instruction-Tuned Language Models for Better Synthetic Data Generation Paper • 2502.01697 • Published Feb 3, 2025 • 2
LEAD: Iterative Data Selection for Efficient LLM Instruction Tuning Paper • 2505.07437 • Published May 12, 2025 • 2
SCAR: Efficient Instruction-Tuning for Large Language Models via Style Consistency-Aware Response Ranking Paper • 2406.10882 • Published Jun 16, 2024 • 3
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation Paper • 2402.18191 • Published Feb 28, 2024 • 2
Improving Influence-based Instruction Tuning Data Selection for Balanced Learning of Diverse Capabilities Paper • 2501.12147 • Published Jan 21, 2025 • 2