Scaling Law Discovery Collection Dataset and results for SLD (https://arxiv.org/abs/2507.21184) • 2 items • Updated 18 days ago • 2
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models Paper • 2601.15165 • Published 5 days ago • 63
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks Paper • 2412.15204 • Published Dec 19, 2024 • 38
CIMemories: A Compositional Benchmark for Contextual Integrity of Persistent Memory in LLMs Paper • 2511.14937 • Published Nov 18, 2025 • 1
Foundation-Sec-8B Collection Foundation-Sec-8B models and quantizations. • 7 items • Updated Nov 20, 2025 • 6
GPT-OSS Pruned Experts (4.2B-20B) [IF, Science, Math, etc.] Collection Complete collection of domain-specialized GPT-OSS models (1-32 experts) optimized for science, math, medicine, law, safety, and instruction following. • 8 items • Updated Aug 13, 2025 • 10
Tool Use Reasoning Collection A collection of tool use reasoning datasets in Hermes format • 5 items • Updated Jul 23, 2025 • 9
GLiNER-PII Collection PII detection models developed in collaboration with Wordcab • 5 items • Updated Sep 24, 2025 • 21
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published Sep 2, 2025 • 84
BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining Paper • 2508.10975 • Published Aug 14, 2025 • 60
OPT Collection OPT (Open Pretrained Transformer) is a series of open-source large causal language models that perform similarly to GPT-3. • 12 items • Updated Nov 21, 2024 • 8
💧 LFM2 Collection LFM2 is a new generation of hybrid models, designed for on-device deployment. • 27 items • Updated 14 days ago • 136