OpenTinker: Separating Concerns in Agentic Reinforcement Learning Paper • 2601.07376 • Published 17 days ago • 6
Benchmarking Scientific Understanding and Reasoning for Video Generation using VideoScience-Bench Paper • 2512.02942 • Published Dec 2, 2025 • 5
Fast and Accurate Causal Parallel Decoding using Jacobi Forcing Paper • 2512.14681 • Published Dec 16, 2025 • 42
Multi-Agent Evolve: LLM Self-Improve through Co-evolution Paper • 2510.23595 • Published Oct 27, 2025 • 12
Redco: A Lightweight Tool to Automate Distributed Training of LLMs on Any GPU/TPUs Paper • 2310.16355 • Published Oct 25, 2023
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena Paper • 2306.05685 • Published Jun 9, 2023 • 39
Toward Inference-optimal Mixture-of-Expert Large Language Models Paper • 2404.02852 • Published Apr 3, 2024
LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch Paper • 2501.07124 • Published Jan 13, 2025
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective Paper • 2506.14965 • Published Jun 17, 2025 • 49
Efficient Long-context Language Model Training by Core Attention Disaggregation Paper • 2510.18121 • Published Oct 20, 2025 • 123
Stronger Together: On-Policy Reinforcement Learning for Collaborative LLMs Paper • 2510.11062 • Published Oct 13, 2025 • 29
GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare Paper • 2510.08872 • Published Oct 10, 2025 • 4
Scaling Speculative Decoding with Lookahead Reasoning Paper • 2506.19830 • Published Jun 24, 2025 • 13
TrimLLM: Progressive Layer Dropping for Domain-Specific LLMs Paper • 2412.11242 • Published Dec 15, 2024 • 1
ReFoRCE: A Text-to-SQL Agent with Self-Refinement, Format Restriction, and Column Exploration Paper • 2502.00675 • Published Feb 2, 2025 • 2