AMA-Bench Leaderboard: Explore and compare AI model performance with interactive charts
Benchmarking Scientific Understanding and Reasoning for Video Generation using VideoScience-Bench Paper • 2512.02942 • Published Dec 2, 2025 • 5
Fast and Accurate Causal Parallel Decoding using Jacobi Forcing Paper • 2512.14681 • Published Dec 16, 2025 • 42
Stronger Together: On-Policy Reinforcement Learning for Collaborative LLMs Paper • 2510.11062 • Published Oct 13, 2025 • 29
lmgame-Bench: How Good are LLMs at Playing Games? Paper • 2505.15146 • Published May 21, 2025 • 20
TrimLLM: Progressive Layer Dropping for Domain-Specific LLMs Paper • 2412.11242 • Published Dec 15, 2024 • 1
ReFoRCE: A Text-to-SQL Agent with Self-Refinement, Format Restriction, and Column Exploration Paper • 2502.00675 • Published Feb 2, 2025 • 2
GameArena: Evaluating LLM Reasoning through Live Computer Games Paper • 2412.06394 • Published Dec 9, 2024 • 1
PockEngine: Sparse and Efficient Fine-tuning in a Pocket Paper • 2310.17752 • Published Oct 26, 2023 • 15
Optimizing Speculative Decoding for Serving Large Language Models Using Goodput Paper • 2406.14066 • Published Jun 20, 2024 • 3