Project-O1

community

Recent Activity

zsqzz submitted a paper 2 days ago

Agentic AI Systems Should Be Designed as Marginal Token Allocators

zsqzz authored a paper 4 months ago

OpenTinker: Separating Concerns in Agentic Reinforcement Learning

zsqzz submitted a paper 4 months ago

OpenTinker: Separating Concerns in Agentic Reinforcement Learning

submitted a paper to Daily Papers 2 days ago

Agentic AI Systems Should Be Designed as Marginal Token Allocators

Paper • 2605.01214 • Published 6 days ago • 3

authored a paper 4 months ago

OpenTinker: Separating Concerns in Agentic Reinforcement Learning

Paper • 2601.07376 • Published Jan 12 • 7

submitted a paper to Daily Papers 4 months ago

OpenTinker: Separating Concerns in Agentic Reinforcement Learning

Paper • 2601.07376 • Published Jan 12 • 7

authored 2 papers 5 months ago

Benchmarking Scientific Understanding and Reasoning for Video Generation using VideoScience-Bench

Paper • 2512.02942 • Published Dec 2, 2025 • 5

Fast and Accurate Causal Parallel Decoding using Jacobi Forcing

Paper • 2512.14681 • Published Dec 16, 2025 • 42

submitted a paper to Daily Papers 5 months ago

Fast and Accurate Causal Parallel Decoding using Jacobi Forcing

Paper • 2512.14681 • Published Dec 16, 2025 • 42

authored a paper 6 months ago

Multi-Agent Evolve: LLM Self-Improve through Co-evolution

Paper • 2510.23595 • Published Oct 27, 2025 • 13

authored 7 papers 7 months ago

Redco: A Lightweight Tool to Automate Distributed Training of LLMs on Any GPU/TPUs

Paper • 2310.16355 • Published Oct 25, 2023

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Paper • 2306.05685 • Published Jun 9, 2023 • 43

Toward Inference-optimal Mixture-of-Expert Large Language Models

Paper • 2404.02852 • Published Apr 3, 2024

LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch

Paper • 2501.07124 • Published Jan 13, 2025

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Paper • 2506.14965 • Published Jun 17, 2025 • 50

K2-Think: A Parameter-Efficient Reasoning System

Paper • 2509.07604 • Published Sep 9, 2025 • 14

Efficient Long-context Language Model Training by Core Attention Disaggregation

Paper • 2510.18121 • Published Oct 20, 2025 • 124

authored a paper 7 months ago

Stronger Together: On-Policy Reinforcement Learning for Collaborative LLMs

Paper • 2510.11062 • Published Oct 13, 2025 • 29

authored a paper 7 months ago

GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare

Paper • 2510.08872 • Published Oct 10, 2025 • 4

authored a paper 9 months ago

Deep Think with Confidence

Paper • 2508.15260 • Published Aug 21, 2025 • 90

authored a paper 11 months ago

Scaling Speculative Decoding with Lookahead Reasoning

Paper • 2506.19830 • Published Jun 24, 2025 • 13

authored a paper 12 months ago

lmgame-Bench: How Good are LLMs at Playing Games?

Paper • 2505.15146 • Published May 21, 2025 • 20

authored a paper about 1 year ago

TrimLLM: Progressive Layer Dropping for Domain-Specific LLMs

Paper • 2412.11242 • Published Dec 15, 2024 • 1