Hao Li

Richardleee

29

·

AI & ML interests

None yet

Recent Activity

upvoted a paper 24 days ago

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

upvoted a paper 24 days ago

VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding

upvoted a paper about 2 months ago

OpenSeeker-v2: Pushing the Limits of Search Agents with Informative and High-Difficulty Trajectories

View all activity

Organizations

upvoted 2 papers 24 days ago

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

Paper • 2606.04923 • Published 27 days ago • 40

VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding

Paper • 2606.05259 • Published 27 days ago • 39

upvoted 3 papers about 2 months ago

OpenSeeker-v2: Pushing the Limits of Search Agents with Informative and High-Difficulty Trajectories

Paper • 2605.04036 • Published May 5 • 72

ARIS: Autonomous Research via Adversarial Multi-Agent Collaboration

Paper • 2605.03042 • Published May 4 • 140

Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems

Paper • 2605.04018 • Published May 5 • 41

upvoted a paper 5 months ago

SAGE: Benchmarking and Improving Retrieval for Deep Research Agents

Paper • 2602.05975 • Published Feb 5 • 12

upvoted a paper 11 months ago

CellForge: Agentic Design of Virtual Cell Models

Paper • 2508.02276 • Published Aug 4, 2025 • 40

upvoted 2 papers 12 months ago

Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving

Paper • 2507.06229 • Published Jul 8, 2025 • 78

SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks

Paper • 2507.01001 • Published Jul 1, 2025 • 47

upvoted 11 papers about 1 year ago

Can LLMs Generate High-Quality Test Cases for Algorithm Problems? TestCase-Eval: A Systematic Evaluation of Fault Coverage and Exposure

Paper • 2506.12278 • Published Jun 13, 2025 • 16

The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason

Paper • 2505.22653 • Published May 28, 2025 • 43

ZeroGUI: Automating Online GUI Learning at Zero Human Cost

Paper • 2505.23762 • Published May 29, 2025 • 45

Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

Paper • 2505.23747 • Published May 29, 2025 • 69

SWE-bench Goes Live!

Paper • 2505.23419 • Published May 29, 2025 • 21

Scaling Law for Quantization-Aware Training

Paper • 2505.14302 • Published May 20, 2025 • 79

Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Paper • 2505.15045 • Published May 21, 2025 • 56

Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1

Paper • 2503.24376 • Published Mar 31, 2025 • 38

Any2Caption:Interpreting Any Condition to Caption for Controllable Video Generation

Paper • 2503.24379 • Published Mar 31, 2025 • 76

Z1: Efficient Test-time Scaling with Code

Paper • 2504.00810 • Published Apr 1, 2025 • 27

Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation

Paper • 2503.22675 • Published Mar 28, 2025 • 36