qingyang zhang

qingyangzhang

https://qingyangzhang.github.io

AI & ML interests

LLM Reasoning

Recent Activity

upvoted a paper about 2 months ago

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

updated a collection about 2 months ago

TEMPO

Organizations

None yet

upvoted a paper about 2 months ago

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

Paper • 2605.13301 • Published May 13 • 165

upvoted 2 papers 2 months ago

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows

Paper • 2604.28139 • Published Apr 30 • 42

TEMPO: Scaling Test-time Training for Large Reasoning Models

Paper • 2604.19295 • Published Apr 21 • 35

upvoted a paper 4 months ago

V-Bridge: Bridging Video Generative Priors to Versatile Few-shot Image Restoration

Paper • 2603.13089 • Published Mar 13 • 13

upvoted a paper 5 months ago

P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads

Paper • 2602.09443 • Published Feb 10 • 59

upvoted a paper 8 months ago

P1: Mastering Physics Olympiads with Reinforcement Learning

Paper • 2511.13612 • Published Nov 17, 2025 • 135

upvoted a paper 11 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 320

upvoted 7 papers about 1 year ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2, 2025 • 190

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30, 2025 • 146

SEED-GRPO: Semantic Entropy Enhanced GRPO for Uncertainty-Aware Policy Optimization

Paper • 2505.12346 • Published May 18, 2025 • 19

upvoted an article about 1 year ago

Train Reasoning Models without External Supervision

Article • qingyangzhang • May 18, 2025 • 1

upvoted a paper about 1 year ago

Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization

Paper • 2504.05812 • Published Apr 8, 2025 • 4

upvoted a collection about 1 year ago

EMPO

Collection • 8 items • Updated Mar 2 • 3
