Yuxuan Zhang

Reacherx

AI & ML interests

None yet

Recent Activity

authored a paper 7 days ago

Learning from the Self-future: On-policy Self-distillation for dLLMs

upvoted a paper 8 days ago

Learning from the Self-future: On-policy Self-distillation for dLLMs

upvoted a paper 8 days ago

LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling

Organizations

upvoted 2 papers 8 days ago

Learning from the Self-future: On-policy Self-distillation for dLLMs

Paper • 2606.18195 • Published 10 days ago • 74

LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling

Paper • 2606.18023 • Published 10 days ago • 204

upvoted a paper 13 days ago

Where, What, Why, and Importance: Structured Defect Grounding for Text-to-Image Feedback

Paper • 2606.06113 • Published 22 days ago • 15

upvoted a paper 17 days ago

OpenSkill: Open-World Self-Evolution for LLM Agents

Paper • 2606.06741 • Published 22 days ago • 28

upvoted a paper 22 days ago

MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection

Paper • 2605.30288 • Published 28 days ago • 23

upvoted 2 papers 3 months ago

ClawBench: Can AI Agents Complete Everyday Online Tasks?

Paper • 2604.08523 • Published Apr 9 • 265

Watch Before You Answer: Learning from Visually Grounded Post-Training

Paper • 2604.05117 • Published Apr 6 • 36

upvoted 2 papers 4 months ago

BabyVision: Visual Reasoning Beyond Language

Paper • 2601.06521 • Published Jan 10 • 201

PaperBanana: Automating Academic Illustration for AI Scientists

Paper • 2601.23265 • Published Jan 30 • 229

upvoted 2 papers 8 months ago

Cambrian-S: Towards Spatial Supersensing in Video

Paper • 2511.04670 • Published Nov 6, 2025 • 40

BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions

Paper • 2510.10666 • Published Oct 12, 2025 • 29

upvoted 3 papers 9 months ago

UniVideo: Unified Understanding, Generation, and Editing for Videos

Paper • 2510.08377 • Published Oct 9, 2025 • 81

In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7, 2025 • 112

Dr. Bench: A Multidimensional Evaluation for Deep Research Agents, from Answers to Reports

Paper • 2510.02190 • Published Jan 29 • 20

upvoted an article 9 months ago

Putting RL back in RLHF

Article • vwxyzjn, ArashAhmadian • Jun 12, 2024 • 111

upvoted 2 papers 9 months ago

VideoScore2: Think before You Score in Generative Video Evaluation

Paper • 2509.22799 • Published Sep 26, 2025 • 26

Critique-Coder: Enhancing Coder Models by Critique Reinforcement Learning

Paper • 2509.22824 • Published Sep 26, 2025 • 21

upvoted 2 papers 10 months ago

Reverse-Engineered Reasoning for Open-Ended Generation

Paper • 2509.06160 • Published Sep 7, 2025 • 151

VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published Sep 1, 2025 • 81

upvoted a paper 11 months ago

Are Reasoning Models More Prone to Hallucination?

Paper • 2505.23646 • Published May 29, 2025 • 24
