Wentian Zhao

zwt123home123

zhaowt615@gmail.com

AI & ML interests

None yet

Recent Activity

upvoted a paper about 13 hours ago

From RLVR to RLSVR: Task Transformation Induces Self-Verifiable Rewards for Open-Ended LLM Self-Improvement

Paper • 2607.23802 • Published 9 days ago • 65

upvoted a paper 10 days ago

Visual Contrastive Self-Distillation

Paper • 2607.21556 • Published 12 days ago • 51

upvoted a paper 2 months ago

When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs

Paper • 2605.24202 • Published May 22 • 17

updated a model 5 months ago

SpyRL/SpyRL-Qwen3-4B-Summarization

Text Generation • 4B • Updated 8 days ago • 22

published a model 5 months ago

SpyRL/SpyRL-Qwen3-4B-Summarization

Text Generation • 4B • Updated 8 days ago • 22

updated a model 5 months ago

SpyRL/SpyRL-Qwen3-4B-Writing

Text Generation • 4B • Updated 8 days ago • 30

published a model 5 months ago

SpyRL/SpyRL-Qwen3-4B-Writing

Text Generation • 4B • Updated 8 days ago • 30

updated a model 5 months ago

SpyRL/SpyRL-Qwen3-8B-Math

Text Generation • 8B • Updated 8 days ago • 23

published a model 5 months ago

SpyRL/SpyRL-Qwen3-8B-Math

Text Generation • 8B • Updated 8 days ago • 23

updated a model 5 months ago

SpyRL/SpyRL-Qwen3-4B-Math

Text Generation • 4B • Updated 8 days ago • 21

published a model 5 months ago

SpyRL/SpyRL-Qwen3-4B-Math

Text Generation • 4B • Updated 8 days ago • 21

upvoted a paper 5 months ago

MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data

Paper • 2603.09206 • Published Mar 10 • 54

upvoted 2 papers 6 months ago

What does RL improve for Visual Reasoning? A Frankenstein-Style Analysis

Paper • 2602.12395 • Published Feb 12 • 17

Quantifying the Gap between Understanding and Generation within Unified Multimodal Models

Paper • 2602.02140 • Published Feb 2 • 12

upvoted 2 papers 7 months ago

Schoenfeld's Anatomy of Mathematical Reasoning by Language Models

Paper • 2512.19995 • Published Dec 23, 2025 • 16

Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction

Paper • 2512.18880 • Published Dec 21, 2025 • 25

upvoted a paper 9 months ago

Routing Manifold Alignment Improves Generalization of Mixture-of-Experts LLMs

Paper • 2511.07419 • Published Nov 10, 2025 • 27

upvoted 2 papers 10 months ago

Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play

Paper • 2509.25541 • Published Sep 29, 2025 • 142

EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning

Paper • 2509.22576 • Published Sep 26, 2025 • 137

upvoted a paper about 1 year ago

Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs

Paper • 2507.07996 • Published Jul 10, 2025 • 36
