Kaiyan Zhang

iseesaw

https://iseesaw.github.io/

AI & ML interests

Large Reasoning Models, Reinforcement Learning, Agent

Recent Activity

authored a paper about 3 hours ago

NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

Paper • 2606.24530 • Published 3 days ago • 53

updated a Space 1 day ago

FrontisAI/README

published a Space 1 day ago

FrontisAI/README

upvoted 2 papers 1 day ago

Qwen-AgentWorld: Language World Models for General Agents

Paper • 2606.24597 • Published 3 days ago • 105

NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

Paper • 2606.24530 • Published 3 days ago • 53

submitted a paper to Daily Papers 1 day ago

NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

Paper • 2606.24530 • Published 3 days ago • 53

authored 8 papers 2 days ago

Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models

Paper • 2509.26628 • Published Sep 30, 2025 • 17

From Perception to Cognition: A Survey of Vision-Language Interactive Reasoning in Multimodal Large Language Models

Paper • 2509.25373 • Published Sep 29, 2025

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Paper • 2512.16649 • Published Dec 18, 2025 • 31

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9 • 60

Emotion-Director: Bridging Affective Shortcut in Emotion-Oriented Image Generation

Paper • 2512.19479 • Published Dec 22, 2025 • 1

PRISMA: Reinforcement Learning Guided Two-Stage Policy Optimization in Multi-Agent Architecture for Open-Domain Multi-Hop Question Answering

Paper • 2601.05465 • Published Jan 9

MARS$^2$: Scaling Multi-Agent Tree Search via Reinforcement Learning for Code Generation

Paper • 2604.14564 • Published Apr 16 • 1

EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions

Paper • 2606.23654 • Published 4 days ago • 74

upvoted a paper 2 days ago

EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions

Paper • 2606.23654 • Published 4 days ago • 74

submitted a paper to Daily Papers 2 days ago

EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions

Paper • 2606.23654 • Published 4 days ago • 74

authored a paper about 1 month ago

Post-Trained MoE Can Skip Half Experts via Self-Distillation

Paper • 2605.18643 • Published May 18 • 30

upvoted a paper about 1 month ago

Post-Trained MoE Can Skip Half Experts via Self-Distillation

Paper • 2605.18643 • Published May 18 • 30

upvoted a paper 4 months ago

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9 • 60

upvoted an article 4 months ago

Forge: Scalable Agent RL Framework and Algorithm

Article • MiniMax-AI • Feb 13 • 155
