Ziheng Zhou

josephziheng

AI & ML interests

None yet

Recent Activity

upvoted a paper about 1 month ago

Less is More: Early Stopping Rollout for On-Policy Distillation

submitted a paper about 1 month ago

Less is More: Early Stopping Rollout for On-Policy Distillation

reacted to salma-remyx's post with 🔥 2 months ago

SciCrafter measured something AI practitioners have intuited: frontier agents are improving at executing inside well-framed problems, but lag at framing the problem in the first place.

GPT-5.2, Gemini-3-Pro, and Claude Opus 4.5 all plateaued near 26% on a new Minecraft benchmark for probing AI capabilities in the discovery-to-application loop. So the authors ran targeted interventions:

* Hints about what to investigate doubled performance.
* A structured experimentation template added 7-14 more points.
* Structured consolidation beat free-form summaries by 6 points.
* Curriculum context beat independent task-solving.

These interventions helped the agent frame what’s worth investigating, and structure what gets learned so it compounds. The bottleneck for AI in scientific workflows is upstream of execution.

Their findings are congruent with the design patterns we've adopted at Remyx AI to help AI teams close the development loop scientifically. Agents work well inside structured loops, but they perform poorly when tasked with creating the structure. Instrumenting your scientific workflows offers greater leverage than scaling compute with a less informed search.

In the work of building production AI systems, teams are flying through execution. The bigger challenge is identifying which experiments moved which production outcome, or what to try next.

One of the more interesting results I found this week by tracking work in AI for scientific workflows using Remyx: https://engine.remyx.ai/papers/d8f23b9b-b14b-4ada-b44e-ccfc221c06b4

Organizations

None yet

upvoted a paper about 1 month ago

Less is More: Early Stopping Rollout for On-Policy Distillation

Paper • 2605.27028 • Published May 26 • 15

upvoted a paper 2 months ago

Can Current Agents Close the Discovery-to-Application Gap? A Case Study in Minecraft

Paper • 2604.24697 • Published Apr 27 • 2

upvoted a paper 9 months ago

Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation

Paper • 2509.23866 • Published Sep 28, 2025 • 14

upvoted a paper 11 months ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7, 2025 • 190
