29 4

zqyz

zqyz333

AI & ML interests

None yet

Recent Activity

upvoted a paper about 13 hours ago

ComBench: A Benchmark for Rigorous Proof Reasoning and Constructive Realization in Olympiad-Level Combinatorics

upvoted a paper 2 days ago

ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research

upvoted a paper 10 days ago

Draft-OPD: On-Policy Distillation for Speculative Draft Models

Organizations

None yet

upvoted a paper about 13 hours ago

ComBench: A Benchmark for Rigorous Proof Reasoning and Constructive Realization in Olympiad-Level Combinatorics

Paper • 2606.10479 • Published 3 days ago • 18

upvoted a paper 2 days ago

ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research

Paper • 2606.07591 • Published 15 days ago • 85

upvoted 2 papers 10 days ago

Draft-OPD: On-Policy Distillation for Speculative Draft Models

Paper • 2605.29343 • Published 15 days ago • 34

COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation

Paper • 2605.31264 • Published 14 days ago • 111

upvoted 2 papers 21 days ago

π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows

Paper • 2605.14678 • Published 24 days ago • 105

ACC: Compiling Agent Trajectories for Long-Context Training

Paper • 2605.21850 • Published 22 days ago • 59

upvoted a paper 2 months ago

MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

Paper • 2604.04771 • Published Apr 6 • 123

upvoted 4 papers 3 months ago

upvoted 2 papers 5 months ago

TL-GRPO: Turn-Level RL for Reasoning-Guided Iterative Optimization

Paper • 2601.16480 • Published Jan 23 • 50

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models

Paper • 2601.14004 • Published Jan 20 • 49

liked a dataset 5 months ago

lighteval/MATH-Hard

Viewer • Updated Jun 12, 2024 • 7.26k • 3.76k • 24

upvoted a paper 6 months ago

Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows

Paper • 2512.16969 • Published Dec 18, 2025 • 120

upvoted a collection 6 months ago

SGI-Bench

Collection

Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows • 12 items • Updated May 6 • 33

upvoted 2 papers 6 months ago

Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning

Paper • 2512.10534 • Published Dec 11, 2025 • 33

OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification

Paper • 2512.10756 • Published Dec 11, 2025 • 35

upvoted a paper 7 months ago

GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization

Paper • 2511.15705 • Published Nov 19, 2025 • 98

upvoted a paper 8 months ago

PICABench: How Far Are We from Physically Realistic Image Editing?

Paper • 2510.17681 • Published Oct 20, 2025 • 65
