YUYUNYAO

Yyy195

AI & ML interests

None yet

Organizations

None yet

Recent Activity
upvoted a paper 13 days ago

WeaveBench: A Long-Horizon, Real-World Benchmark for Computer-Use Agents with Hybrid Interfaces

Paper • 2606.09426 • Published 17 days ago • 102

upvoted a paper 3 months ago

When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning

Paper • 2603.21289 • Published Mar 22 • 35

upvoted 2 papers about 1 year ago

MMR-V: What's Left Unsaid? A Benchmark for Multimodal Deep Reasoning in Videos

Paper • 2506.04141 • Published Jun 4, 2025 • 31

Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis

Paper • 2506.04142 • Published Jun 4, 2025 • 28

liked a dataset about 1 year ago

JokerJan/MMR-VBench

Updated Jul 1, 2025 • 1.26k • 1.01k • 17