Open to Work

Yiming Zhao

gaotiexinqu

AI & ML interests

VLMs, Agent, RL, Reasoning

Recent Activity

upvoted a paper 9 minutes ago

OmniNFT: Modality-wise Omni Diffusion Reinforcement for Joint Audio-Video Generation

Paper • 2605.12480 • Published 2 days ago • 4

authored 2 papers about 24 hours ago

Flow-OPD: On-Policy Distillation for Flow Matching Models

Paper • 2605.08063 • Published 6 days ago • 88

SCOPE: Structured Decomposition and Conditional Skill Orchestration for Complex Image Generation

Paper • 2605.08043 • Published 6 days ago • 9

upvoted 2 papers 3 days ago

SCOPE: Structured Decomposition and Conditional Skill Orchestration for Complex Image Generation

Paper • 2605.08043 • Published 6 days ago • 9

Flow-OPD: On-Policy Distillation for Flow Matching Models

Paper • 2605.08063 • Published 6 days ago • 88

authored a paper 23 days ago

SkillFlow: Benchmarking Lifelong Skill Discovery and Evolution for Autonomous Agents

Paper • 2604.17308 • Published 25 days ago • 22

liked a dataset 23 days ago

zhang-ziao/SkillFlow-Task

Updated 23 days ago • 734 • 4

upvoted a paper 23 days ago

SkillFlow: Benchmarking Lifelong Skill Discovery and Evolution for Autonomous Agents

Paper • 2604.17308 • Published 25 days ago • 22

liked a Space 3 months ago

Image Arena Leaderboard

📊

596

Image Generation and Image Editing Arena & Leaderboard

updated a dataset 3 months ago

gaotiexinqu/V2P-Bench

Viewer • Updated Feb 5 • 1.17k • 49 • 2

upvoted a paper 3 months ago

Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

Paper • 2601.22060 • Published Jan 29 • 155

authored 4 papers 3 months ago

VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning

Paper • 2504.07956 • Published Apr 10, 2025 • 46

V2P-Bench: Evaluating Video-Language Understanding with Visual Prompts for Better Human-Model Interaction

Paper • 2503.17736 • Published Mar 22, 2025 • 3

Agentic Jigsaw Interaction Learning for Enhancing Visual Perception and Reasoning in Vision-Language Models

Paper • 2510.01304 • Published Oct 1, 2025 • 11

Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models

Paper • 2602.02185 • Published Feb 2 • 118

upvoted 2 papers 3 months ago

Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models

Paper • 2602.02185 • Published Feb 2 • 118

V2P-Bench: Evaluating Video-Language Understanding with Visual Prompts for Better Human-Model Interaction

Paper • 2503.17736 • Published Mar 22, 2025 • 3

updated a dataset 5 months ago

gaotiexinqu/Linguistic_Prior

Updated Dec 29, 2025 • 25

published a dataset 5 months ago

gaotiexinqu/Linguistic_Prior

Updated Dec 29, 2025 • 25

updated a dataset 10 months ago

gaotiexinqu/Temporal_RL

Updated Jul 17, 2025 • 66
