Yuhao Zheng

yhzheng1031

https://scholar.google.com/citations?user=5aNNQhYAAAAJ&hl=zh-CN

AI & ML interests

MLLM, Agent, Reinforcement Learning

Recent Activity

upvoted a paper 3 days ago

From Scale to Speed: Adaptive Test-Time Scaling for Image Editing

updated a model 7 days ago

GD-ML/Code2World

upvoted a paper 7 days ago

MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios

Organizations

upvoted a paper 3 days ago

From Scale to Speed: Adaptive Test-Time Scaling for Image Editing

Paper • 2603.00141 • Published 10 days ago • 129

upvoted a paper 7 days ago

MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios

Paper • 2602.22638 • Published 9 days ago • 104

upvoted a paper 9 days ago

HyTRec: A Hybrid Temporal-Aware Attention Architecture for Long Behavior Sequential Recommendation

Paper • 2602.18283 • Published 14 days ago • 53

upvoted a paper 24 days ago

Code2World: A GUI World Model via Renderable Code Generation

Paper • 2602.09856 • Published 24 days ago • 198

upvoted a paper 29 days ago

FASA: Frequency-aware Sparse Attention

Paper • 2602.03152 • Published Feb 3 • 150

upvoted a paper about 1 month ago

Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation

Paper • 2601.20614 • Published Jan 28 • 120

upvoted 3 papers about 2 months ago

Urban Socio-Semantic Segmentation with Vision-Language Reasoning

Paper • 2601.10477 • Published Jan 15 • 155

Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization

Paper • 2601.05432 • Published Jan 8 • 168

Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation

Paper • 2512.24271 • Published Dec 30, 2025 • 63

upvoted a paper 3 months ago

InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners

Paper • 2504.14239 • Published Apr 19, 2025 • 14

upvoted a collection 3 months ago

InfiGUI: Advanced Vision-Native Agent for GUI Interaction

Collection

7 items • Updated Oct 15, 2025 • 1

upvoted a paper 3 months ago

Computer-Use Agents as Judges for Generative User Interface

Paper • 2511.15567 • Published Nov 19, 2025 • 53

upvoted 2 papers 4 months ago

Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking

Paper • 2505.12667 • Published May 19, 2025 • 9

VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation

Paper • 2511.02778 • Published Nov 4, 2025 • 102

