Guangnian Wan PRO

bigglesworthnotcat

AI & ML interests

None yet

Recent Activity

upvoted a paper 5 days ago

ViMU: Benchmarking Video Metaphorical Understanding

Paper • 2605.14607 • Published 6 days ago • 12

upvoted a paper 7 days ago

On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment

Paper • 2605.11882 • Published 8 days ago • 16

upvoted a paper 19 days ago

Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling

Paper • 2604.28185 • Published 20 days ago • 90

upvoted a paper 22 days ago

Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms

Paper • 2604.23775 • Published 24 days ago • 45

upvoted a paper about 1 month ago

DMax: Aggressive Parallel Decoding for dLLMs

Paper • 2604.08302 • Published Apr 9 • 52

upvoted 3 papers about 2 months ago

Gated Condition Injection without Multimodal Attention: Towards Controllable Linear-Attention Transformers

Paper • 2603.27666 • Published Mar 29 • 18

Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps

Paper • 2505.18675 • Published May 24, 2025 • 27

Make Geometry Matter for Spatial Reasoning

Paper • 2603.26639 • Published Mar 27 • 32

upvoted 2 papers 2 months ago

Anatomy of a Lie: A Multi-Stage Diagnostic Framework for Tracing Hallucinations in Vision-Language Models

Paper • 2603.15557 • Published Mar 16 • 29

ViFeEdit: A Video-Free Tuner of Your Video Diffusion Transformer

Paper • 2603.15478 • Published Mar 16 • 24

upvoted a paper 3 months ago

dVoting: Fast Voting for dLLMs

Paper • 2602.12153 • Published Feb 12 • 22

upvoted 2 papers 4 months ago

LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published Jan 6 • 178

Video models are zero-shot learners and reasoners

Paper • 2509.20328 • Published Sep 24, 2025 • 100

upvoted a paper 5 months ago

WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion

Paper • 2512.19678 • Published Dec 22, 2025 • 32

upvoted 2 papers 6 months ago

Vision Bridge Transformer at Scale

Paper • 2511.23199 • Published Nov 28, 2025 • 46

In-Video Instructions: Visual Signals as Generative Control

Paper • 2511.19401 • Published Nov 24, 2025 • 32

upvoted a paper 7 months ago

MixReasoning: Switching Modes to Think

Paper • 2510.06052 • Published Oct 7, 2025 • 23

upvoted a paper 11 months ago

Discrete Diffusion in Large Language and Multimodal Models: A Survey

Paper • 2506.13759 • Published Jun 16, 2025 • 43

upvoted 2 papers 12 months ago

Image Editing As Programs with Diffusion Models

Paper • 2506.04158 • Published Jun 4, 2025 • 24

VeriThinker: Learning to Verify Makes Reasoning Model Efficient

Paper • 2505.17941 • Published May 23, 2025 • 25
