Yuseung "Phillip" Lee

phillipinseoul

https://phillipinseoul.github.io/

AI & ML interests

Computer Vision

upvoted 3 papers 29 days ago

SketchVLM: Vision language models can annotate images to explain thoughts and guide users

Paper • 2604.22875 • Published Apr 23 • 35

World-R1: Reinforcing 3D Constraints for Text-to-Video Generation

Paper • 2604.24764 • Published 30 days ago • 118

ReVSI: Rebuilding Visual Spatial Intelligence Evaluation for Accurate Assessment of VLM 3D Reasoning

Paper • 2604.24300 • Published 30 days ago • 67

upvoted 13 papers about 1 month ago

Exploring Spatial Intelligence from a Generative Perspective

Paper • 2604.20570 • Published Apr 22 • 22

(1D) Ordered Tokens Enable Efficient Test-Time Search

Paper • 2604.15453 • Published Apr 16 • 18

MultiWorld: Scalable Multi-Agent Multi-View Video World Models

Paper • 2604.18564 • Published Apr 20 • 46

OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation

Paper • 2604.18486 • Published Apr 20 • 95

Learning Adaptive Reasoning Paths for Efficient Visual Reasoning

Paper • 2604.14568 • Published Apr 16 • 10

Qwen3.5-Omni Technical Report

Paper • 2604.15804 • Published Apr 17 • 59

SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments

Paper • 2604.14144 • Published Apr 15 • 63

You Only Judge Once: Multi-response Reward Modeling in a Single Forward Pass

Paper • 2604.10966 • Published Apr 13 • 12

Lyra 2.0: Explorable Generative 3D Worlds

Paper • 2604.13036 • Published Apr 14 • 41

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Paper • 2604.10098 • Published Apr 11 • 82

SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

Paper • 2604.08377 • Published Apr 9 • 291

VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images

Paper • 2604.09531 • Published Apr 10 • 8

WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published Apr 9 • 247

upvoted 4 papers about 2 months ago

OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence

Paper • 2604.07296 • Published Apr 8 • 40

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

Paper • 2604.06628 • Published Apr 8 • 326

HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents

Paper • 2604.07430 • Published Apr 8 • 189

RAGEN-2: Reasoning Collapse in Agentic RL

Paper • 2604.06268 • Published Apr 7 • 67