Hongje Seong

hongjeseong

22

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 3 days ago

SAM2Matting: Generalized Image and Video Matting

Paper • 2606.27339 • Published 8 days ago • 7

upvoted a paper 21 days ago

TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction

Paper • 2605.26115 • Published May 25 • 52

upvoted 2 papers 28 days ago

GenClaw: Code-Driven Agentic Image Generation

Paper • 2605.30248 • Published May 28 • 40

Cosmos 3: Omnimodal World Models for Physical AI

Paper • 2606.02800 • Published Jun 1 • 138

upvoted 10 papers about 1 month ago

VLM3: Vision Language Models Are Native 3D Learners

Paper • 2605.30561 • Published May 28 • 26

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Paper • 2605.30280 • Published May 28 • 146

Geometry-Aware Representation Denoising for Robust Multi-view 3D Reconstruction

Paper • 2605.26230 • Published May 25 • 41

SpatialBench: Is Your Spatial Foundation Model an All-Round Player?

Paper • 2605.27367 • Published May 26 • 72

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding

Paper • 2605.27365 • Published May 26 • 145

CubePart: An Open-Vocabulary Part-Controllable 3D Generator

Paper • 2605.28763 • Published May 27 • 14

Fast-dDrive: Efficient Block-Diffusion VLM for Autonomous Driving

Paper • 2605.23163 • Published May 25 • 17

GEM: Generative Supervision Helps Embodied Intelligence

Paper • 2605.28548 • Published May 27 • 32

Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models

Paper • 2605.21573 • Published May 20 • 111

VGGT-Ω

Paper • 2605.15195 • Published May 14 • 3

upvoted 2 papers about 2 months ago

Asymmetric Flow Models

Paper • 2605.12964 • Published May 13 • 22

RLDX-1 Technical Report

Paper • 2605.03269 • Published May 5 • 126

upvoted 4 papers 2 months ago

Image Generators are Generalist Vision Learners

Paper • 2604.20329 • Published Apr 22 • 22

Vista4D: Video Reshooting with 4D Point Clouds

Paper • 2604.21915 • Published Apr 23 • 12

Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation

Paper • 2604.18168 • Published Apr 20 • 96

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds

Paper • 2604.14268 • Published Apr 15 • 127