🔄 In a Training Loop

snowflakewang

SnowflakeWang

Recent Activity

upvoted a paper 6 days ago

ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing?

Paper • 2606.19531 • Published 8 days ago • 18

upvoted 2 papers 9 days ago

DreamX-World 1.0: A General-Purpose Interactive World Model

Paper • 2606.16993 • Published 10 days ago • 110

JoyAI-VL-Interaction: Real-Time Vision-Language Interaction Intelligence

Paper • 2606.14777 • Published 15 days ago • 200

upvoted a paper 13 days ago

InterleaveThinker: Reinforcing Agentic Interleaved Generation

Paper • 2606.13679 • Published 14 days ago • 80

upvoted a paper 21 days ago

Qwen-Image-Flash: Beyond Objective Design

Paper • 2606.03746 • Published 23 days ago • 36

upvoted 6 papers about 1 month ago

Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models

Paper • 2605.21573 • Published May 20 • 111

Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining

Paper • 2605.14747 • Published May 14 • 147

Lance: Unified Multimodal Modeling by Multi-Task Synergy

Paper • 2605.18678 • Published May 18 • 79

InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation

Paper • 2605.14333 • Published May 14 • 35

Qwen-Image-VAE-2.0 Technical Report

Paper • 2605.13565 • Published May 13 • 62

Qwen-Image-2.0 Technical Report

Paper • 2605.10730 • Published May 11 • 114

upvoted 3 papers 2 months ago

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds

Paper • 2604.14268 • Published Apr 15 • 127

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published Apr 15 • 166

Strips as Tokens: Artist Mesh Generation with Native UV Segmentation

Paper • 2604.09132 • Published Apr 10 • 56

upvoted 3 papers 3 months ago

Omni123: Exploring 3D Native Foundation Models with Limited 3D Data by Unifying Text to 2D and 3D Generation

Paper • 2604.02289 • Published Apr 2 • 15

VOID: Video Object and Interaction Deletion

Paper • 2604.02296 • Published Apr 2 • 56

WorldCam: Interactive Autoregressive 3D Gaming Worlds with Camera Pose as a Unifying Geometric Representation

Paper • 2603.16871 • Published Mar 17 • 61

upvoted 3 papers 4 months ago

HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images

Paper • 2603.02210 • Published Mar 2 • 30

Code2World: A GUI World Model via Renderable Code Generation

Paper • 2602.09856 • Published Feb 10 • 201

Agent Banana: High-Fidelity Image Editing with Agentic Thinking and Tooling

Paper • 2602.09084 • Published Feb 9 • 30