Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL Paper • 2604.28123 • Published 7 days ago • 40
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation Paper • 2604.24763 • Published 11 days ago • 68
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling Paper • 2604.28185 • Published 8 days ago • 86
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time Paper • 2604.11626 • Published 25 days ago • 101
VecGlypher: Unified Vector Glyph Generation with Language Models Paper • 2602.21461 • Published Feb 25 • 12
ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks Paper • 2603.27862 • Published Mar 29 • 31
Visual-Aware CoT: Achieving High-Fidelity Visual Consistency in Unified Models Paper • 2512.19686 • Published Dec 22, 2025
VisPhyWorld: Probing Physical Reasoning via Code-Driven Video Reconstruction Paper • 2602.13294 • Published Feb 9 • 13
Context Forcing: Consistent Autoregressive Video Generation with Long Context Paper • 2602.06028 • Published Feb 5 • 36
OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory Paper • 2512.07802 • Published Dec 8, 2025 • 46
HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming Paper • 2512.21338 • Published Dec 24, 2025 • 23
DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation Paper • 2601.09688 • Published Jan 14 • 127
TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models Paper • 2512.02014 • Published Dec 1, 2025 • 75
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling Paper • 2511.20785 • Published Nov 25, 2025 • 188
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing Paper • 2509.26346 • Published Sep 30, 2025 • 19
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published Nov 20, 2025 • 96