Jianzong Wu

jianzongwu

https://jianzongwu.github.io

AI & ML interests

Multimodal Learning

Organizations

None yet

upvoted 2 papers 6 days ago

Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation

Paper • 2606.26907 • Published 7 days ago • 48

DanceOPD: On-Policy Generative Field Distillation

Paper • 2606.27377 • Published 7 days ago • 80

upvoted a paper 7 days ago

Wan-Streamer v0.1: End-to-end Real-time Interactive Foundation Models

Paper • 2606.25041 • Published 9 days ago • 111

upvoted a paper 10 days ago

PerceptionDLM: Parallel Region Perception with Multimodal Diffusion Language Models

Paper • 2606.19534 • Published 15 days ago • 64

upvoted a paper 27 days ago

LoomVideo: Unifying Multimodal Inputs into Video Generation and Editing

Paper • 2606.06042 • Published 28 days ago • 24

submitted a paper to Daily Papers 27 days ago

LoomVideo: Unifying Multimodal Inputs into Video Generation and Editing

Paper • 2606.06042 • Published 28 days ago • 24

upvoted a paper about 1 month ago

Towards Customized Multimodal Role-Play

Paper • 2605.08129 • Published May 1 • 10

upvoted a paper 2 months ago

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Paper • 2604.24763 • Published Apr 27 • 71

upvoted a paper 3 months ago

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published Apr 15 • 168

upvoted 2 papers 4 months ago

Kiwi-Edit: Versatile Video Editing via Instruction and Reference Guidance

Paper • 2603.02175 • Published Mar 2 • 24

Enhancing Spatial Understanding in Image Generation via Reward Modeling

Paper • 2602.24233 • Published Feb 27 • 60

upvoted 5 papers 5 months ago

Generation Enhances Understanding in Unified Multimodal Models via Multi-Representation Generation

Paper • 2601.21406 • Published Jan 29 • 6

Advancing Open-source World Models

Paper • 2601.20540 • Published Jan 28 • 135

Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs

Paper • 2601.17058 • Published Jan 22 • 190

Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders

Paper • 2601.16208 • Published Jan 22 • 55

SAMTok: Representing Any Mask with Two Words

Paper • 2601.16093 • Published Jan 22 • 44

upvoted a paper 6 months ago

PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation

Paper • 2512.24551 • Published Dec 31, 2025 • 21

upvoted 2 papers 7 months ago

DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation

Paper • 2512.05112 • Published Dec 4, 2025 • 13

Does Hearing Help Seeing? Investigating Audio-Video Joint Denoising for Video Generation

Paper • 2512.02457 • Published Dec 2, 2025 • 14

commented on a paper 7 months ago

Does Hearing Help Seeing? Investigating Audio-Video Joint Denoising for Video Generation

Paper • 2512.02457 • Published Dec 2, 2025 • 14
