A Mechanistic View on Video Generation as World Models: State and Dynamics
Paper: 2601.17067
The Kling Team is building next-generation multimodal world models across video, audio, text, 3D, and beyond. We are continuously looking for exceptional talent to join us. Feel free to reach out!
CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation
Klear: Unified Multi-Task Audio-Video Joint Generation