naver-clova-ix/donut-base
Image-to-Text • Updated • 176k • 253
Audio Conditioned LipSync with Latent Diffusion Models
Generate consistent image sequences from text and photos
Import a portrait, click to move the head!
Edit images with scribble‑based color and edge control
Line Art Colorization with Precise Reference Following
Restore black-and-white photos to color
Track, rank and evaluate open LLMs and chatbots
Explore and submit LLM benchmarks
Transcribe audio files into text instantly