fancyfeast/llama-joycaption-beta-one-hf-llava
Image-Text-to-Text • 8B • Updated • 57.1k • 338
Fast video generation from images & text
Generate detailed captions for any image
Generate captions for images using text prompts
Generate synchronized audio from video or text prompts
Generate a depth map from your photo
Generate creative prompts for Stable Diffusion images
Generate detailed AI prompts and tags from an image
A unified multimodal understanding and generation model.
Launch an interactive demo interface
Chat with Gemini 2.5 to get detailed responses