---
library_name: diffusers
pipeline_tag: image-text-to-text
---
This repository contains the model described in the paper Show-o Turbo: Towards Accelerated Unified Multimodal Understanding and Generation.
To run text-to-image inference with the 512-resolution model:
sh inference_t2i_512.sh
To run text-to-image inference with the 256-resolution model:
sh inference_t2i_256.sh
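If you switch between the two resolutions often, a small wrapper can select the script for you. The following is a minimal sketch, not part of this repository: the function names `pick_script` and `run_inference` are hypothetical, and it assumes you run it from the directory that contains `inference_t2i_512.sh` and `inference_t2i_256.sh`.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the repository): map a resolution
# to the name of the matching inference script named in this README.
pick_script() {
    case "$1" in
        256|512)
            echo "inference_t2i_$1.sh"
            ;;
        *)
            echo "unsupported resolution: $1 (expected 256 or 512)" >&2
            return 1
            ;;
    esac
}

# Hypothetical helper: run the selected script if it exists in the
# current directory (assumed to be the repository root).
run_inference() {
    script=$(pick_script "$1") || return 1
    if [ ! -f "$script" ]; then
        echo "script not found: $script" >&2
        return 1
    fi
    sh "$script"
}
```

After sourcing the file, `run_inference 512` runs the 512-resolution script and `run_inference 256` runs the 256-resolution one; any other value is rejected with an error message.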