---
library_name: diffusers
pipeline_tag: image-text-to-text
---

This repository contains the model described in the paper [Show-o Turbo: Towards Accelerated Unified Multimodal Understanding and Generation](https://arxiv.org/abs/2502.05415).

To get started with the 512-resolution model:

```bash
sh inference_t2i_512.sh
```

To get started with the 256-resolution model:

```bash
sh inference_t2i_256.sh
```
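
The two commands above differ only in the resolution suffix of the script name. As a minimal sketch, a small wrapper could select the script from a resolution argument. The function name `pick_script` is hypothetical and not part of this repository; the sketch assumes only the two script names shown above and that it is run from the directory containing them.

```bash
# Hypothetical helper (an assumption, not shipped with the repository):
# maps a resolution to the matching inference script named in this README.
pick_script() {
  case "$1" in
    256|512)
      # Only the two resolutions documented above are accepted.
      echo "inference_t2i_$1.sh"
      ;;
    *)
      echo "unsupported resolution: $1 (expected 256 or 512)" >&2
      return 1
      ;;
  esac
}
```

With the function defined, `script=$(pick_script 512) && sh "$script"` would run the 512-resolution script, and an unsupported value fails before anything is launched.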