---
library_name: diffusers
pipeline_tag: image-text-to-text
---

This repository contains the model described in the paper [Show-o Turbo: Towards Accelerated Unified Multimodal Understanding and Generation](https://arxiv.org/abs/2502.05415).

To get started with the 512-resolution model:

```bash
sh inference_t2i_512.sh
```

To get started with the 256-resolution model:

```bash
sh inference_t2i_256.sh
```
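
The two commands differ only in the resolution embedded in the script name, so a small wrapper can select between them. The following is a minimal sketch, not part of the repository: the function names `t2i_script_for` and `run_t2i` are hypothetical, and it assumes both scripts sit in the current working directory.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): maps a resolution to
# the inference script name given in this README.
t2i_script_for() {
  case "$1" in
    256|512) printf 'inference_t2i_%s.sh\n' "$1" ;;
    *) echo "unsupported resolution: $1 (expected 256 or 512)" >&2
       return 1 ;;
  esac
}

# Hypothetical helper: runs the matching script, assuming it exists in
# the current directory. Fails early with a message if it does not.
run_t2i() {
  local script
  script="$(t2i_script_for "$1")" || return 1
  if [ ! -f "$script" ]; then
    echo "missing $script; run this from the directory that contains it" >&2
    return 1
  fi
  sh "$script"
}
```

After sourcing these definitions, `run_t2i 512` would be equivalent to `sh inference_t2i_512.sh`, and `run_t2i 256` to `sh inference_t2i_256.sh`.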