---
license: mit
language:
- en
metrics:
- T2I-Compbench
- GenEval
- PickScore
- AES
- ImageReward
- HPSV2
new_version: v0.1
pipeline_tag: text-to-image
library_name: diffusers
tags:
- inference-enhanced algorithm
- efficiency
- effectiveness
- generalization
- weak-to-strong guidance
---

# The Official Implementation of Our arXiv 2025 Paper:

> **[CoRe^2: _Collect, Reflect and Refine_ to Generate Better and Faster](https://arxiv.org/abs/2503.09662)** <br>

Authors:

>**<em>Shitong Shao, Zikai Zhou, Dian Xie, Yuetong Fang, Tian Ye, Lichen Bai</em> and <em>Zeke Xie*</em>** <br>
> xLeaf Lab, HKUST (GZ) <br>
> *: Corresponding author

## New

- [x] Release the inference code of SD3.5 and SDXL.

- [ ] Release the inference code of FLUX.

- [ ] Release the inference code of LlamaGen.

- [ ] Release the implementation of the Collect phase.

- [ ] Release the implementation of the Reflect phase.


## Overview

This guide explains how to use CoRe^2.

We provide inference code that supports multiple models, including ***Stable Diffusion XL*** and ***Stable Diffusion 3.5 Large***.

## Requirements

- `python == 3.8`
- `pytorch` (CUDA build)
- `diffusers`
- `Pillow` (imported as `PIL`)
- `bitsandbytes`
- `numpy`
- `timm`
- `argparse` (part of the Python standard library; no installation needed)
- `einops`

## Installation🚀️

Make sure you have set up a `python` environment and installed `pytorch` with CUDA support. Before running the script, ensure all required packages are installed. You can install them with:

```bash
pip install diffusers Pillow numpy timm einops bitsandbytes
```
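Before launching inference, it can help to confirm that every dependency is importable. The snippet below is a generic sanity check (not part of the repository); note that some import names differ from their pip package names, e.g. `Pillow` is imported as `PIL`:

```python
import importlib.util

def missing_packages(names):
    """Return the subset of package (import) names that are not importable."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Import names for the requirements above (Pillow is imported as PIL).
required = ["diffusers", "PIL", "numpy", "timm", "einops", "bitsandbytes"]
print(missing_packages(required))  # an empty list means you are ready to go
```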

## Usage👀️ 

To use the CoRe^2 pipeline, you need to run the `sample_img.py` script with appropriate command-line arguments. Below are the available options:

### Command-Line Arguments

- `--pipeline`: Select the model pipeline (`sdxl`, `sd35`). Default is `sdxl`.
- `--prompt`: The textual prompt based on which the image will be generated. Default is "Mickey Mouse painting by Frank Frazetta."
- `--inference-step`: Number of inference steps for the diffusion process. Default is 50.
- `--cfg`: Classifier-free guidance scale. Default is 5.5.
- `--pretrained-path`: Path to the pretrained model weights. Default is a specified path in the script.
- `--size`: The size (height and width) of the generated image. Default is 1024.
- `--method`: Select the inference method (`standard`, `core`, `zigzag`, `z-core`).
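The options above can be reconstructed as a standard `argparse` parser. The sketch below is a hypothetical reconstruction from the documented flags and defaults, not the actual code in `sample_img.py` (in particular, the default for `--method` and `--pretrained-path` is assumed here):

```python
import argparse

def build_parser():
    """Hypothetical CLI matching the documented sample_img.py options."""
    p = argparse.ArgumentParser(description="CoRe^2 image sampling")
    p.add_argument("--pipeline", choices=["sdxl", "sd35"], default="sdxl")
    p.add_argument("--prompt", default="Mickey Mouse painting by Frank Frazetta.")
    p.add_argument("--inference-step", type=int, default=50)
    p.add_argument("--cfg", type=float, default=5.5)
    p.add_argument("--pretrained-path", default=None)  # script-specific default
    p.add_argument("--size", type=int, default=1024)
    p.add_argument("--method", default="standard",
                   choices=["standard", "core", "zigzag", "z-core"])
    return p

# Example: parse the flags from the command shown below.
args = build_parser().parse_args(["--pipeline", "sdxl", "--size", "1024"])
print(args.pipeline, args.size, args.method)
```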

### Running the Script

Run the script from the command line by navigating to the directory containing `sample_img.py` and executing:

```bash
python sample_img.py --pipeline sdxl --prompt "A banana on the left of an apple." --size 1024
```

This command will generate an image based on the prompt using the Stable Diffusion XL model with an image size of 1024x1024 pixels.

### Output🎉️ 

The script saves one generated image to disk.