---
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
- art
license: mit
pipeline_tag: unconditional-image-generation
metrics:
- name: FID
  type: image
  value: 80.4755
  dataset: https://www.kaggle.com/datasets/ayhantasyurt/pixel-art-2dgame-charecter-sprites-idle
  split: test
---
# Sprite-flow
A flow-based generative model for unguided (unconditional) generation of 128x128 RGBA pixel-art characters.

## Model Details
### Model Description
- **Developed by:** [Mihailo Radović](https://www.linkedin.com/in/mihailo-radović-484070278/)
- **Model type:** Unconditional Image Generation
- **License:** MIT

### Model Sources

- **Repository:** [GitHub Repo](https://github.com/mradovic38/sprite-flow)
- **Demo:** [Gradio App](https://huggingface.co/spaces/mradovic38/sprite-flow)

## Uses

### Direct Use
The network predicts the vector field of a flow model. Starting from a sample of an isotropic Gaussian distribution and simulating the corresponding ODE with linear noise scheduling yields a 128x128 RGBA pixel-art character image.
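
To make the sampling procedure concrete, here is a minimal sketch of Euler integration under a linear schedule, written in plain Python and independent of the repository code. It assumes that "linear" means `alpha_t = t` and `beta_t = 1 - t`, and it replaces the trained network with the closed-form conditional vector field toward a single known target `z`. The names `conditional_vector_field` and `euler_simulate` are illustrative and are not part of the repository.

```python
import random


def conditional_vector_field(x, z, t):
    """Conditional field for alpha_t = t, beta_t = 1 - t (assumed schedule):
    u_t(x | z) = (z - x) / (1 - t)."""
    return [(zi - xi) / (1.0 - t) for xi, zi in zip(x, z)]


def euler_simulate(x0, z, num_steps):
    """Integrate dx/dt = u_t(x | z) from t = 0 to t = 1 with fixed Euler steps."""
    x = list(x0)
    h = 1.0 / num_steps
    for k in range(num_steps):
        t = k * h  # left endpoint of the step, so t = 1 is never evaluated
        u = conditional_vector_field(x, z, t)
        x = [xi + h * ui for xi, ui in zip(x, u)]
    return x


random.seed(0)
z = [0.5, -0.25, 1.0]                        # toy "data point" standing in for an image
x0 = [random.gauss(0.0, 1.0) for _ in z]     # sample from an isotropic Gaussian
x1 = euler_simulate(x0, z, num_steps=200)    # ends (numerically) at z
```

In the actual model the closed-form field is replaced by the network's prediction, which depends only on the current state and time, so the endpoint is a new sample rather than a fixed target.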

### Out-of-Scope Use
The model can in principle be combined with a cosine or any other noise scheduler, but only linear noise scheduling is the intended configuration.

## How to Get Started with the Model
* Step 1: **Clone the [GitHub Repo](https://github.com/mradovic38/sprite-flow)**

* Step 2: **Initialize the model**:
  ```py
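  import torch

  from models.unet import PixelArtUNet

  # Use a GPU when available; the later steps reuse `device`.
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

  model = PixelArtUNet(
      channels=[128, 256, 512, 1024],
      num_residual_layers=2,
      t_embed_dim=128,
      midcoder_dropout_p=0.2,
  ).to(device)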
  ```
  
* Step 3: **Load the model weights**:
  ```py
  from huggingface_hub import hf_hub_download
  from safetensors.torch import load_file
  
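  repo_id = "mradovic38/sprite-flow"
  filename = "model.safetensors"
  # Download the checkpoint (cached locally after the first call)
  file_path = hf_hub_download(repo_id=repo_id, filename=filename)
  # Load the safetensors state dict and copy it into the model
  checkpoint = load_file(file_path)
  model.load_state_dict(checkpoint)
  model.to(device)
  model.eval()  # inference mode: disables dropout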
  ```

* Step 4: **Initialize the probability path**:
  ```py
  from sampling.conditional_probability_path import GaussianConditionalProbabilityPath
  from sampling.noise_scheduling import LinearAlpha, LinearBeta
  
  path = GaussianConditionalProbabilityPath(
      p_data=None,
      p_simple_shape=[4, 128, 128],
      alpha=LinearAlpha(),
      beta=LinearBeta()
  ).to(device)
  path.eval()
  ```

* Step 5: **Simulate ODE**:

  ```py
  import torch
  
  from diff_eq.ode_sde import UnguidedVectorFieldODE
  from diff_eq.simulator import EulerSimulator
  
  num_timesteps = 200  # example number of timesteps
  num_samples = 3  # example number of samples

  # Time grid from 0 to 1, one copy per sample: (num_samples, num_timesteps, 1, 1, 1)
  ts = (
      torch.linspace(0, 1, num_timesteps)
      .view(1, -1, 1, 1, 1)
      .expand(num_samples, -1, 1, 1, 1)
      .to(device)
  )
  x0 = path.p_simple.sample(num_samples).to(device)  # (num_samples, 4, 128, 128)
  ode = UnguidedVectorFieldODE(model)
  simulator = EulerSimulator(ode)
  with torch.no_grad():  # no gradients are needed for sampling
      x1 = simulator.simulate(x0, ts)  # (num_samples, 4, 128, 128)
  ```

* Step 6: **Convert the output tensor to PIL**:

  ```py
  from utils.helpers import tensor_to_rgba_image, normalize_to_unit
  
  x1 = normalize_to_unit(x1) # [-1, 1] -> [0, 1]
  imgs = tensor_to_rgba_image(x1)
  ```
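
For intuition about the last step, the following plain-Python sketch shows one way a value in [-1, 1] can be mapped to an 8-bit channel. It assumes an affine rescaling `(x + 1) / 2` with clamping, followed by rounding to 0..255. The repository's `normalize_to_unit` and `tensor_to_rgba_image` may be implemented differently, and `to_unit` and `to_uint8` are hypothetical names used only here.

```python
def to_unit(value):
    """Map a value from [-1, 1] to [0, 1], clamping anything outside that range."""
    return min(1.0, max(0.0, (value + 1.0) / 2.0))


def to_uint8(value):
    """Quantize a value in [0, 1] to an 8-bit channel value in 0..255."""
    return int(round(value * 255.0))


# One hypothetical RGBA pixel as it might come out of the ODE, roughly in [-1, 1].
pixel = [-1.0, 0.0, 1.0, 1.2]
rgba = tuple(to_uint8(to_unit(c)) for c in pixel)
```

Clamping matters because the simulated output is not guaranteed to stay strictly inside [-1, 1].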