---
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
- art
license: mit
pipeline_tag: unconditional-image-generation
metrics:
- name: FID
type: image
value: 80.4755
dataset: https://www.kaggle.com/datasets/ayhantasyurt/pixel-art-2dgame-charecter-sprites-idle
split: test
---
# Sprite-flow
Flow-based generative model for unguided generation of 128x128 RGBA pixel art characters.
## Model Details
### Model Description
- **Developed by:** [Mihailo Radović](https://www.linkedin.com/in/mihailo-radović-484070278/)
- **Model type:** Unconditional Image Generation
- **License:** MIT
### Model Sources
<!-- Provide the basic links for the model. -->
- **Repository:** [GitHub Repo](https://github.com/mradovic38/sprite-flow)
- **Demo:** [Gradio App](https://huggingface.co/spaces/mradovic38/sprite-flow)
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
### Direct Use
The model predicts the vector field used to generate 128x128 RGBA pixel art character images. Samples are drawn from an isotropic Gaussian distribution and transported to images by simulating an ODE with linear noise scheduling.
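As a rough illustration of this sampling procedure, the sketch below integrates an ODE with explicit Euler steps in plain Python. It is not the repository's code: the scalar stand-in for an image, the function names, and the closed-form field `(z - x) / (1 - t)` (the conditional vector field of a Gaussian path, assuming `alpha_t = t` and `beta_t = 1 - t`) are assumptions made for illustration. In the real model the trained U-Net supplies the vector field.

```python
import random


def conditional_vector_field(x: float, t: float, z: float) -> float:
    """Hypothetical stand-in for the learned field: pulls x toward z.

    Assumes a linear schedule (alpha_t = t, beta_t = 1 - t), for which the
    conditional vector field of a Gaussian path is (z - x) / (1 - t).
    """
    return (z - x) / (1.0 - t)


def euler_simulate(x0: float, z: float, num_timesteps: int) -> float:
    """Integrate dx/dt = u_t(x) from t = 0 to t = 1 with explicit Euler steps."""
    # Evenly spaced time grid, including both endpoints
    ts = [i / (num_timesteps - 1) for i in range(num_timesteps)]
    x = x0
    # The field is only evaluated at t < 1, so the division is always defined
    for t, t_next in zip(ts[:-1], ts[1:]):
        h = t_next - t
        x = x + h * conditional_vector_field(x, t, z)
    return x


random.seed(0)
x0 = random.gauss(0.0, 1.0)  # sample from the standard Gaussian
z = 0.5                      # toy "data" value standing in for an image
x1 = euler_simulate(x0, z, num_timesteps=200)
```

With this toy field the final Euler step lands on `z` whatever the starting noise was, which is the one-dimensional analogue of transporting Gaussian noise to a data sample.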
### Out-of-Scope Use
The model could also be used with a cosine or any other noise scheduler.
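To make the scheduler remark concrete, here is a minimal sketch of two schedule pairs. The function names and formulas are assumptions for illustration, not the repository's `LinearAlpha` and `LinearBeta` implementations: the linear pair is taken to be `alpha_t = t`, `beta_t = 1 - t`, and the cosine-style pair `alpha_t = sin(pi * t / 2)`, `beta_t = cos(pi * t / 2)`. Both give pure noise at `t = 0` and pure data at `t = 1`.

```python
import math
from typing import Callable, Tuple

Schedule = Callable[[float], Tuple[float, float]]


def linear_schedule(t: float) -> Tuple[float, float]:
    """Return (alpha_t, beta_t) for an assumed linear schedule."""
    return t, 1.0 - t


def cosine_schedule(t: float) -> Tuple[float, float]:
    """Return (alpha_t, beta_t) for an assumed cosine-style schedule."""
    return math.sin(0.5 * math.pi * t), math.cos(0.5 * math.pi * t)


def interpolate(z: float, eps: float, t: float, schedule: Schedule) -> float:
    """Point on the Gaussian path: x_t = alpha_t * z + beta_t * eps."""
    alpha, beta = schedule(t)
    return alpha * z + beta * eps


# At the endpoints both schedules agree: noise at t = 0, data at t = 1
z, eps = 0.8, -1.2
start = interpolate(z, eps, 0.0, cosine_schedule)
end = interpolate(z, eps, 1.0, cosine_schedule)
```

The two schedules differ only in how quickly the data term takes over in between, so a model trained under one schedule is not guaranteed to behave the same under the other.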
## How to Get Started with the Model
* Step 1: **Clone the [GitHub Repo](https://github.com/mradovic38/sprite-flow)**
* Step 2: **Initialize the model**:
```py
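import torch
from models.unet import PixelArtUNet

# Pick a device once; the later steps reuse it
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = PixelArtUNet(
    channels=[128, 256, 512, 1024],
    num_residual_layers=2,
    t_embed_dim=128,
    midcoder_dropout_p=0.2,
).to(device)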
```
* Step 3: **Load the model weights**:
```py
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
repo_id = "mradovic38/sprite-flow"
filename = "model.safetensors"
file_path = hf_hub_download(repo_id=repo_id, filename=filename)
checkpoint = load_file(file_path)
model.load_state_dict(checkpoint)
model.to(device)
model.eval()
```
* Step 4: **Initialize the probability path**:
```py
from sampling.conditional_probability_path import GaussianConditionalProbabilityPath
from sampling.noise_scheduling import LinearAlpha, LinearBeta
path = GaussianConditionalProbabilityPath(
    p_data=None,
    p_simple_shape=[4, 128, 128],
    alpha=LinearAlpha(),
    beta=LinearBeta(),
).to(device)
path.eval()
```
* Step 5: **Simulate ODE**:
```py
import torch
from diff_eq.ode_sde import UnguidedVectorFieldODE
from diff_eq.simulator import EulerSimulator
num_timesteps = 200 # example number of timesteps
num_samples = 3 # example number of samples
ts = torch.linspace(0, 1, num_timesteps).view(1, -1, 1, 1, 1).expand(num_samples, -1, 1, 1, 1).to(device)
x0 = path.p_simple.sample(num_samples).to(device) # (num_samples, 4, 128, 128)
ode = UnguidedVectorFieldODE(model)
simulator = EulerSimulator(ode)
x1 = simulator.simulate(x0, ts) # (num_samples, 4, 128, 128)
```
* Step 6: **Convert the torch tensor to PIL images**:
```py
from utils.helpers import tensor_to_rgba_image, normalize_to_unit
x1 = normalize_to_unit(x1) # [-1, 1] -> [0, 1]
imgs = tensor_to_rgba_image(x1)
```
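The comment `[-1, 1] -> [0, 1]` suggests an affine rescaling before the conversion to 8-bit RGBA. The sketch below shows that mapping on plain Python floats. It is an assumption about what the helpers do, not the code in `utils.helpers`, and the function names `to_unit` and `to_uint8` are hypothetical.

```python
def to_unit(value: float) -> float:
    """Map a value from [-1, 1] to [0, 1], clamping anything out of range."""
    return min(1.0, max(0.0, (value + 1.0) / 2.0))


def to_uint8(unit_value: float) -> int:
    """Quantize a value in [0, 1] to an 8-bit channel value in 0..255."""
    return int(round(unit_value * 255.0))


# One toy RGBA pixel in model output range; the last channel overshoots
pixel = [-1.0, 0.2, 1.0, 1.3]
rgba = [to_uint8(to_unit(v)) for v in pixel]
```

Clamping is included here because model outputs can drift slightly outside the nominal range; whether the repository's helper clamps is not stated in this card.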