---
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
- art
license: mit
pipeline_tag: unconditional-image-generation
metrics:
- name: FID
  type: image
  value: 80.4755
  dataset: https://www.kaggle.com/datasets/ayhantasyurt/pixel-art-2dgame-charecter-sprites-idle
  split: test
---

# Sprite-flow

Flow-based generative model for unguided generation of 128x128 RGBA pixel art characters.

## Model Details

### Model Description

- **Developed by:** [Mihailo Radović](https://www.linkedin.com/in/mihailo-radović-484070278/)
- **Model type:** Unconditional Image Generation
- **License:** MIT

### Model Sources

- **Repository:** [GitHub Repo](https://github.com/mradovic38/sprite-flow)
- **Demo:** [Gradio App](https://huggingface.co/spaces/mradovic38/sprite-flow)

## Uses

### Direct Use

The model predicts the vector field used to generate 128x128 RGBA pixel art character images. Generation starts from a sample of an isotropic Gaussian distribution and simulates an ODE with linear noise scheduling.

### Out-of-Scope Use

The model could also be used with a cosine or any other noise scheduler in place of the linear one.
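
To make the scheduler terminology concrete, here is a minimal sketch of how a linear and a cosine-style schedule could define the noisy interpolation between a data sample `z` and Gaussian noise `eps`. The formulas are assumptions for illustration (`alpha_t = t`, `beta_t = 1 - t` for the linear case, and one common sine/cosine variant for the other), and the function names are hypothetical; the repository's `LinearAlpha` and `LinearBeta` classes may be defined differently.

```python
import math


def linear_schedule(t):
    """Assumed linear schedule: alpha_t = t, beta_t = 1 - t."""
    return t, 1.0 - t


def cosine_schedule(t):
    """One common cosine-style alternative (an assumption, not the repo's code):
    alpha_t = sin(pi * t / 2), beta_t = cos(pi * t / 2)."""
    return math.sin(0.5 * math.pi * t), math.cos(0.5 * math.pi * t)


def interpolate(z, eps, t, schedule):
    """Noisy sample x_t = alpha_t * z + beta_t * eps, element-wise over flat lists."""
    alpha, beta = schedule(t)
    return [alpha * zi + beta * ei for zi, ei in zip(z, eps)]


# Both schedules start at pure noise (t = 0) and end at the data sample (t = 1).
z, eps = [1.0, -1.0], [0.5, 0.5]
x_start = interpolate(z, eps, 0.0, linear_schedule)
x_end = interpolate(z, eps, 1.0, linear_schedule)
```

Under these assumptions, swapping the scheduler only changes how quickly the noise weight decays between `t = 0` and `t = 1`; the endpoints stay the same.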
## How to Get Started with the Model

* Step 1: **Clone the [GitHub Repo](https://github.com/mradovic38/sprite-flow)** and run the following steps from the repository root so that the imports resolve.
* Step 2: **Initialize the model**:
```py
import torch

from models.unet import PixelArtUNet

# Use the GPU when available; the later steps reuse `device`
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = PixelArtUNet(
    channels=[128, 256, 512, 1024],
    num_residual_layers=2,
    t_embed_dim=128,
    midcoder_dropout_p=0.2
).to(device)
```
* Step 3: **Load the model weights**:
```py
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

repo_id = "mradovic38/sprite-flow"
filename = "model.safetensors"

# Download the weights from the Hugging Face Hub (cached after the first call)
file_path = hf_hub_download(repo_id=repo_id, filename=filename)
checkpoint = load_file(file_path)

model.load_state_dict(checkpoint)
model.to(device)
model.eval()  # disable dropout for inference
```
* Step 4: **Initialize the probability path**:
```py
from sampling.conditional_probability_path import GaussianConditionalProbabilityPath
from sampling.noise_scheduling import LinearAlpha, LinearBeta

path = GaussianConditionalProbabilityPath(
    p_data=None,  # no dataset is needed for sampling
    p_simple_shape=[4, 128, 128],
    alpha=LinearAlpha(),
    beta=LinearBeta()
).to(device)
path.eval()
```
* Step 5: **Simulate the ODE**:
```py
import torch

from diff_eq.ode_sde import UnguidedVectorFieldODE
from diff_eq.simulator import EulerSimulator

num_timesteps = 200  # example number of timesteps
num_samples = 3  # example number of samples

# Time grid from 0 to 1, shared by all samples: (num_samples, num_timesteps, 1, 1, 1)
ts = torch.linspace(0, 1, num_timesteps).view(1, -1, 1, 1, 1).expand(num_samples, -1, 1, 1, 1).to(device)
x0 = path.p_simple.sample(num_samples).to(device)  # (num_samples, 4, 128, 128)

ode = UnguidedVectorFieldODE(model)
simulator = EulerSimulator(ode)

# Inference only, so gradients are not needed
with torch.no_grad():
    x1 = simulator.simulate(x0, ts)  # (num_samples, 4, 128, 128)
```
* Step 6: **Convert the torch tensor to PIL images**:
```py
from utils.helpers import tensor_to_rgba_image, normalize_to_unit

x1 = normalize_to_unit(x1)  # [-1, 1] -> [0, 1]
imgs = tensor_to_rgba_image(x1)
```
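
For intuition about Step 5, the sketch below shows what an explicit Euler simulation of an ODE does, in plain Python on flat lists. It is not the repository's `EulerSimulator`: the function names are hypothetical, and `toy_vector_field` is a stand-in for the trained network, chosen as the straight-line field toward a fixed target so the result is easy to check.

```python
def euler_simulate(vector_field, x0, ts):
    """Integrate dx/dt = vector_field(x, t) with explicit Euler steps over the grid ts."""
    x = list(x0)
    for t, t_next in zip(ts[:-1], ts[1:]):
        h = t_next - t  # step size between consecutive grid points
        v = vector_field(x, t)
        x = [xi + h * vi for xi, vi in zip(x, v)]
    return x


def toy_vector_field(x, t, target=(1.0, -1.0)):
    """Hypothetical stand-in for the model: points from x straight at `target`,
    scaled so the target is reached at t = 1 (never evaluated at t = 1 itself)."""
    return [(zi - xi) / (1.0 - t) for xi, zi in zip(x, target)]


num_timesteps = 200  # same example value as in Step 5
ts = [i / (num_timesteps - 1) for i in range(num_timesteps)]

x0 = [0.3, 0.7]  # stands in for a Gaussian noise sample
x1 = euler_simulate(toy_vector_field, x0, ts)  # ends up at the target, up to rounding
```

In the real pipeline the vector field is the `PixelArtUNet` evaluated on image tensors, so more timesteps trade extra network evaluations for a smaller discretization error.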