	---
	tags:
	- model_hub_mixin
	- pytorch_model_hub_mixin
	- art
	license: mit
	pipeline_tag: unconditional-image-generation
	metrics:
	- name: FID
	type: image
	value: 80.4755
	dataset: https://www.kaggle.com/datasets/ayhantasyurt/pixel-art-2dgame-charecter-sprites-idle
	split: test
	---
	# Sprite-flow
	Flow-based generative model for unguided generation of 128x128 RGBA pixel art characters.

	## Model Details
	### Model Description
	- Developed by: [Mihailo Radović](https://www.linkedin.com/in/mihailo-radović-484070278/)
	- Model type: Unconditional Image Generation
	- License: MIT

	### Model Sources

	- Repository: [GitHub Repo](https://github.com/mradovic38/sprite-flow)
	- Demo: [Gradio App](https://huggingface.co/spaces/mradovic38/sprite-flow)

	## Uses

	### Direct Use
	Predicts the vector field used to generate 128x128 RGBA pixel art character images. Sampling starts from an isotropic Gaussian distribution and simulates an ODE with linear noise scheduling.
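
To make the sampling idea concrete, here is a minimal sketch of explicit Euler integration of a vector field in PyTorch. It is not the repository's `EulerSimulator`: the function `euler_simulate` and the toy vector field are hypothetical stand-ins, and a small 8x8 batch replaces real model inputs.

```python
import torch


def euler_simulate(vector_field, x0, ts):
    """Integrate dx/dt = vector_field(x, t) with explicit Euler steps.

    x0: (batch, ...) starting points; ts: 1-D tensor of increasing times.
    """
    x = x0
    for t, t_next in zip(ts[:-1], ts[1:]):
        h = t_next - t  # step size between consecutive time points
        x = x + h * vector_field(x, t)
    return x


# Toy stand-in for the trained network (hypothetical): a constant field
# that carries each sample from its starting point to a fixed target.
torch.manual_seed(0)
x0 = torch.randn(3, 4, 8, 8)      # small isotropic Gaussian batch
target = torch.ones_like(x0)


def toy_field(x, t):
    return target - x0


x1 = euler_simulate(toy_field, x0, torch.linspace(0, 1, 200))
```

With a trained model in place of `toy_field`, the same loop would move Gaussian noise at `t = 0` toward image-like samples at `t = 1`.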

	### Out-of-Scope Use
	The model could also be used with a cosine or any other noise scheduler instead of the linear one.
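
As a rough illustration of what swapping schedulers means, the sketch below compares a linear and a cosine schedule. It assumes the common convention `x_t = alpha_t * z + beta_t * eps`, where `alpha` rises from 0 to 1 and `beta` falls from 1 to 0. The function names are hypothetical, and the cosine form is one common choice, not necessarily what the repository implements.

```python
import math


def linear_alpha(t):
    return t


def linear_beta(t):
    return 1.0 - t


def cosine_alpha(t):
    return math.sin(0.5 * math.pi * t)


def cosine_beta(t):
    return math.cos(0.5 * math.pi * t)


# Both pairs satisfy the same boundary conditions:
# pure noise at t = 0 (alpha = 0, beta = 1), pure data at t = 1 (alpha = 1, beta = 0).
for t in (0.0, 0.5, 1.0):
    print(t, linear_alpha(t), linear_beta(t), cosine_alpha(t), cosine_beta(t))
```

A model trained with one schedule is generally not guaranteed to sample well under another, so a different scheduler would likely call for retraining.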

	## How to Get Started with the Model
	* Step 1: Clone the [GitHub Repo](https://github.com/mradovic38/sprite-flow)

	* Step 2: Initialize the model:
	```py
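	import torch

	from models.unet import PixelArtUNet

	# Use the GPU when available, otherwise fall back to the CPU
	device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

	model = PixelArtUNet(
	    channels=[128, 256, 512, 1024],
	    num_residual_layers=2,
	    t_embed_dim=128,
	    midcoder_dropout_p=0.2,
	).to(device)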
	```

	* Step 3: Load the model weights:
	```py
	from huggingface_hub import hf_hub_download
	from safetensors.torch import load_file

	repo_id = "mradovic38/sprite-flow"
	filename = "model.safetensors"
	file_path = hf_hub_download(repo_id=repo_id, filename=filename)
	checkpoint = load_file(file_path)
	model.load_state_dict(checkpoint)
	model.to(device)
	model.eval()
	```

	* Step 4: Initialize the probability path:
	```py
	from sampling.conditional_probability_path import GaussianConditionalProbabilityPath
	from sampling.noise_scheduling import LinearAlpha, LinearBeta

	path = GaussianConditionalProbabilityPath(
	    p_data=None,
	    p_simple_shape=[4, 128, 128],  # 4 RGBA channels, 128x128 pixels
	    alpha=LinearAlpha(),
	    beta=LinearBeta(),
	).to(device)
	path.eval()
	```
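
For intuition about what a Gaussian conditional probability path represents, the sketch below draws samples along such a path under linear scheduling. It assumes the convention `alpha_t = t` and `beta_t = 1 - t`, which may differ from the repository's `LinearAlpha` and `LinearBeta`. The helper `sample_conditional_path` and the all-zero placeholder batch are hypothetical.

```python
import torch


def sample_conditional_path(z, t):
    """Sample x_t = alpha_t * z + beta_t * eps with alpha_t = t, beta_t = 1 - t.

    z: (N, C, H, W) data batch; t: (N,) times in [0, 1].
    """
    eps = torch.randn_like(z)      # isotropic Gaussian noise
    t = t.view(-1, 1, 1, 1)        # broadcast over channels and pixels
    return t * z + (1.0 - t) * eps


torch.manual_seed(0)
z = torch.zeros(2, 4, 128, 128)    # placeholder "data" batch (hypothetical)

x_start = sample_conditional_path(z, torch.zeros(2))  # pure noise at t = 0
x_end = sample_conditional_path(z, torch.ones(2))     # exactly z at t = 1
```

At `t = 0` the sample is pure noise with the shape given by `p_simple_shape`, which is where the ODE simulation starts.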

	* Step 5: Simulate the ODE:

	```py
	import torch

	from diff_eq.ode_sde import UnguidedVectorFieldODE
	from diff_eq.simulator import EulerSimulator

	num_timesteps = 200 # example number of timesteps
	num_samples = 3 # example number of samples

	# Time grid from 0 to 1, one copy per sample: (num_samples, num_timesteps, 1, 1, 1)
	ts = torch.linspace(0, 1, num_timesteps).view(1, -1, 1, 1, 1).expand(num_samples, -1, 1, 1, 1).to(device)
	x0 = path.p_simple.sample(num_samples).to(device) # (num_samples, 4, 128, 128)
	ode = UnguidedVectorFieldODE(model)
	simulator = EulerSimulator(ode)
	with torch.no_grad(): # inference only, no gradients needed
	    x1 = simulator.simulate(x0, ts) # (num_samples, 4, 128, 128)
	```
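
The shape of the time grid in the step above can be checked in isolation. The sketch below uses only `torch` and a small number of timesteps. How `EulerSimulator` consumes the grid internally is an assumption here; the last lines only show that a per-step slice broadcasts against an image batch.

```python
import torch

num_timesteps = 5
num_samples = 3

# Same construction as in the step above, with fewer timesteps
ts = torch.linspace(0, 1, num_timesteps).view(1, -1, 1, 1, 1).expand(num_samples, -1, 1, 1, 1)
print(ts.shape)  # torch.Size([3, 5, 1, 1, 1])

# Slicing one time index gives a (num_samples, 1, 1, 1) tensor that
# broadcasts against a (num_samples, 4, 128, 128) batch.
h = ts[:, 1] - ts[:, 0]            # step size, 0.25 for every sample
x = torch.zeros(num_samples, 4, 128, 128)
x_next = x + h * torch.ones_like(x)
```

The trailing singleton dimensions are what let a per-sample time or step size multiply a whole image tensor without an explicit reshape.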

	* Step 6: Convert the torch tensor to PIL format:

	```py
	from utils.helpers import tensor_to_rgba_image, normalize_to_unit

	x1 = normalize_to_unit(x1) # [-1, 1] -> [0, 1]
	imgs = tensor_to_rgba_image(x1)
	```
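
For readers who want to see what such a conversion involves without the repository's helpers, here is a standalone sketch using `torch` and Pillow. The functions `to_unit_range` and `to_rgba_images` are hypothetical equivalents. The actual `normalize_to_unit` and `tensor_to_rgba_image` may behave differently, for example in clamping or in what they return.

```python
import torch
from PIL import Image


def to_unit_range(x):
    """Map values from [-1, 1] to [0, 1], clamping anything outside."""
    return ((x + 1.0) / 2.0).clamp(0.0, 1.0)


def to_rgba_images(x):
    """Convert a (N, 4, H, W) tensor in [0, 1] to a list of RGBA PIL images."""
    arr = (x * 255.0).round().to(torch.uint8)           # (N, 4, H, W)
    arr = arr.permute(0, 2, 3, 1).contiguous().cpu()    # (N, H, W, 4)
    # A uint8 array of shape (H, W, 4) is interpreted as RGBA by Pillow
    return [Image.fromarray(a.numpy()) for a in arr]


# Random stand-in for model output in [-1, 1]
torch.manual_seed(0)
samples = torch.rand(2, 4, 128, 128) * 2.0 - 1.0
images = to_rgba_images(to_unit_range(samples))
images[0].save("sprite_0.png")  # PNG preserves the alpha channel
```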