covalenthq/boredape_diffusion

	---
	license: creativeml-openrail-m
	base_model: runwayml/stable-diffusion-v1-5
	instance_prompt: photo of a bayc nft
	tags:
	- stable-diffusion
	- stable-diffusion-diffusers
	- text-to-image
	- diffusers
	- dreambooth
	inference: true
	pipeline_tag: text-to-image
	---

	# DreamBooth - Bored Ape Yacht Club

	## Model Description

	This DreamBooth model is a fine-tuned derivative of `runwayml/stable-diffusion-v1-5`, specialized for the Bored Ape Yacht Club (BAYC) NFT collection. Its weights were fine-tuned on images of BAYC NFTs using the [DreamBooth](https://dreambooth.github.io/) technique for text-to-image synthesis.


	### Training

	The training images were retrieved through the Covalent API, specifically via this [endpoint](https://www.covalenthq.com/docs/api/nft/get-nft-token-ids-for-contract-with-metadata/).
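
As a rough illustration of the collection step above, the sketch below pulls image URLs out of an already-parsed NFT metadata response. The function name `extract_image_urls`, the sample payload, and the nested field names (`data`, `items`, `nft_data`, `external_data`, `image`) are assumptions about the response shape, not something this card documents; check them against the Covalent API reference before relying on them.

```python
def extract_image_urls(response_json):
    """Collect image URLs from a parsed NFT metadata response.

    ASSUMPTION: images are nested under
    data -> items -> nft_data -> external_data -> image.
    """
    urls = []
    items = (response_json.get("data") or {}).get("items") or []
    for item in items:
        nft_data = item.get("nft_data") or {}
        external = nft_data.get("external_data") or {}
        image_url = external.get("image")
        if image_url:  # skip tokens without usable metadata
            urls.append(image_url)
    return urls


# Hypothetical payload, used only to exercise the function.
sample = {
    "data": {
        "items": [
            {"nft_data": {"token_id": "1", "external_data": {"image": "https://example.com/1.png"}}},
            {"nft_data": {"token_id": "2", "external_data": None}},
        ]
    }
}
image_urls = extract_image_urls(sample)
```

Tokens with missing metadata are skipped rather than raising, so a partially populated response still yields a usable list of images to download.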

	### Inference

	Given a text prompt, the model generates original images in the distinctive visual style of the Bored Ape Yacht Club collection.

	![img_0](./image_0.png)
	![img_1](./image_1.png)
	![img_2](./image_2.png)


	## Usage

	Here’s a basic example of how to use this model to generate images:

	```python
	import numpy as np
	import torch
	from diffusers import DDIMScheduler, StableDiffusionPipeline, UNet2DConditionModel
	from transformers import CLIPTextModel

	model_id = "runwayml/stable-diffusion-v1-5"

	# Load the fine-tuned UNet and text encoder in half precision to match the pipeline.
	unet = UNet2DConditionModel.from_pretrained(
	    "ckandemir/bayc-diffusion", subfolder="unet", torch_dtype=torch.float16
	)
	text_encoder = CLIPTextModel.from_pretrained(
	    "ckandemir/bayc-diffusion", subfolder="text_encoder", torch_dtype=torch.float16
	)

	# Swap the fine-tuned components into the base Stable Diffusion v1.5 pipeline.
	pipeline = StableDiffusionPipeline.from_pretrained(
	    model_id, unet=unet, text_encoder=text_encoder, torch_dtype=torch.float16, use_safetensors=True
	).to("cuda")
	pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)

	prompt = ["a spiderman bayc nft"]
	neg_prompt = ["realistic,disfigured face,eye patch,disfigured eyes, disfigured, deformed,bad anatomy"] * len(prompt)
	num_samples = 3
	guidance_scale = 9
	num_inference_steps = 50
	height = 512
	width = 512

	# Pick a random seed and print it so a run can be reproduced later.
	seed = int(np.random.randint(0, 2**20 - 1))
	print(f"Seed: {seed}")
	generator = torch.Generator(device="cuda").manual_seed(seed)

	with torch.autocast("cuda"), torch.inference_mode():
	    imgs = pipeline(
	        prompt,
	        negative_prompt=neg_prompt,
	        height=height,
	        width=width,
	        num_images_per_prompt=num_samples,
	        num_inference_steps=num_inference_steps,
	        guidance_scale=guidance_scale,
	        generator=generator,
	    ).images

	# display() is available in Jupyter/IPython notebooks.
	for img in imgs:
	    display(img)
	```
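
The example above shows its results with `display`, which only exists in a notebook. Outside one, a possible approach is to tile the generated images into a single grid and save it. The helper name `tile_images` and the placeholder images below are illustrative assumptions, not part of this model's code; in practice you would pass the `imgs` list returned by the pipeline.

```python
from PIL import Image


def tile_images(images, cols):
    """Paste equally sized PIL images into one grid image, row by row."""
    if not images:
        raise ValueError("images must not be empty")
    width, height = images[0].size
    rows = (len(images) + cols - 1) // cols  # ceiling division
    grid = Image.new("RGB", (cols * width, rows * height))
    for index, image in enumerate(images):
        x = (index % cols) * width
        y = (index // cols) * height
        grid.paste(image, (x, y))
    return grid


# Stand-in images so the sketch runs without a GPU; replace with `imgs`.
placeholders = [Image.new("RGB", (512, 512), color) for color in ("red", "green", "blue")]
grid = tile_images(placeholders, cols=3)
```

Calling `grid.save("bayc_samples.png")` (a hypothetical file name) then writes all samples to one file, which makes it easier to compare outputs across seeds or prompts.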

	## Optimization
	Results can likely be improved with additional fine-tuning and by adjusting the training parameters.