How to use yashvoladoddi37/kanji-diffusion-v1-4 with Diffusers:
pip install -U diffusers transformers accelerate
import torch
from diffusers import DiffusionPipeline
# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("yashvoladoddi37/kanji-diffusion-v1-4", dtype=torch.bfloat16, device_map="cuda")
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]

Kanji Diffusion is a latent text-to-image diffusion model capable of hallucinating Kanji characters given any English prompt.
To run the pipeline and see how my model generates Kanji characters, follow the code flow below on Colab (use a T4 GPU runtime; otherwise inference takes a long time per image). Make sure you have your Hugging Face access token ready for the login step.
import os
from google.colab import drive
drive.mount('/content/drive')
os.chdir("/content/drive/MyDrive")
!pip install diffusers
!git clone https://github.com/huggingface/diffusers
!huggingface-cli login
from diffusers import StableDiffusionPipeline
import torch
torch.cuda.empty_cache()
model_path = "yashvoladoddi37/kanji-diffusion-v1-4"
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16, use_safetensors=True)
# Load the Kanji LoRA attention weights on top of the base model
pipe.unet.load_attn_procs(model_path)
pipe.to("cuda")
prompt = "A Kanji meaning baby robot"
image = pipe(prompt).images[0]
image.save("baby-robot-kanji-v1-4.png")
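
To generate several characters in one go, the single-prompt flow above can be wrapped in a small helper. This is only a sketch: the names `kanji_prompt`, `output_name`, and `generate_all` are hypothetical helpers and not part of Diffusers. The prompt wording and file-name pattern are simply copied from the example above.

```python
import re
from pathlib import Path

# Prompt wording copied from the example above ("A Kanji meaning baby robot").
PROMPT_TEMPLATE = "A Kanji meaning {concept}"


def kanji_prompt(concept: str) -> str:
    """Build the text prompt for one English concept."""
    return PROMPT_TEMPLATE.format(concept=concept.strip())


def output_name(concept: str, suffix: str = "kanji-v1-4") -> str:
    """Turn a concept into a file name such as 'baby-robot-kanji-v1-4.png'."""
    slug = re.sub(r"[^a-z0-9]+", "-", concept.lower()).strip("-")
    return f"{slug}-{suffix}.png"


def generate_all(pipe, concepts, out_dir="."):
    """Run `pipe` once per concept and save each first image to `out_dir`.

    `pipe` is assumed to behave like the StableDiffusionPipeline above:
    calling it with a prompt returns an object whose `.images[0]` has `.save()`.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for concept in concepts:
        image = pipe(kanji_prompt(concept)).images[0]
        path = out_dir / output_name(concept)
        image.save(path)
        saved.append(path)
    return saved
```

With the `pipe` object built above, a call might look like `generate_all(pipe, ["baby robot", "Elon Musk"], "outputs")`; the `outputs` folder name is just an illustration.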
Training Data
Hardware: Nvidia GTX 1650 (4 GB VRAM, 8 GB RAM) and a T4 GPU on Colab
Training Script:
!accelerate launch train_text_to_image_lora.py \
--pretrained_model_name_or_path="CompVis/stable-diffusion-v1-4" \
--dataset_name="yashvoladoddi37/kanjienglish" \
--image_column="image" \
--caption_column="text" \
--resolution=512 \
--random_flip \
--train_batch_size=1 \
--num_train_epochs=1 \
--checkpointing_steps=500 \
--learning_rate=1e-04 \
--lr_scheduler="constant" \
--lr_warmup_steps=0 \
--seed=42 \
--output_dir="kanji-diffusion-v1-4" \
--validation_prompt="A kanji meaning Elon Musk" \
--push_to_hub
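
As a rough sanity check on the flags above, the number of optimizer steps and the points at which checkpoints get written can be estimated before launching. This is a minimal sketch, assuming a single GPU process, gradient accumulation left at 1, and a checkpoint saved whenever the step count is a multiple of `--checkpointing_steps`. The function names are hypothetical and the dataset size in the usage note is purely illustrative.

```python
import math


def total_optimizer_steps(num_examples, train_batch_size=1, num_train_epochs=1,
                          gradient_accumulation_steps=1):
    """Estimate optimizer steps for a single-process run."""
    batches_per_epoch = math.ceil(num_examples / train_batch_size)
    steps_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
    return steps_per_epoch * num_train_epochs


def checkpoint_steps(total_steps, checkpointing_steps=500):
    """List the step counts at which a checkpoint would be saved."""
    return list(range(checkpointing_steps, total_steps + 1, checkpointing_steps))
```

For example, with an illustrative 5,000 training images, `total_optimizer_steps(5000)` gives 5000 steps under these settings, and `checkpoint_steps(5000)` lists ten checkpoints from step 500 to step 5000.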