Tags: Image-Text-to-Image · Diffusers · IP-adapter · SDXL
How to use from the Diffusers library
pip install -U diffusers transformers accelerate

import torch
from diffusers import StableDiffusionXLPipeline
from diffusers.utils import load_image

# Load Mugen, the SDXL-family base model this adapter was trained for
# (path below is a placeholder); switch "cuda" to "mps" for Apple devices
pipe = StableDiffusionXLPipeline.from_pretrained(
    "path/to/mugen", torch_dtype=torch.bfloat16
).to("cuda")

# Load this repo's IP-Adapter weights; pick an epoch checkpoint via weight_name
# (placeholder below -- see the repo's file list). NOTE: this adapter uses
# Jina-clip-v2's vision tower, so the standard CLIP image-encoder path may not
# apply -- see the workflow included in this repo.
pipe.load_ip_adapter("TheRemixer/Mugen-Jina-IP-Adapter", subfolder="", weight_name="<checkpoint filename>")

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
style_image = load_image("path/to/style_reference.png")  # placeholder style reference
image = pipe(prompt, ip_adapter_image=style_image).images[0]

An IP-Adapter built on Jina-clip-v2's vision tower, intended for use with Mugen.

Quick Notes:

  • 368M parameters
  • Trained on Mugen using Jina-clip-v2 (plus the Jina-clip-v2 adapter) as the text encoder.
    • It also works on Mugen with the original CLIP-G + CLIP-L text encoders, though the outputs differ slightly.
  • Trained for 5 epochs, 208,000 unbatched steps in total, at a base resolution of 1024x1024.
    • Results started looking promising from the 2nd epoch; the earlier epoch checkpoints are included in this repo if you want to test them.
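As a quick sanity check on the numbers above, 208,000 unbatched steps spread evenly over 5 epochs works out to 41,600 steps per epoch (assuming "unbatched" means one image per step):

```python
total_steps = 208_000  # unbatched steps, from the notes above
epochs = 5

# Steps (and, at batch size 1, training images seen) per epoch
steps_per_epoch = total_steps // epochs
print(steps_per_epoch)  # 41600
```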

Installation:

Usage:

  • See the workflow included in this repo for how to use the IP-adapter.
  • I mainly tested the IP-adapter for style transfer, but it can do other things too.
  • Try different weights if the default of 1.0 is too strong; 1.0 isn't recommended in all cases.
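One practical way to act on the weighting note above is to sweep a few adapter scales and compare outputs side by side. The helper below is a minimal sketch; the commented loop assumes a diffusers pipeline `pipe` with this adapter already loaded, plus a style image `ref` and a `prompt` (all hypothetical names):

```python
def scale_grid(lo=0.4, hi=1.0, n=4):
    """Evenly spaced IP-Adapter scales to try, from subtle to full strength."""
    step = (hi - lo) / (n - 1)
    return [round(lo + i * step, 2) for i in range(n)]

print(scale_grid())  # [0.4, 0.6, 0.8, 1.0]

# Hypothetical sweep with a loaded diffusers pipeline:
# for s in scale_grid():
#     pipe.set_ip_adapter_scale(s)  # diffusers API for adapter weighting
#     pipe(prompt, ip_adapter_image=ref).images[0].save(f"scale_{s}.png")
```

Comparing the saved images makes it easy to spot the point where the style reference starts overpowering the prompt.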

Credits:

