Dojo Cat Lora Flux NF4

Training prompt (the same prompt was used with and without QLoRA): This close-up photograph features Dojo Cat, showcasing her striking pink hair and makeup. Her skin is framed by long, straight pink hair, and her eyes are accentuated by thick lashes and dark eyebrows. A light pink lip gloss and blush complete her look. A delicate gold chain necklace with small beads adorns her neck. The background is a simple, solid beige, keeping the focus entirely on Dojo Cat's captivating features.

Testing prompt (the same prompt was used with and without QLoRA): Dojo Cat posing in fox ears and an orange dress, magical fantasy cosplay, a luxurious medieval costume, exotic street style elements, a furry art aesthetic, cozy evening light, outdoor background, shot with a canon eos r6. detailed facial features, warm tones, photorealistic, cinematic lighting. --v 6.1

All files are also archived at https://github.com/je-suis-tm/huggingface-archive in case this gets censored.

The QLoRA fine-tuning process of dojo_cat_lora_flux_nf4 takes inspiration from this post: https://huggingface.co/blog/flux-qlora. Training ran on a local machine for 1000 steps with the same hyperparameters as the post above, taking around 6 hours on an RTX 4060 with 8 GB of VRAM; peak VRAM usage was around 7.7 GB. To avoid running out of VRAM, both the transformer and the text encoder were quantized. All images shown here were generated with the following parameters:

  • Height: 512
  • Width: 512
  • Guidance scale: 5
  • Num inference steps: 20
  • Max sequence length: 512
  • Seed: 0
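The VRAM headroom that NF4 buys can be sanity-checked with some back-of-the-envelope math. This is only a rough sketch: the parameter counts below are approximations, and quantization overhead (block scales, non-quantized layers) is ignored.

```python
# Back-of-the-envelope VRAM math for NF4 (4-bit) vs fp16 (16-bit) weights.
# Parameter counts are approximate; overhead is ignored.
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """GiB needed to store the weights alone."""
    return n_params * bits_per_weight / 8 / 2**30

flux_transformer_params = 12e9   # FLUX.1-dev transformer, ~12B params (approx.)
t5_params = 4.7e9                # T5-XXL text encoder, ~4.7B params (approx.)

for name, n in [("transformer", flux_transformer_params),
                ("text_encoder_2", t5_params)]:
    print(f"{name}: fp16 {weight_gib(n, 16):.1f} GiB -> NF4 {weight_gib(n, 4):.1f} GiB")
```

In fp16 the transformer alone (~22 GiB of weights) would not fit on an 8 GB card, which is why both large components are quantized here.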

Usage

import torch
from diffusers import FluxPipeline, FluxTransformer2DModel
from transformers import T5EncoderModel

# Load the pre-quantized NF4 checkpoints for the text encoder and transformer
text_encoder_4bit = T5EncoderModel.from_pretrained(
    "hf-internal-testing/flux.1-dev-nf4-pkg",
    subfolder="text_encoder_2",
    torch_dtype=torch.float16,
)

transformer_4bit = FluxTransformer2DModel.from_pretrained(
    "hf-internal-testing/flux.1-dev-nf4-pkg",
    subfolder="transformer",
    torch_dtype=torch.float16,
)

# Build the pipeline around the quantized components
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.float16,
    transformer=transformer_4bit,
    text_encoder_2=text_encoder_4bit,
)

# Attach the LoRA adapter
pipe.load_lora_weights(
    "je-suis-tm/dojo_cat_lora_flux_nf4",
    weight_name="pytorch_lora_weights.safetensors",
)

prompt = "Dojo Cat posing in fox ears and an orange dress, magical fantasy cosplay, a luxurious medieval costume, exotic street style elements, a furry art aesthetic, cozy evening light, outdoor background, shot with a canon eos r6. detailed facial features, warm tones, photorealistic, cinematic lighting. --v 6.1"

# Generate with the parameters listed above; a fixed seed keeps the result
# reproducible
image = pipe(
    prompt,
    height=512,
    width=512,
    guidance_scale=5,
    num_inference_steps=20,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]

image.save("dojo_cat_lora_flux_nf4.png")
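On GPUs around the 8 GB mark it can also help to offload idle sub-models to CPU RAM rather than moving the whole pipeline to the GPU. `enable_model_cpu_offload()` is a standard diffusers pipeline method; the fragment below is optional and assumes the `pipe` built above.

```python
# Optional: instead of pipe.to("cuda"), let diffusers move each component
# (text encoders, transformer, VAE) to the GPU only while it is in use.
# Slower per image, but lowers peak VRAM usage.
pipe.enable_model_cpu_offload()
```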

Trigger words

You should use Dojo Cat in your prompt to trigger the image generation.
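Since the adapter keys on this trigger phrase, prompts are easiest to assemble by prepending it to the scene description. A trivial sketch (the scene text here is just an example, not from the training set):

```python
# Prepend the trigger phrase so the LoRA activates on the subject
trigger = "Dojo Cat"
scene = ("posing in fox ears and an orange dress, "
         "cozy evening light, photorealistic")
prompt = f"{trigger} {scene}"
print(prompt)
```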

Download model

Download the weights from the Files & versions tab.
