
For more information (including how to compress models yourself), check out https://huggingface.co/DFloat11 and https://github.com/LeanModels/DFloat11

Feel free to request compression of other models as well (for the diffusers library, ComfyUI, or any other format), although models with architectures that are unfamiliar to me may be more difficult.

How to Use

diffusers

pip install -U diffusers transformers accelerate dfloat11

import torch
from diffusers import Lumina2Transformer2DModel, Lumina2Pipeline
from dfloat11 import DFloat11Model
# from transformers.modeling_utils import no_init_weights # for transformers<5.0.0
from transformers.initialization import no_init_weights # for transformers>=5.0.0

with no_init_weights():
    transformer = Lumina2Transformer2DModel.from_config(
        Lumina2Transformer2DModel.load_config(
            "neta-art/Neta-Lumina-diffusers", subfolder="transformer"
        ),
        torch_dtype=torch.bfloat16
    ).to(torch.bfloat16)
DFloat11Model.from_pretrained(
    "mingyi456/Neta-Lumina-DF11",
    device="cpu",
    bfloat16_model=transformer,
)
pipe = Lumina2Pipeline.from_pretrained(
    "neta-art/Neta-Lumina-diffusers",
    transformer=transformer, 
    torch_dtype=torch.bfloat16
)

pipe.enable_model_cpu_offload() 
prompt = "You are an assistant designed to generate anime images based on textual prompts. <Prompt Start> 1girl, beautiful, detailed, high quality"
negative_prompt = "You are an assistant designed to generate images based on textual prompts. <Prompt Start> A low quality, low resolution, ugly, and disgusting image with severe digital artifacts, blur, and noise. The subject is deformed, disfigured, and malformed, with bad anatomy, mutated limbs, and bad proportions. Features distorted, twisted, and unnatural hands and face, with extra or missing fingers. The style is amateurish, poorly drawn, childish, like a flat, unfinished sketch or a cheap, bad CGI render. The composition is bad, cropped, out of frame, and cluttered with text, watermarks, or signatures,"

image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=5,  # Increase guidance scale for better prompt adherence
    num_inference_steps=50,  # Increase inference steps for better quality
    negative_prompt=negative_prompt,
    generator=torch.Generator("cpu").manual_seed(114514)
).images[0]
image.save('image_neta-lumina.png')
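As the prompts above show, Neta Lumina expects a system instruction joined to the actual tags by the literal `<Prompt Start>` marker. A small helper for assembling such prompts (hypothetical, not part of any library; the default system string is taken from the positive prompt above):

```python
# Hypothetical helper: the model card's examples prepend a system instruction,
# then the literal "<Prompt Start>" marker, then the tags.
SYSTEM_PROMPT = (
    "You are an assistant designed to generate anime images based on textual prompts."
)

def build_prompt(tags: str, system: str = SYSTEM_PROMPT) -> str:
    """Join the system instruction and user tags with the <Prompt Start> marker."""
    return f"{system} <Prompt Start> {tags}"

prompt = build_prompt("1girl, beautiful, detailed, high quality")
print(prompt)
```

Passing a different `system` string (e.g. the one used in the negative prompt) produces the negative-prompt format the same way.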

ComfyUI

Refer to this model instead.

Compression details

This is the pattern_dict for compression:

pattern_dict = {
    r"noise_refiner\.\d+": (
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
        "attn.to_out.0",
        "feed_forward.linear_1",
        "feed_forward.linear_2",
        "feed_forward.linear_3",
        "norm1.linear"
    ),
    r"context_refiner\.\d+": (
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
        "attn.to_out.0",
        "feed_forward.linear_1",
        "feed_forward.linear_2",
        "feed_forward.linear_3",
    ),
    r"layers\.\d+": (
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
        "attn.to_out.0",
        "feed_forward.linear_1",
        "feed_forward.linear_2",
        "feed_forward.linear_3",
        "norm1.linear"
    ),
    r"time_caption_embed\.caption_embedder": (
        "1",
    )
}
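The keys in `pattern_dict` are regular expressions matched against module names in the transformer; for example, `r"layers\.\d+"` selects every numbered entry of the `layers` module list. A quick sketch of how such patterns select modules, using plain `re` (independent of the dfloat11 library; the module names are illustrative):

```python
import re

# The same regex used as a pattern_dict key above.
pattern = re.compile(r"layers\.\d+")

# Illustrative module names, mirroring the prefixes in pattern_dict.
module_names = [
    "noise_refiner.0",
    "context_refiner.1",
    "layers.0",
    "layers.15",
    "time_caption_embed.caption_embedder",
]

# re.match anchors at the start of the string, so only "layers.<index>"
# entries are selected.
matched = [name for name in module_names if pattern.match(name)]
print(matched)  # ['layers.0', 'layers.15']
```

The linear submodules listed under each key (attention projections, feed-forward layers, norm modulation) are then the weights that get losslessly compressed within each matched block.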