frameworks:
  - Pytorch
license: Apache License 2.0
tasks:
  - text-to-image-synthesis

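LoRA Encoder (FLUX.1-Dev)

This model encodes a LoRA trained for the FLUX model into embedding vectors, which activate the LoRA's capabilities.

Taking the LoRA model VoidOc/F.1_动物森友会LoRA (an Animal Crossing LoRA) as an example, the LoRA encoder can be used in the following ways.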

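Usage 1: Inferring what a LoRA does

Given a LoRA model and no additional information about it, an empty prompt is enough to activate the LoRA's capabilities directly, which makes it possible to infer what the LoRA is for.

Prompt: ""

Without LoRA encoder	With LoRA encoder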

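Usage 2: Activating a LoRA without trigger words

The LoRA's capabilities are activated automatically, with no trigger words in the prompt.

Prompt: "a car"

Without LoRA encoder	With LoRA encoder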

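Usage 3: Controlling LoRA strength

We provide an additional parameter, scale (passed as `lora_encoder_scale` in the inference code), which controls how strongly the LoRA influences the generated image.

In the example below, the prompt is "a cat". With scale=1, the LoRA strength is at its maximum, and the image contains an Animal Crossing character together with a cat. With scale=0.5, the LoRA strength is reduced, and the image contains a cat character in the Animal Crossing style. The best value of scale depends on the LoRA itself; we recommend larger values for character LoRAs and smaller values for style LoRAs.

Prompt: "a cat"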

`scale=1`	`scale=0.5`
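
Because the best scale depends on the LoRA, it can help to try several values with a fixed prompt and seed. The sketch below only builds the keyword arguments for each run and does not call the model. `plan_scale_sweep` and the output file naming are hypothetical and not part of DiffSynth-Studio; `lora_encoder_inputs` and `lora_encoder_scale` are the argument names used in the inference code of this README.

```python
def plan_scale_sweep(prompt, lora, scales, seed=0):
    """Build (filename, kwargs) pairs, one per scale value.

    Hypothetical helper: it does not call the pipeline, it only prepares
    the keyword arguments that the inference code passes to `pipe(...)`.
    """
    jobs = []
    for scale in scales:
        scale = float(scale)
        kwargs = {
            "prompt": prompt,
            "seed": seed,  # fixed seed so only the scale changes between runs
            "lora_encoder_inputs": lora,
            "lora_encoder_scale": scale,
        }
        jobs.append((f"scale_{scale:.2f}.jpg", kwargs))
    return jobs


# A placeholder string stands in for the LoRA config so the sketch runs on its own.
jobs = plan_scale_sweep("a cat", "lora-placeholder", [1.0, 0.75, 0.5])
for filename, kwargs in jobs:
    print(filename, kwargs["lora_encoder_scale"])
```

With the real pipeline and LoRA config from the inference code, each job could be run as `pipe(**kwargs).save(filename)`.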

Inference code

git clone https://github.com/modelscope/DiffSynth-Studio.git  
cd DiffSynth-Studio
pip install -e .

import torch
from diffsynth.pipelines.flux_image_new import FluxImagePipeline, ModelConfig


pipe = FluxImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device="cuda",
    model_configs=[
        ModelConfig(model_id="black-forest-labs/FLUX.1-dev", origin_file_pattern="flux1-dev.safetensors"),
        ModelConfig(model_id="black-forest-labs/FLUX.1-dev", origin_file_pattern="text_encoder/model.safetensors"),
        ModelConfig(model_id="black-forest-labs/FLUX.1-dev", origin_file_pattern="text_encoder_2/"),
        ModelConfig(model_id="black-forest-labs/FLUX.1-dev", origin_file_pattern="ae.safetensors"),
        ModelConfig(model_id="DiffSynth-Studio/LoRA-Encoder-FLUX.1-Dev", origin_file_pattern="model.safetensors"),
    ],
)
pipe.enable_lora_magic()

lora = ModelConfig(model_id="VoidOc/flux_animal_forest1", origin_file_pattern="20.safetensors")
pipe.load_lora(pipe.dit, lora, hotload=True) # Use `pipe.clear_lora()` to drop the loaded LoRA.

# An empty prompt is enough to activate the LoRA's capabilities.
image = pipe(prompt="", seed=0, lora_encoder_inputs=lora)
image.save("image_1.jpg")

# Baseline for comparison: same prompt and seed, without the LoRA encoder.
image = pipe(prompt="", seed=0)
image.save("image_1_origin.jpg")

# A prompt without trigger words can also activate the LoRA's capabilities.
image = pipe(prompt="a car", seed=0, lora_encoder_inputs=lora)
image.save("image_2.jpg")

# Baseline for comparison: same prompt and seed, without the LoRA encoder.
image = pipe(prompt="a car", seed=0)
image.save("image_2_origin.jpg")

# Adjust the activation intensity through the scale parameter.
image = pipe(prompt="a cat", seed=0, lora_encoder_inputs=lora, lora_encoder_scale=1.0)
image.save("image_3.jpg")

image = pipe(prompt="a cat", seed=0, lora_encoder_inputs=lora, lora_encoder_scale=0.5)
image.save("image_3_scale.jpg")
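
The paired calls above (same prompt and seed, with and without `lora_encoder_inputs`) can be wrapped in a small helper. This is a minimal sketch, not part of DiffSynth-Studio: `generate_pair` and `RecordingPipe` are hypothetical names, and the helper assumes only that `pipe` is a callable accepting the keyword arguments used in the code above. A stub stands in for the real pipeline so the sketch runs without a GPU.

```python
def generate_pair(pipe, prompt, lora, seed=0, scale=None):
    """Run the same prompt and seed twice: without and with the LoRA encoder.

    Hypothetical helper. `pipe` is any callable that accepts the keyword
    arguments used in the inference code above.
    """
    # Baseline run: the LoRA encoder is not used.
    baseline = pipe(prompt=prompt, seed=seed)

    # Encoder run: pass the LoRA config, and the scale only if one was given.
    kwargs = {"prompt": prompt, "seed": seed, "lora_encoder_inputs": lora}
    if scale is not None:
        kwargs["lora_encoder_scale"] = scale
    encoded = pipe(**kwargs)
    return baseline, encoded


class RecordingPipe:
    """Stub standing in for the real pipeline; it only records its calls."""

    def __init__(self):
        self.calls = []

    def __call__(self, **kwargs):
        self.calls.append(kwargs)
        return f"image_{len(self.calls)}"


stub = RecordingPipe()
baseline, encoded = generate_pair(stub, "a car", lora="lora-placeholder", scale=0.5)
print(stub.calls)
```

With the real pipeline, `generate_pair(pipe, "a car", lora)` would return two images that can be saved with `.save(...)` as in the code above.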