---
license: apache-2.0
library_name: diffusers
pipeline_tag: text-to-image
datasets:
- SA1B
base_model: jimmycarter/LibreFLUX
---

# LibreFLUX-ControlNet



This model/pipeline is the product of my [LibreFlux ControlNet training repo](https://github.com/NeuralVFX/LibreFLUX-ControlNet), which uses [LibreFLUX](https://huggingface.co/jimmycarter/LibreFLUX) as the underlying Transformer model for the ControlNet. For the dataset, I auto-labeled 165K images from the SA1B dataset and trained for 1 epoch. I've also tested using this ControlNet model as a base for transfer learning to less generic datasets, and the results are good!

# How does this relate to LibreFLUX?

- Base model is [LibreFLUX](https://huggingface.co/jimmycarter/LibreFLUX)
- Trained in the same non-distilled fashion
- Uses attention masking
- Uses CFG during inference

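
As a rough illustration of the last bullet: classifier-free guidance (CFG) evaluates the model on both the prompt and the negative prompt at each step, then extrapolates from the negative prediction toward the positive one. The sketch below shows only the standard CFG formula. `cfg_combine` is a hypothetical helper name, not a function from this pipeline, and the pipeline's own implementation may differ in detail.

```python
def cfg_combine(negative_pred, positive_pred, guidance_scale):
    """Standard classifier-free guidance blend.

    Works on anything that supports arithmetic (floats, NumPy arrays,
    torch tensors). A scale of 1.0 returns the positive prediction unchanged.
    """
    return negative_pred + guidance_scale * (positive_pred - negative_pred)


# Toy scalar "predictions" standing in for real model outputs.
print(cfg_combine(1.0, 3.0, 1.0))  # 3.0
print(cfg_combine(1.0, 3.0, 4.0))  # 9.0
```

Because two predictions are needed per step, CFG roughly doubles the per-step compute compared with a guidance-distilled model.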

# Fun Facts

- Trained on 165K segmented images from Meta's [SA1B Dataset](https://ai.meta.com/datasets/segment-anything/)
- Trained using this repo: [https://github.com/NeuralVFX/LibreFLUX-ControlNet](https://github.com/NeuralVFX/LibreFLUX-ControlNet)
- Transformer model used: [https://huggingface.co/jimmycarter/LibreFlux-SimpleTuner](https://huggingface.co/jimmycarter/LibreFlux-SimpleTuner)
- Inference code roughly adapted from: [https://github.com/bghira/SimpleTuner](https://github.com/bghira/SimpleTuner)

# Compatibility

```sh
pip install -U diffusers==0.32.0
pip install -U "transformers @ git+https://github.com/huggingface/transformers@e15687fffe5c9d20598a19aeab721ae0a7580f8a"
```

Low VRAM:

```sh
pip install optimum-quanto
```
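
A quick way to confirm the environment matches the pins above is to query the installed package metadata. This is a minimal sketch using only the standard library. `installed_version` and `PINNED` are hypothetical names introduced here, and only the `diffusers` pin is checked because the `transformers` requirement is a git commit rather than a release version.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Release-version pins taken from the install commands above.
PINNED = {"diffusers": "0.32.0"}

for name, wanted in PINNED.items():
    found = installed_version(name)
    status = "ok" if found == wanted else f"expected {wanted}"
    print(f"{name}: {found} ({status})")
```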

# Load Pipeline

```py
import torch
from diffusers import DiffusionPipeline

model_id = "neuralvfx/LibreFlux-ControlNet"
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16 if device == "cuda" else torch.float32

pipe = DiffusionPipeline.from_pretrained(
    model_id,
    custom_pipeline=model_id,
    trust_remote_code=True,
    torch_dtype=dtype,
    safety_checker=None,
).to(device)
```

# Inference

```py
from PIL import Image
from torchvision.transforms import ToTensor

# Assumes `torch` and `pipe` from the Load Pipeline section above

# Load the control image and resize it to the output resolution
cond = Image.open("examples/libre_flux_control_image.png")
cond = cond.resize((1024, 1024))

# Convert the PIL image to a tensor, keep the first three (RGB) channels,
# and move it to the pipeline's device and dtype
cond_tensor = ToTensor()(cond)[:3, :, :].to(pipe.device, dtype=pipe.dtype).unsqueeze(0)

out = pipe(
    prompt="many pieces of drift wood spelling libre flux sitting casting shadow on the lumpy sandy beach with foot prints all over it",
    negative_prompt="blurry",
    control_image=cond_tensor,  # Use the tensor here
    num_inference_steps=75,
    guidance_scale=4.0,
    height=1024,
    width=1024,
    controlnet_conditioning_scale=1.0,
    num_images_per_prompt=1,
    control_mode=None,
    generator=torch.Generator().manual_seed(32),
    return_dict=True,
)

# In a notebook this displays the image; in a script, call .save() on it instead
out.images[0]
```
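
To make the tensor preparation step concrete: `ToTensor()` turns a PIL image into a channels-first float tensor scaled to [0, 1], `[:3, :, :]` keeps only the first three channels (dropping alpha if the PNG has one), and `unsqueeze(0)` adds a batch dimension. The pure-Python sketch below mimics those three steps on a tiny nested-list image so the shapes are easy to follow. `to_chw_rgb` is a hypothetical helper for illustration only and is not part of the pipeline.

```python
def to_chw_rgb(pixels):
    """Convert an H x W grid of (R, G, B[, A]) 0-255 tuples to [3][H][W] floats in [0, 1]."""
    height, width = len(pixels), len(pixels[0])
    return [
        [[pixels[y][x][c] / 255.0 for x in range(width)] for y in range(height)]
        for c in range(3)  # keep R, G, B; ignore alpha if present
    ]


# A 1 x 2 RGBA image: one red pixel, one half-transparent green pixel.
rgba = [[(255, 0, 0, 255), (0, 255, 0, 128)]]

chw = to_chw_rgb(rgba)  # shape: 3 x 1 x 2 (channels, height, width)
batch = [chw]           # shape: 1 x 3 x 1 x 2, the unsqueeze(0) step

print(len(batch), len(batch[0]), len(batch[0][0]), len(batch[0][0][0]))
```

For the 1024 x 1024 control image in the example, the same steps give a tensor of shape 1 x 3 x 1024 x 1024.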

# Load Pipeline (Low VRAM)

```py
import torch
from diffusers import DiffusionPipeline
from optimum.quanto import freeze, quantize, qint8

model_id = "neuralvfx/LibreFlux-ControlNet"
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16 if device == "cuda" else torch.float32

pipe = DiffusionPipeline.from_pretrained(
    model_id,
    custom_pipeline=model_id,
    trust_remote_code=True,
    torch_dtype=dtype,
    safety_checker=None,
)

# Quantize the transformer weights to int8, skipping the listed
# norm, embedding, and output projection modules
quantize(
    pipe.transformer,
    weights=qint8,
    exclude=[
        "*.norm", "*.norm1", "*.norm2", "*.norm2_context",
        "proj_out", "x_embedder", "norm_out", "context_embedder",
    ],
)

# Same treatment for the ControlNet
quantize(
    pipe.controlnet,
    weights=qint8,
    exclude=[
        "*.norm", "*.norm1", "*.norm2", "*.norm2_context",
        "proj_out", "x_embedder", "norm_out", "context_embedder",
    ],
)

# Freeze to replace the float weights with their quantized versions
freeze(pipe.transformer)
freeze(pipe.controlnet)

# Keep sub-models on the CPU and move each to the GPU only while it runs
pipe.enable_model_cpu_offload()
```
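
The `exclude` entries are glob-style patterns. Assuming optimum-quanto matches them against module names with `fnmatch` (worth confirming against the installed version), the sketch below shows which names the patterns would skip. The module names are hypothetical examples, not read from the actual model, and `is_excluded` is a helper introduced only for this illustration.

```python
from fnmatch import fnmatch

EXCLUDE = [
    "*.norm", "*.norm1", "*.norm2", "*.norm2_context",
    "proj_out", "x_embedder", "norm_out", "context_embedder",
]


def is_excluded(module_name, patterns=EXCLUDE):
    """Return True if any glob pattern matches the full module name."""
    return any(fnmatch(module_name, pattern) for pattern in patterns)


# Hypothetical module names, for illustration only.
for name in [
    "transformer_blocks.0.norm1",
    "transformer_blocks.0.attn.to_q",
    "proj_out",
    "x_embedder",
]:
    label = "skipped" if is_excluded(name) else "eligible for quantization"
    print(f"{name}: {label}")
```

Note that a pattern such as `*.norm1` matches only names that end in `.norm1`, so a nested name like `transformer_blocks.0.norm1.linear` would not be excluded by it.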