ControlNet LoRA SDXL - Brightness Control (100k @ 1024×1024)
A Control LoRA model trained on Stable Diffusion XL that conditions image generation on brightness/grayscale information. It combines LoRA (Low-Rank Adaptation) with a ControlNet-style conditioning architecture, providing an ultra-lightweight alternative to full ControlNet with excellent pattern preservation.
Model Description
This Control LoRA enables brightness-based conditioning for SDXL image generation. By providing a grayscale image as input, you can control the brightness distribution and lighting structure while maintaining creative freedom through text prompts.
Key Features:
- 🎨 Excellent brightness and pattern control across multiple scales (0.5-2.0)
- 🚀 196x smaller than full ControlNet: ~24MB vs ~4.7GB
- ⚡ Ultra-fast loading: LoRA weights load in <1 second
- 💡 Flexible scale control: Adjustable conditioning scale from 0.5 to 2.0+
- 🔄 Compatible with ControlLoRA v3: Uses the efficient ControlLoRA v3 architecture
- 📦 Minimal storage: All checkpoints + final model = ~490MB total
- 🖼️ Native SDXL resolution: Trained at 1024×1024
- 🎯 Production-scale training: 100,000 samples with PiSSA initialization
Intended Uses:
- Artistic QR code generation (scale 1.0-1.5 recommended; outputs are decorative and not yet reliably scannable, see the status section below)
- Image recoloring and colorization
- Lighting control in text-to-image generation
- Brightness-based pattern integration
- Watermark and subtle pattern embedding
- Photo enhancement and stylization
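Any RGB image can serve as a conditioning input once it is reduced to a brightness map. A minimal preprocessing sketch using only Pillow (the helper name is illustrative, not part of the ControlLoRA v3 API):

```python
from PIL import Image

def prepare_brightness_control(image: Image.Image, size: int = 1024) -> Image.Image:
    """Reduce an image to its luminance channel at the model's native resolution."""
    # Convert to single-channel luminance, then back to RGB so the
    # pipeline receives the 3-channel input it expects.
    return image.convert("L").convert("RGB").resize((size, size), Image.LANCZOS)
```

The round-trip through mode "L" discards hue and saturation, leaving only the brightness structure the LoRA was trained to follow.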
Training Details
Training Data
Trained on 100,000 samples from latentcat/grayscale_image_aesthetic_3M:
- High-quality aesthetic images
- Paired with grayscale/brightness versions
- Native resolution: 1024×1024 (SDXL native)
Training Configuration
| Parameter | Value |
|---|---|
| Base Model | stabilityai/stable-diffusion-xl-base-1.0 |
| VAE | madebyollin/sdxl-vae-fp16-fix (improved stability) |
| Architecture | ControlLoRA v3 (~7M trainable parameters) |
| LoRA Rank | 16 |
| Extra Conv Rank | 64 (conv_in layer) |
| Training Resolution | 1024×1024 |
| Training Steps | 3,125 (1 epoch) |
| Batch Size | 8 per device |
| Gradient Accumulation | 4 (effective batch: 32) |
| Learning Rate | 1e-4 constant (no decay) |
| LR Warmup | 0 steps |
| Empty Prompts | 20% (up from 10% in the 10k run) |
| Init Method | PiSSA (niter=4) for faster convergence |
| Mixed Precision | BF16 |
| Hardware | NVIDIA H100 80GB |
| Training Time | ~4.5 hours |
| Final Loss | ~0.05 |
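The step count in the table follows directly from the data and batch settings; as a quick arithmetic check (all values taken from the table above):

```python
samples = 100_000
per_device_batch = 8
grad_accum = 4

effective_batch = per_device_batch * grad_accum   # 8 * 4 = 32
steps_per_epoch = samples // effective_batch      # 100000 / 32 = 3125
checkpoint_interval = steps_per_epoch // 4        # 781, one checkpoint per quarter epoch

print(effective_batch, steps_per_epoch, checkpoint_interval)  # 32 3125 781
```

This also explains the checkpoint names: checkpoint-781, -1562, -2343, and -3124 mark each quarter of the single training epoch.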
Model Size Comparison
| Model | Parameters | Size | Training | Resolution |
|---|---|---|---|---|
| This Control LoRA | ~7M | ~24MB | 100k @ 1024 | 1024×1024 |
| ControlNet (SDXL) | ~700M | 4.7GB | 100k @ 512 | 512×512 |
| T2I Adapter (SDXL) | ~77M | 302MB | 100k @ 1024 | 1024×1024 |
| Flux Control LoRA | ~7M | 25MB | 10k @ 512 | 512×512 |
Usage
Installation
pip install diffusers transformers accelerate torch peft
# Install ControlLoRA v3
git clone https://github.com/HighCWu/control-lora-v3
Basic Usage
import sys
sys.path.insert(0, '/path/to/control-lora-v3')

import torch
from PIL import Image
from diffusers import AutoencoderKL
from pipeline_sdxl import StableDiffusionXLControlLoraV3Pipeline
from model import UNet2DConditionModelEx

# Load improved VAE (same as used in training); keep dtype consistent with the pipeline
vae = AutoencoderKL.from_pretrained(
    "madebyollin/sdxl-vae-fp16-fix",
    torch_dtype=torch.bfloat16,
)

# Load UNet with LoRA support
unet = UNet2DConditionModelEx.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    subfolder="unet",
    torch_dtype=torch.bfloat16,
)
unet = unet.add_extra_conditions(["brightness"])

# Load SDXL Control LoRA pipeline with improved VAE
pipe = StableDiffusionXLControlLoraV3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    vae=vae,
    unet=unet,
    torch_dtype=torch.bfloat16,
)

# Load Control LoRA weights
pipe.load_lora_weights("Oysiyl/controlnet-lora-brightness-sdxl", adapter_name="brightness")
pipe.to("cuda")

# Load grayscale/brightness control image
control_image = Image.open("path/to/grayscale_image.png")
control_image = control_image.resize((1024, 1024))

# Generate image
prompt = "a beautiful garden scene with colorful flowers and butterflies, highly detailed, professional photography, vibrant colors"
image = pipe(
    prompt=prompt,
    image=control_image,
    num_inference_steps=30,
    guidance_scale=7.5,
    extra_condition_scale=1.0,  # Controls conditioning strength
    height=1024,
    width=1024,
).images[0]
image.save("output.png")
Adjusting Control Strength
The extra_condition_scale parameter controls how strongly the brightness map influences generation:
# Subtle control (scale 0.5-0.7)
image = pipe(
    prompt=prompt,
    image=control_image,
    extra_condition_scale=0.5,
    # ... remaining arguments as in the basic example
).images[0]

# Balanced control (scale 1.0-1.5) - Recommended for artistic QR codes
image = pipe(
    prompt=prompt,
    image=control_image,
    extra_condition_scale=1.0,
    # ... remaining arguments as in the basic example
).images[0]

# Strong control (scale 1.5-2.0)
image = pipe(
    prompt=prompt,
    image=control_image,
    extra_condition_scale=1.5,
    # ... remaining arguments as in the basic example
).images[0]
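To compare several scales side by side, the outputs of a sweep can be tiled into one sheet. A small grid helper using only Pillow (this helper is illustrative, not part of the ControlLoRA v3 API):

```python
from PIL import Image

def make_grid(images, cols):
    """Tile equally sized PIL images row-major into one comparison sheet."""
    w, h = images[0].size
    rows = (len(images) + cols - 1) // cols
    grid = Image.new("RGB", (cols * w, rows * h), "white")
    for i, img in enumerate(images):
        grid.paste(img, ((i % cols) * w, (i // cols) * h))
    return grid

# Sweep example (pipe as set up in Basic Usage):
# outputs = [pipe(prompt=prompt, image=control_image, extra_condition_scale=s,
#                 num_inference_steps=30, height=1024, width=1024).images[0]
#            for s in (0.5, 1.0, 1.5)]
# make_grid(outputs, cols=3).save("scale_sweep.png")
```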
Artistic QR Code Generation
import qrcode
from PIL import Image

# Generate QR code
qr = qrcode.QRCode(
    version=1,
    error_correction=qrcode.constants.ERROR_CORRECT_H,
    box_size=10,
    border=4,
)
qr.add_data("https://your-url.com")
qr.make(fit=True)
qr_image = qr.make_image(fill_color="black", back_color="white")
qr_image = qr_image.resize((1024, 1024), Image.LANCZOS).convert("RGB")

# Generate artistic QR code (scale 1.0-1.5 works best)
image = pipe(
    prompt="a beautiful garden with colorful flowers and butterflies, highly detailed, professional photography",
    image=qr_image,
    num_inference_steps=30,
    guidance_scale=7.5,
    extra_condition_scale=1.0,
    height=1024,
    width=1024,
).images[0]
image.save("artistic_qr.png")
Using Different Checkpoints
The model includes intermediate checkpoints from throughout training:
# Early checkpoint (25% - 25,000 samples)
pipe.load_lora_weights(
    "Oysiyl/controlnet-lora-brightness-sdxl",
    adapter_name="brightness",
    subfolder="checkpoint-781",
)

# Mid checkpoint (50% - 50,000 samples)
pipe.load_lora_weights(
    "Oysiyl/controlnet-lora-brightness-sdxl",
    adapter_name="brightness",
    subfolder="checkpoint-1562",
)

# Late checkpoint (75% - 75,000 samples)
pipe.load_lora_weights(
    "Oysiyl/controlnet-lora-brightness-sdxl",
    adapter_name="brightness",
    subfolder="checkpoint-2343",
)

# Near-final checkpoint (99% - 99,968 samples)
pipe.load_lora_weights(
    "Oysiyl/controlnet-lora-brightness-sdxl",
    adapter_name="brightness",
    subfolder="checkpoint-3124",
)

# Final model (100,000 samples, main branch - recommended)
pipe.load_lora_weights("Oysiyl/controlnet-lora-brightness-sdxl", adapter_name="brightness")
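When switching between checkpoints in one session, the previously loaded adapter should be removed first. A hedged helper sketch (the helper and the checkpoint map are illustrative; diffusers' unload_lora_weights is assumed to be inherited by the custom pipeline):

```python
# Checkpoint subfolders as listed above (the final weights live at the repo root)
CHECKPOINTS = {
    25: "checkpoint-781",
    50: "checkpoint-1562",
    75: "checkpoint-2343",
    99: "checkpoint-3124",
    100: None,
}

def load_checkpoint(pipe, percent):
    """Swap the brightness adapter to the checkpoint at `percent` trained."""
    pipe.unload_lora_weights()  # drop whatever adapter is currently loaded
    kwargs = {"adapter_name": "brightness"}
    subfolder = CHECKPOINTS[percent]
    if subfolder is not None:
        kwargs["subfolder"] = subfolder
    pipe.load_lora_weights("Oysiyl/controlnet-lora-brightness-sdxl", **kwargs)
```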
Conditioning Scale Guide
A summary of how extra_condition_scale behaves across its useful range:
Recommended Scale Ranges
| Scale | Behavior | Best For |
|---|---|---|
| 0.5-0.7 | Subtle artistic integration with hints of pattern | Natural images, soft lighting hints |
| 0.7-1.0 | Light control - visible structure with artistic freedom | Artistic images, creative reinterpretation |
| 1.0-1.5 | 🔥 Balanced control | Artistic QR codes, watermarks (recommended) |
| 1.5-2.0 | Strong control - clear patterns with artistic overlay | Geometric patterns, structured designs |
| 2.0+ | Maximum control - dominant patterns | Strong brightness maps, technical applications |
Performance Comparison
vs Full ControlNet (SDXL)
| Metric | ControlNet (SDXL) | This Control LoRA | Advantage |
|---|---|---|---|
| Parameters | ~700M | ~7M | 100x smaller |
| Model Size | 4.7GB | 24MB | 196x smaller |
| Load Time | ~5-10 seconds | <1 second | 10x faster loading |
| Storage (w/ checkpoints) | ~18.8GB | ~490MB | 38x less storage |
| Training Time | ~3 hours | 4.5 hours | Comparable |
| Pattern Preservation @ Scale 1.0 | Excellent | Excellent | Comparable quality |
| Flexibility | Fixed architecture | Adjustable weights | More versatile |
vs T2I Adapter (SDXL)
| Metric | T2I Adapter (SDXL) | This Control LoRA | Advantage |
|---|---|---|---|
| Parameters | ~77M | ~7M | 11x smaller |
| Model Size | 302MB | 24MB | 12.6x smaller |
| Training Samples | 100k | 100k | Matched data |
| Architecture | Separate adapter | Integrated LoRA | Simpler loading |
Checkpoint Progression Analysis
The model includes checkpoints from throughout training:
- checkpoint-781: 25% complete (25,000 samples)
- checkpoint-1562: 50% complete (50,000 samples)
- checkpoint-2343: 75% complete (75,000 samples)
- checkpoint-3124: 99% complete (99,968 samples)
- Final model: 100% complete (100,000 samples - main branch)
Visual Comparison
Each comparison shows QR input + all 7 conditioning scales (0.25, 0.5, 0.7, 0.75, 1.0, 1.25, 1.5) for a specific checkpoint:
Checkpoint 781 (25% trained, 25,000 samples)
Checkpoint 1562 (50% trained, 50,000 samples)
Checkpoint 2343 (75% trained, 75,000 samples)
Checkpoint 3124 (99% trained, 99,968 samples)
Final Model (100% trained, 100,000 samples) - Recommended
Key Observations
All checkpoints show consistent, high-quality performance across scales. The progression analysis reveals:
Early Checkpoint (781 steps, 25k samples):
- Strong pattern awareness from PiSSA initialization
- Good balance between control and creativity
- Recommended scales: 0.7-1.2
Mid Checkpoints (1562-2343 steps, 50k-75k samples):
- Excellent balance between control and creativity
- Stable pattern preservation across all scales
- Recommended scales: 0.8-1.5
Final Model (3125 steps, 100k samples):
- Maximum control capability with best generalization
- Excellent pattern preservation at all scales
- Recommended scales: 0.7-2.0
- Recommended for production use
100k Training - What's Different?
Training on 100k samples (vs 10k) provides several key improvements:
Enhanced Capabilities:
- 🎯 10x More Data: Robust pattern learning across diverse conditions
- 🎨 Better Generalization: Handles wider variety of brightness patterns
- 💪 Improved Stability: More consistent results at extreme scales
- 🚀 Smoother Control: Finer-grained control across the full scale range
- ⚡ Advanced Init: PiSSA initialization for faster convergence
Training Improvements:
- Empty Prompts: 20% (vs 10% in 10k) for better unconditional generation
- VAE: madebyollin/sdxl-vae-fp16-fix for numerical stability
- Scheduler: Constant LR with no warmup for consistent learning
Quality:
- No overfitting observed (the adapter's low-rank capacity limits overfitting risk)
- All checkpoints show excellent quality
- Recommended: Final model (100k) for production use
⚠️ Current Status: Artistic QR Code Generation
Best Visual Results: The final 100k checkpoint produces excellent artistic images with beautiful integration of patterns and prompts.
Scanability Issue (Work in Progress): Currently, QR codes generated with this model are not scannable. The model prioritizes artistic quality and prompt following over QR code structure preservation.
Example Output (Final Checkpoint, Scale 0.45):
This beautiful garden scene with flowers and butterflies demonstrates the model's excellent artistic capabilities and prompt following at conditioning scale 0.45, but the QR pattern is not preserved enough for scanning.
What's Working:
- ✅ Excellent artistic quality
- ✅ Beautiful prompt following (garden, flowers, butterflies)
- ✅ Natural integration of brightness patterns
- ✅ Stable training (no overfitting)
What Needs Improvement:
- ❌ QR codes are not scannable
- 🔧 Need to increase conditioning scale or adjust training approach
- 🔧 Possible solutions: higher scales (1.5-2.0), multi-pass refinement, or specialized training
Recommended for: Artistic image generation with brightness control, pattern-guided art. Not recommended yet for functional QR code generation.
Next Steps:
- Experiment with higher conditioning scales (1.5-2.0)
- Test multi-pass refinement approach
- Consider training with stronger structural loss
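As an illustrative stopgap (not from the original training work), structure can be partially recovered by alpha-blending the source QR back over the generated image, at the cost of some artistic quality:

```python
from PIL import Image

def reinforce_pattern(generated: Image.Image, qr: Image.Image, alpha: float = 0.2) -> Image.Image:
    """Blend the original QR back in: out = (1 - alpha) * generated + alpha * qr."""
    qr = qr.convert("RGB").resize(generated.size)
    return Image.blend(generated.convert("RGB"), qr, alpha)
```

Small alpha values (roughly 0.1-0.3) nudge the dark/light modules back toward scannable contrast; whether the result actually decodes still needs to be checked with a QR reader.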
When to Use This Model
✅ Use This Control LoRA When:
- Creating artistic QR codes with SDXL quality (scale 1.0-1.5)
- Need minimal storage overhead (~25MB per checkpoint)
- Want fast model loading (<1 second)
- Building production applications requiring small model sizes
- Working with SDXL as base model
- Require flexible control strength via extra_condition_scale
- Need multiple checkpoints without massive storage (490MB total vs 18.8GB)
- Working with production-scale datasets (100k samples)
⚠️ Consider Alternatives When:
- Need full ControlNet features with extremely precise control
- Working with existing T2I Adapter pipelines
- Require different control types (pose, depth, etc.) - train separate LoRAs
Limitations
Current Limitations
- ControlLoRA v3 dependency: Requires custom pipeline code (not in main diffusers yet)
- Grayscale conditioning only: Trained specifically for brightness/grayscale control
- Single control type: Only brightness, not other conditioning types
- Custom code required: Need to include ControlLoRA v3 files
Recommendations
- For SDXL generation, use this Control LoRA
- For multiple control types, train separate LoRAs and combine
- Experiment with scales 1.0-1.5 for most use cases
- Use final model for best results
Training Script
accelerate launch --mixed_precision="bf16" train_sdxl.py \
  --pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-1.0" \
  --pretrained_vae_model_name_or_path="madebyollin/sdxl-vae-fp16-fix" \
  --dataset_name="<path_to_100k_dataset>" \
  --conditioning_image_column="conditioning_image" \
  --image_column="image" \
  --caption_column="text" \
  --output_dir="./controlnet-lora-brightness-sdxl-100k" \
  --mixed_precision="bf16" \
  --resolution=1024 \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --proportion_empty_prompts=0.2 \
  --rank=16 \
  --lora_adapter_name="brightness" \
  --extra_lora_rank_modules conv_in \
  --extra_lora_ranks 64 \
  --half_or_full_lora=half_skip_attn \
  --train_batch_size=8 \
  --num_train_epochs=1 \
  --gradient_accumulation_steps=4 \
  --gradient_checkpointing \
  --checkpointing_steps=781 \
  --validation_steps=781 \
  --validation_image="validation_qr.png" \
  --validation_prompt="a beautiful garden scene with colorful flowers and butterflies, highly detailed, professional photography, vibrant colors" \
  --num_validation_images=4 \
  --seed=42 \
  --dataloader_num_workers=4 \
  --tracker_project_name="controlnet-lora-brightness-sdxl-100k" \
  --report_to="wandb" \
  --enable_xformers_memory_efficient_attention \
  --use_8bit_adam \
  --init_lora_weights="pissa_niter_4"
Available Checkpoints
All checkpoints are available in the main branch:
- Root directory: Final model (100,000 samples, recommended)
- checkpoint-781/: Early checkpoint (25,000 samples, 25% trained)
- checkpoint-1562/: Mid checkpoint (50,000 samples, 50% trained)
- checkpoint-2343/: Late checkpoint (75,000 samples, 75% trained)
- checkpoint-3124/: Near-final checkpoint (99,968 samples, 99% trained)
Citation
@misc{controlnet-lora-brightness-sdxl,
  author       = {Oysiyl},
  title        = {ControlNet LoRA SDXL - Brightness Control (100k @ 1024×1024)},
  year         = {2026},
  publisher    = {HuggingFace},
  journal      = {HuggingFace Model Hub},
  howpublished = {\url{https://huggingface.co/Oysiyl/controlnet-lora-brightness-sdxl}}
}
Acknowledgments
- Built with 🤗 Diffusers
- Base model: Stable Diffusion XL by Stability AI
- ControlLoRA v3: control-lora-v3 by HighCWu
- Dataset: grayscale_image_aesthetic_3M by latentcat
- Training infrastructure: NVIDIA H100 80GB
- LoRA implementation: PEFT by Hugging Face
- VAE: madebyollin/sdxl-vae-fp16-fix by madebyollin
License
Apache 2.0 License. The base SDXL model has separate license terms at stabilityai/stable-diffusion-xl-base-1.0.