Milady Avatar Adapter (SDXL)

A neural adapter that maps Qwen3-4B hidden-state activations into the SDXL prompt-embedding space, enabling real-time generation of emotional avatars in the Milady art style.

Architecture

SDXL Adapter (NEW - Higher Quality)

Input: Qwen3-4B hidden states from layers [9, 18, 27], concatenated (3 × 2560) → 7680 dims
Layer Weighting: Learned weighted combination → 2560 dims
Cross-Attention Decoder: 3-layer transformer decoder with 8 heads
Output: SDXL prompt embeddings [77, 2048] + pooled embeddings [1280]
Parameters: 5.28M
Training: 500 epochs on 200 emotion-labeled samples with MSE loss
Best validation loss: 6.762
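
The shapes listed above can be traced through a minimal PyTorch sketch. This is not the code in sdxl/sdxl_adapter.py: the class name, the internal width d_model=256, the softmax layer weighting, the learned query tokens, and the mean-pooled head are all assumptions chosen for illustration, so the parameter count will not match the reported 5.28M.

```python
import torch
import torch.nn as nn


class SDXLAdapterSketch(nn.Module):
    """Illustrative only: 3 concatenated Qwen3-4B layers -> SDXL prompt + pooled embeddings."""

    def __init__(self, hidden=2560, n_layers=3, d_model=256, n_heads=8,
                 n_decoder_layers=3, n_tokens=77, prompt_dim=2048, pooled_dim=1280):
        super().__init__()
        self.hidden, self.n_layers = hidden, n_layers
        # One learned scalar per tapped layer (assumed form of "layer weighting").
        self.layer_logits = nn.Parameter(torch.zeros(n_layers))
        self.in_proj = nn.Linear(hidden, d_model)
        # 77 learned queries, one per SDXL prompt token position (assumption).
        self.queries = nn.Parameter(torch.randn(n_tokens, d_model) * 0.02)
        layer = nn.TransformerDecoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.decoder = nn.TransformerDecoder(layer, num_layers=n_decoder_layers)
        self.prompt_head = nn.Linear(d_model, prompt_dim)
        self.pooled_head = nn.Linear(d_model, pooled_dim)

    def forward(self, feats):
        # feats: [batch, seq, 7680], i.e. three 2560-dim layer outputs concatenated.
        b, s, _ = feats.shape
        per_layer = feats.reshape(b, s, self.n_layers, self.hidden)
        w = torch.softmax(self.layer_logits, dim=0)
        mixed = (per_layer * w.view(1, 1, -1, 1)).sum(dim=2)  # [b, s, 2560]
        memory = self.in_proj(mixed)                           # [b, s, d_model]
        tgt = self.queries.unsqueeze(0).expand(b, -1, -1)      # [b, 77, d_model]
        decoded = self.decoder(tgt, memory)                    # queries cross-attend to memory
        prompt_embeds = self.prompt_head(decoded)              # [b, 77, 2048]
        pooled = self.pooled_head(decoded.mean(dim=1))         # [b, 1280]
        return prompt_embeds, pooled


adapter = SDXLAdapterSketch()
prompt_embeds, pooled = adapter(torch.randn(1, 12, 7680))
```

Training with MSE loss would then compare both outputs against target embeddings, for example those produced by the SDXL text encoders for an emotion-describing prompt; that choice of target is also an assumption.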

Pipeline

Emotional Text → Qwen3-4B (hooks on layers 9,18,27) → Adapter → SDXL + Milady LoRA → Avatar Image
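
The hook step in this pipeline could look roughly like the sketch below, which uses PyTorch forward hooks to collect the outputs of selected layers and concatenate them. The helper name capture_layer_outputs is hypothetical, and a small stand-in stack of linear layers replaces Qwen3-4B so the example runs without downloading the model. With the real model, the layer list would presumably be model.model.layers from transformers, and whether 9, 18, 27 are zero-based module indices is an assumption.

```python
import torch
import torch.nn as nn


def capture_layer_outputs(layers, indices):
    """Register forward hooks on layers[i] for each i in indices.

    Returns (captured, handles): captured maps index -> most recent output,
    handles lets the caller remove the hooks afterwards.
    """
    captured, handles = {}, []
    for i in indices:
        def hook(module, inputs, output, i=i):
            # Transformer decoder layers may return a tuple; keep the hidden state.
            captured[i] = output[0] if isinstance(output, tuple) else output
        handles.append(layers[i].register_forward_hook(hook))
    return captured, handles


# Stand-in stack (tiny width) used only so the sketch is runnable offline.
hidden = 8
layers = nn.ModuleList([nn.Linear(hidden, hidden) for _ in range(28)])
captured, handles = capture_layer_outputs(layers, [9, 18, 27])

x = torch.randn(1, 4, hidden)
with torch.no_grad():
    for layer in layers:
        x = layer(x)

# Concatenate along the feature axis: [1, 4, 3 * hidden] (7680 for hidden=2560).
features = torch.cat([captured[i] for i in (9, 18, 27)], dim=-1)

for h in handles:
    h.remove()  # avoid leaking hooks across calls
```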

Files

SDXL Version (Recommended)

sdxl/best_sdxl_adapter.pt - Trained adapter weights
sdxl/sdxl_adapter.py - Adapter architecture
sdxl/test_sdxl_pipeline.py - End-to-end inference script
sdxl/train_sdxl_adapter.py - Training script

Klein Version (Legacy)

adapters/ - Original FLUX.2-Klein adapter weights

Requirements

SDXL base model: stabilityai/stable-diffusion-xl-base-1.0
Milady LoRA: Milady SDXL LoRA from CivitAI
Qwen3-4B: Qwen/Qwen3-4B
Python packages: torch, transformers, diffusers, safetensors
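
As a rough sketch of how these pieces fit together at inference time, the function below feeds adapter outputs to the SDXL pipeline from diffusers. It is not the contents of sdxl/test_sdxl_pipeline.py. The function name generate_avatar, the lora_path argument (a local LoRA file you have downloaded), the step count, the guidance scale, and the use of zero tensors as negative embeddings are all assumptions, and a CUDA device is assumed.

```python
import torch


def generate_avatar(prompt_embeds, pooled_prompt_embeds, lora_path,
                    steps=30, guidance_scale=7.0):
    """Render one image from adapter outputs (illustrative, assumes a CUDA GPU)."""
    # Validate shapes first so mistakes surface before the model is loaded.
    if prompt_embeds.shape[1:] != (77, 2048):
        raise ValueError(f"expected prompt_embeds [B, 77, 2048], got {tuple(prompt_embeds.shape)}")
    if pooled_prompt_embeds.shape[1:] != (1280,):
        raise ValueError(f"expected pooled embeds [B, 1280], got {tuple(pooled_prompt_embeds.shape)}")

    from diffusers import StableDiffusionXLPipeline  # lazy import: heavy dependency

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")
    pipe.load_lora_weights(lora_path)  # hypothetical local path to the Milady LoRA

    pe = prompt_embeds.to("cuda", torch.float16)
    pp = pooled_prompt_embeds.to("cuda", torch.float16)
    result = pipe(
        prompt_embeds=pe,
        pooled_prompt_embeds=pp,
        negative_prompt_embeds=torch.zeros_like(pe),
        negative_pooled_prompt_embeds=torch.zeros_like(pp),
        num_inference_steps=steps,
        guidance_scale=guidance_scale,
    )
    return result.images[0]
```

Passing embeddings directly bypasses the SDXL text encoders, which is what lets the adapter stand in for a text prompt.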

Emotions Supported

20 emotions: happy, sad, angry, surprised, scared, disgusted, neutral, excited, calm, anxious, confident, shy, proud, loving, jealous, curious, bored, amused, thoughtful, determined
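
If calling code needs to validate an emotion label before building an input text, a small lookup like the following may help. The names EMOTIONS, EMOTION_TO_INDEX, and normalize_emotion are hypothetical and not part of the repository; the index order simply follows the list above.

```python
EMOTIONS = (
    "happy", "sad", "angry", "surprised", "scared", "disgusted", "neutral",
    "excited", "calm", "anxious", "confident", "shy", "proud", "loving",
    "jealous", "curious", "bored", "amused", "thoughtful", "determined",
)
EMOTION_TO_INDEX = {name: i for i, name in enumerate(EMOTIONS)}


def normalize_emotion(label):
    """Return the canonical lowercase label, or raise ValueError if unsupported."""
    key = label.strip().lower()
    if key not in EMOTION_TO_INDEX:
        raise ValueError(f"unsupported emotion: {label!r}")
    return key
```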