Milady Avatar Adapter (SDXL)
Neural adapter that maps Qwen3-4B language model activations to SDXL prompt embedding space, enabling real-time emotional avatar generation in the Milady art style.
Architecture
SDXL Adapter (NEW - Higher Quality)
- Input: Qwen3-4B hidden states from layers [9, 18, 27] → 7680 dims
- Layer Weighting: Learned weighted combination → 2560 dims
- Cross-Attention Decoder: 3-layer transformer decoder with 8 heads
- Output: SDXL prompt embeddings [77, 2048] + pooled embeddings [1280]
- Parameters: 5.28M
- Training: 500 epochs on 200 emotion-labeled samples, MSE loss
- Best Val Loss: 6.762
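The architecture listed above can be sketched in PyTorch as follows. This is a minimal illustration, not the contents of sdxl/sdxl_adapter.py: the class names, the softmax normalization of the layer weights, the learned query tokens, the internal width `d_model`, and the mean-pooled head are all assumptions. The three tapped layers are passed as a stacked tensor [batch, 3, seq_len, 2560], which carries the same 7680 values per token as the concatenated form.

```python
import torch
import torch.nn as nn


class LayerWeighting(nn.Module):
    """Mix hidden states from several layers with learned softmax weights."""

    def __init__(self, num_layers: int = 3):
        super().__init__()
        # One learnable scalar per tapped layer (assumed); zeros give equal weights.
        self.logits = nn.Parameter(torch.zeros(num_layers))

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: [batch, num_layers, seq_len, dim] -> [batch, seq_len, dim]
        weights = torch.softmax(self.logits, dim=0)
        return torch.einsum("l,blsd->bsd", weights, hidden)


class AdapterSketch(nn.Module):
    """Hypothetical adapter: layer mix -> cross-attention decoder -> SDXL embeddings."""

    def __init__(self, in_dim=2560, d_model=256, num_tokens=77,
                 out_dim=2048, pooled_dim=1280):
        super().__init__()
        self.mix = LayerWeighting(num_layers=3)
        self.in_proj = nn.Linear(in_dim, d_model)
        # One learned query per SDXL prompt token position (assumed design).
        self.queries = nn.Parameter(torch.randn(num_tokens, d_model) * 0.02)
        layer = nn.TransformerDecoderLayer(
            d_model=d_model, nhead=8, dim_feedforward=4 * d_model, batch_first=True
        )
        self.decoder = nn.TransformerDecoder(layer, num_layers=3)
        self.to_prompt = nn.Linear(d_model, out_dim)
        self.to_pooled = nn.Linear(d_model, pooled_dim)

    def forward(self, hidden: torch.Tensor):
        memory = self.in_proj(self.mix(hidden))               # [B, S, d_model]
        queries = self.queries.unsqueeze(0).expand(hidden.size(0), -1, -1)
        decoded = self.decoder(queries, memory)               # [B, 77, d_model]
        prompt_embeds = self.to_prompt(decoded)               # [B, 77, 2048]
        pooled_embeds = self.to_pooled(decoded.mean(dim=1))   # [B, 1280]
        return prompt_embeds, pooled_embeds
```

With the MSE objective named above, a training step would compare both outputs against target embeddings using `torch.nn.functional.mse_loss`. Where those targets come from is not stated here, so that part is left out of the sketch.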
Pipeline
Emotional Text → Qwen3-4B (hooks on layers 9,18,27) → Adapter → SDXL + Milady LoRA → Avatar Image
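The first stage of this pipeline, capturing activations with hooks, can be sketched as below. The helper `capture_hidden_states` is a hypothetical name, and a stack of small linear layers stands in for Qwen3-4B so the snippet runs without downloading the model. With the real model, the decoder blocks are typically reachable as `model.model.layers` in transformers, but that attribute path, and whether the indices 9, 18, 27 are 0-based block indices, are assumptions to verify.

```python
import torch
import torch.nn as nn


def capture_hidden_states(layers, indices, run_forward):
    """Call run_forward() and return the outputs of layers[i] for each i in indices."""
    captured = {}
    handles = []

    def make_hook(idx):
        def hook(module, inputs, output):
            # Some decoder blocks return a tuple; the hidden state is its first item.
            hidden = output[0] if isinstance(output, tuple) else output
            captured[idx] = hidden.detach()
        return hook

    for idx in indices:
        handles.append(layers[idx].register_forward_hook(make_hook(idx)))
    try:
        with torch.no_grad():
            run_forward()
    finally:
        # Always remove hooks so later forward passes are unaffected.
        for handle in handles:
            handle.remove()
    # Stack in the requested order: [batch, num_layers, seq_len, dim]
    return torch.stack([captured[i] for i in indices], dim=1)


# Toy stand-in for the language model: 36 blocks of width 8.
torch.manual_seed(0)
toy_layers = nn.ModuleList([nn.Linear(8, 8) for _ in range(36)])
toy_model = nn.Sequential(*toy_layers)
x = torch.randn(2, 5, 8)
stacked = capture_hidden_states(toy_layers, [9, 18, 27], lambda: toy_model(x))
print(stacked.shape)
```

The stacked tensor has one slice per tapped layer, which is the shape a layer-weighting module would consume before the decoder.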
Files
SDXL Version (Recommended)
- sdxl/best_sdxl_adapter.pt - Trained adapter weights
- sdxl/sdxl_adapter.py - Adapter architecture
- sdxl/test_sdxl_pipeline.py - End-to-end inference script
- sdxl/train_sdxl_adapter.py - Training script
Klein Version (Legacy)
- adapters/ - Original FLUX.2-Klein adapter weights
Requirements
- SDXL base model: stabilityai/stable-diffusion-xl-base-1.0
- Milady LoRA: CivitAI Milady SDXL LoRA
- Qwen3-4B: Qwen/Qwen3-4B
- Python packages: torch, transformers, diffusers, safetensors
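With these requirements in place, the final stage (feeding adapter outputs to SDXL with the LoRA loaded) might look like the sketch below. It is not the contents of sdxl/test_sdxl_pipeline.py. The function names, the `lora_path` argument (a hypothetical local path to the downloaded LoRA file), the zero negative embeddings, and the step count are assumptions. The diffusers calls used are `StableDiffusionXLPipeline.from_pretrained`, `load_lora_weights`, and the pipeline's `prompt_embeds` / `pooled_prompt_embeds` arguments.

```python
import torch


def check_embedding_shapes(prompt_embeds, pooled_embeds):
    """Validate adapter outputs against the shapes SDXL expects."""
    if prompt_embeds.ndim != 3 or tuple(prompt_embeds.shape[1:]) != (77, 2048):
        raise ValueError(f"expected [B, 77, 2048], got {tuple(prompt_embeds.shape)}")
    if tuple(pooled_embeds.shape) != (prompt_embeds.shape[0], 1280):
        raise ValueError(f"expected [B, 1280], got {tuple(pooled_embeds.shape)}")


def generate_avatar(prompt_embeds, pooled_embeds, lora_path, device="cuda"):
    """Render one image from precomputed embeddings (assumed workflow)."""
    check_embedding_shapes(prompt_embeds, pooled_embeds)
    # Imported lazily so the shape check works without diffusers installed.
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to(device)
    pipe.load_lora_weights(lora_path)  # hypothetical path to the Milady LoRA file

    prompt_embeds = prompt_embeds.to(device, torch.float16)
    pooled_embeds = pooled_embeds.to(device, torch.float16)
    result = pipe(
        prompt_embeds=prompt_embeds,
        pooled_prompt_embeds=pooled_embeds,
        # Explicit zero negatives so no text prompt is needed for guidance.
        negative_prompt_embeds=torch.zeros_like(prompt_embeds),
        negative_pooled_prompt_embeds=torch.zeros_like(pooled_embeds),
        num_inference_steps=30,
    )
    return result.images[0]
```

Running `generate_avatar` downloads the base model and needs a GPU with enough memory for SDXL in half precision. The shape check alone runs on CPU.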
Emotions Supported
20 emotions: happy, sad, angry, surprised, scared, disgusted, neutral, excited, calm, anxious, confident, shy, proud, loving, jealous, curious, bored, amused, thoughtful, determined