# Milady Avatar Adapter (SDXL)

Neural adapter that maps Qwen3-4B language model activations to SDXL prompt embedding space,
enabling real-time emotional avatar generation in the Milady art style.

## Architecture
### SDXL Adapter (NEW - Higher Quality)

- **Input**: Qwen3-4B hidden states from layers [9, 18, 27] → 7680 dims
- **Layer Weighting**: Learned weighted combination → 2560 dims
- **Cross-Attention Decoder**: 3-layer transformer decoder with 8 heads
- **Output**: SDXL prompt embeddings [77, 2048] + pooled embeddings [1280]
- **Parameters**: 5.28M
- **Training**: 500 epochs on 200 emotion-labeled samples, MSE loss
- **Best Val Loss**: 6.762
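
The shapes listed above can be traced through a minimal PyTorch sketch. This is not the code in `sdxl/sdxl_adapter.py`: the class name `SDXLAdapterSketch`, the internal width `d_model`, the learned per-token queries, the softmax over layer weights, and the mean-pooled head are all assumptions made for illustration, and the parameter count will not match the 5.28M reported above.

```python
import torch
import torch.nn as nn


class SDXLAdapterSketch(nn.Module):
    """Hypothetical adapter: 3 tapped layers -> SDXL prompt + pooled embeddings."""

    def __init__(self, hidden_dim=2560, num_taps=3, d_model=256,
                 num_tokens=77, prompt_dim=2048, pooled_dim=1280):
        super().__init__()
        # One learnable scalar per tapped layer (assumed: softmax-normalized).
        self.layer_logits = nn.Parameter(torch.zeros(num_taps))
        self.in_proj = nn.Linear(hidden_dim, d_model)
        # Assumed: one learned query per SDXL token position.
        self.queries = nn.Parameter(torch.randn(num_tokens, d_model) * 0.02)
        layer = nn.TransformerDecoderLayer(
            d_model=d_model, nhead=8, dim_feedforward=4 * d_model,
            batch_first=True,
        )
        self.decoder = nn.TransformerDecoder(layer, num_layers=3)
        self.prompt_head = nn.Linear(d_model, prompt_dim)
        self.pooled_head = nn.Linear(d_model, pooled_dim)

    def forward(self, taps):
        # taps: [batch, num_taps, seq_len, hidden_dim], i.e. 3 x 2560 = 7680 dims per token
        weights = torch.softmax(self.layer_logits, dim=0)
        mixed = torch.einsum("l,blsh->bsh", weights, taps)   # [batch, seq_len, 2560]
        memory = self.in_proj(mixed)                         # [batch, seq_len, d_model]
        queries = self.queries.unsqueeze(0).expand(taps.size(0), -1, -1)
        decoded = self.decoder(tgt=queries, memory=memory)   # [batch, 77, d_model]
        prompt_embeds = self.prompt_head(decoded)            # [batch, 77, 2048]
        pooled = self.pooled_head(decoded.mean(dim=1))       # [batch, 1280]
        return prompt_embeds, pooled
```

Training against reference SDXL text-encoder outputs with MSE loss, as described above, would then compare both returned tensors with their targets, for example via `torch.nn.functional.mse_loss`.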
### Pipeline

```
Emotional Text → Qwen3-4B (hooks on layers 9,18,27) → Adapter → SDXL + Milady LoRA → Avatar Image
```
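
The "hooks on layers 9,18,27" step can be illustrated with PyTorch forward hooks. The sketch below runs on a toy stack of linear layers so that it executes without downloading any model. The helper name `attach_taps`, the toy depth of 36, and the zero-based reading of the layer numbers are assumptions. With the real model, the list of decoder blocks would be passed instead (in `transformers`, Qwen-style causal LMs typically expose it as `model.model.layers`).

```python
import torch
import torch.nn as nn

TAP_LAYERS = (9, 18, 27)  # layer numbers from the pipeline; zero-based indexing is assumed


def attach_taps(layers, tap_indices=TAP_LAYERS):
    """Register forward hooks that store the output of each tapped layer."""
    captured = {}
    handles = []
    for idx in tap_indices:
        def hook(module, inputs, output, idx=idx):
            # Transformer blocks may return a tuple; the hidden states come first.
            hidden = output[0] if isinstance(output, tuple) else output
            captured[idx] = hidden.detach()
        handles.append(layers[idx].register_forward_hook(hook))
    return captured, handles


# Toy stand-in for a decoder stack (hypothetical sizes, for illustration only).
toy_layers = nn.ModuleList([nn.Linear(16, 16) for _ in range(36)])
captured, handles = attach_taps(toy_layers)

x = torch.randn(1, 4, 16)  # [batch, seq_len, hidden]
with torch.no_grad():
    for layer in toy_layers:
        x = layer(x)

for handle in handles:
    handle.remove()  # detach hooks once the activations are collected

# Stack the tapped activations into one tensor: [batch, num_taps, seq_len, hidden]
stacked = torch.stack([captured[i] for i in TAP_LAYERS], dim=1)
```

Removing the handles after the forward pass keeps later calls to the model from overwriting the captured activations.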
## Files

### SDXL Version (Recommended)

- `sdxl/best_sdxl_adapter.pt` - Trained adapter weights
- `sdxl/sdxl_adapter.py` - Adapter architecture
- `sdxl/test_sdxl_pipeline.py` - End-to-end inference script
- `sdxl/train_sdxl_adapter.py` - Training script

### Klein Version (Legacy)

- `adapters/` - Original FLUX.2-Klein adapter weights
## Requirements

- SDXL base model: `stabilityai/stable-diffusion-xl-base-1.0`
- Milady LoRA: CivitAI Milady SDXL LoRA
- Qwen3-4B: `Qwen/Qwen3-4B`
- Python packages: `torch`, `transformers`, `diffusers`, `safetensors`
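
Before running the scripts, it can help to confirm that the listed Python packages are importable. The following is a small standard-library check. The helper name `missing_packages` is hypothetical and not part of this repository, and the check does not verify versions or the presence of model weights.

```python
from importlib.util import find_spec

# Package names taken from the requirements list above.
REQUIRED = ("torch", "transformers", "diffusers", "safetensors")


def missing_packages(names=REQUIRED):
    """Return the subset of top-level package names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```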
## Emotions Supported

20 emotions: happy, sad, angry, surprised, scared, disgusted, neutral, excited, calm, anxious,
confident, shy, proud, loving, jealous, curious, bored, amused, thoughtful, determined
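
For scripts that take an emotion label as input, the list above could be kept as a constant and validated up front. This is only a sketch: the names `EMOTIONS` and `normalize_emotion` are hypothetical and do not come from the repository's code.

```python
# The 20 supported emotion labels, copied from the list above.
EMOTIONS = (
    "happy", "sad", "angry", "surprised", "scared", "disgusted", "neutral",
    "excited", "calm", "anxious", "confident", "shy", "proud", "loving",
    "jealous", "curious", "bored", "amused", "thoughtful", "determined",
)


def normalize_emotion(label: str) -> str:
    """Lower-case and trim a label, rejecting anything outside the supported set."""
    cleaned = label.strip().lower()
    if cleaned not in EMOTIONS:
        raise ValueError(f"Unsupported emotion: {label!r}")
    return cleaned
```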