# Milady Avatar Adapter (SDXL)
A neural adapter that maps Qwen3-4B language model activations into the SDXL prompt embedding space,
enabling real-time emotional avatar generation in the Milady art style.
## Architecture
### SDXL Adapter (NEW - Higher Quality)
- **Input**: Qwen3-4B hidden states from layers [9, 18, 27] → 7680 dims
- **Layer Weighting**: Learned weighted combination → 2560 dims
- **Cross-Attention Decoder**: 3-layer transformer decoder with 8 heads
- **Output**: SDXL prompt embeddings [77, 2048] + pooled embeddings [1280]
- **Parameters**: 5.28M
- **Training**: 500 epochs on 200 emotion-labeled samples, MSE loss
- **Best Val Loss**: 6.762
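The architecture listed above can be illustrated with a minimal PyTorch sketch. This is not the code in `sdxl/sdxl_adapter.py`: the class name `SketchSDXLAdapter`, the softmax layer weighting, the learned query tokens, the internal width `d_model=256`, and the mean-pooled head are all assumptions chosen to make the shapes concrete, and the sketch does not try to reproduce the 5.28M parameter count.

```python
import torch
import torch.nn as nn


class SketchSDXLAdapter(nn.Module):
    """Hypothetical adapter: 3 x 2560-dim layer states -> SDXL embeddings."""

    def __init__(self, hidden=2560, n_layers=3, d_model=256,
                 n_tokens=77, embed_dim=2048, pooled_dim=1280):
        super().__init__()
        # One learned logit per tapped layer (assumed softmax weighting).
        self.layer_logits = nn.Parameter(torch.zeros(n_layers))
        # Project the mixed 2560-dim states to the decoder width.
        self.in_proj = nn.Linear(hidden, d_model)
        # Learned queries, one per SDXL prompt token position (assumption).
        self.queries = nn.Parameter(torch.randn(n_tokens, d_model) * 0.02)
        layer = nn.TransformerDecoderLayer(
            d_model=d_model, nhead=8, dim_feedforward=4 * d_model,
            batch_first=True,
        )
        self.decoder = nn.TransformerDecoder(layer, num_layers=3)
        self.to_embeds = nn.Linear(d_model, embed_dim)
        self.to_pooled = nn.Linear(d_model, pooled_dim)

    def forward(self, layer_states):
        # layer_states: [batch, n_layers, seq_len, hidden]
        weights = torch.softmax(self.layer_logits, dim=0)
        mixed = (weights.view(1, -1, 1, 1) * layer_states).sum(dim=1)
        memory = self.in_proj(mixed)                      # [B, T, d_model]
        tgt = self.queries.unsqueeze(0).expand(layer_states.size(0), -1, -1)
        decoded = self.decoder(tgt, memory)               # [B, 77, d_model]
        prompt_embeds = self.to_embeds(decoded)           # [B, 77, 2048]
        pooled = self.to_pooled(decoded.mean(dim=1))      # [B, 1280]
        return prompt_embeds, pooled


# Shape check with random stand-in activations (batch 2, 5 text tokens).
adapter = SketchSDXLAdapter().eval()
fake_states = torch.randn(2, 3, 5, 2560)
with torch.no_grad():
    prompt_embeds, pooled_embeds = adapter(fake_states)
```

Stacking the three layers on their own axis is equivalent to splitting the concatenated 7680-dim input into three 2560-dim chunks; training such a module against target SDXL embeddings with `torch.nn.functional.mse_loss` would match the MSE objective mentioned above.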
### Pipeline
```
Emotional Text → Qwen3-4B (hooks on layers 9,18,27) → Adapter → SDXL + Milady LoRA → Avatar Image
```
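The "hooks on layers 9,18,27" step can be sketched with PyTorch forward hooks. The helper `capture_layers` is hypothetical, and the example runs on a toy stack of linear layers rather than Qwen3-4B so that it needs no model download. With a real `transformers` model, the decoder blocks would likely be reached through something like `model.model.layers`, but that attribute path is an assumption to verify against the loaded model.

```python
import torch
import torch.nn as nn


def capture_layers(layers, indices):
    """Register forward hooks that store each selected layer's output."""
    captured, handles = {}, []
    for idx in indices:
        def hook(module, inputs, output, idx=idx):
            # Decoder blocks often return a tuple; the hidden state comes first.
            hidden = output[0] if isinstance(output, tuple) else output
            captured[idx] = hidden.detach()
        handles.append(layers[idx].register_forward_hook(hook))
    return captured, handles


# Toy stand-in: 28 small layers, just enough to index layer 27.
toy_layers = nn.ModuleList([nn.Linear(16, 16) for _ in range(28)])
captured, handles = capture_layers(toy_layers, [9, 18, 27])

x = torch.randn(1, 4, 16)  # [batch, seq_len, hidden]
with torch.no_grad():
    for layer in toy_layers:
        x = layer(x)

# Remove hooks once the activations are collected.
for handle in handles:
    handle.remove()

# Stack to [batch, n_layers, seq_len, hidden] as adapter input.
stacked = torch.stack([captured[i] for i in (9, 18, 27)], dim=1)
```

Removing the hooks after each capture avoids accumulating stale activations when the same model is reused for several prompts.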
## Files
### SDXL Version (Recommended)
- `sdxl/best_sdxl_adapter.pt` - Trained adapter weights
- `sdxl/sdxl_adapter.py` - Adapter architecture
- `sdxl/test_sdxl_pipeline.py` - End-to-end inference script
- `sdxl/train_sdxl_adapter.py` - Training script
### Klein Version (Legacy)
- `adapters/` - Original FLUX.2-Klein adapter weights
## Requirements
- SDXL base model: `stabilityai/stable-diffusion-xl-base-1.0`
- Milady LoRA: CivitAI Milady SDXL LoRA
- Qwen3-4B: `Qwen/Qwen3-4B`
- Python packages: `torch`, `transformers`, `diffusers`, `safetensors`
## Emotions Supported
20 emotions: happy, sad, angry, surprised, scared, disgusted, neutral, excited, calm, anxious,
confident, shy, proud, loving, jealous, curious, bored, amused, thoughtful, determined
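
A small guard like the following could validate an emotion label before it is turned into input text. The names `EMOTIONS` and `normalize_emotion` are hypothetical and are not taken from the repository scripts.

```python
# The emotion labels listed above, in the same order.
EMOTIONS = (
    "happy", "sad", "angry", "surprised", "scared", "disgusted", "neutral",
    "excited", "calm", "anxious", "confident", "shy", "proud", "loving",
    "jealous", "curious", "bored", "amused", "thoughtful", "determined",
)


def normalize_emotion(label: str) -> str:
    """Return the canonical lowercase label or raise for unsupported ones."""
    key = label.strip().lower()
    if key not in EMOTIONS:
        raise ValueError(f"unsupported emotion: {label!r}")
    return key
```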