Milady Avatar Adapter (SDXL)

A neural adapter that maps Qwen3-4B hidden-state activations into the SDXL prompt-embedding space, enabling real-time, emotion-driven avatar generation in the Milady art style.

Architecture

SDXL Adapter (NEW - Higher Quality)

  • Input: Qwen3-4B hidden states from layers [9, 18, 27], concatenated → 7680 dims (3 × 2560)
  • Layer Weighting: Learned weighted combination of the three layers → 2560 dims
  • Cross-Attention Decoder: 3-layer transformer decoder with 8 heads
  • Output: SDXL prompt embeddings [77, 2048] + pooled embeddings [1280]
  • Parameters: 5.28M
  • Training: 500 epochs on 200 emotion-labeled samples, MSE loss
  • Best Val Loss: 6.762
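
The bullets above can be read as the following shape flow. This is a minimal PyTorch sketch written for illustration, not the code in sdxl/sdxl_adapter.py: the class name `SDXLAdapterSketch`, the internal width `d_model=256`, the softmax over layer weights, the learned query tokens, and the mean-pooled head are all assumptions, so the parameter count will not necessarily match the reported 5.28M.

```python
import torch
import torch.nn as nn


class SDXLAdapterSketch(nn.Module):
    """Illustrative stand-in for the adapter; all internals are assumed."""

    def __init__(self, hidden=2560, n_layers=3, d_model=256,
                 n_tokens=77, ctx_dim=2048, pooled_dim=1280):
        super().__init__()
        self.hidden, self.n_layers = hidden, n_layers
        # One learnable scalar per tapped Qwen layer (softmax is an assumption).
        self.layer_logits = nn.Parameter(torch.zeros(n_layers))
        self.in_proj = nn.Linear(hidden, d_model)
        # One learned query per SDXL prompt token position (assumption).
        self.queries = nn.Parameter(torch.randn(n_tokens, d_model) * 0.02)
        layer = nn.TransformerDecoderLayer(
            d_model=d_model, nhead=8, dim_feedforward=4 * d_model,
            batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=3)
        self.to_prompt = nn.Linear(d_model, ctx_dim)
        self.to_pooled = nn.Linear(d_model, pooled_dim)

    def forward(self, x):
        # x: [batch, seq, 7680], the three layers' hidden states concatenated.
        b, t, _ = x.shape
        x = x.reshape(b, t, self.n_layers, self.hidden)
        w = torch.softmax(self.layer_logits, dim=0)
        mixed = (x * w.view(1, 1, -1, 1)).sum(dim=2)       # [b, t, 2560]
        memory = self.in_proj(mixed)                        # [b, t, d_model]
        tgt = self.queries.unsqueeze(0).expand(b, -1, -1)   # [b, 77, d_model]
        dec = self.decoder(tgt, memory)                     # cross-attends to memory
        prompt_embeds = self.to_prompt(dec)                 # [b, 77, 2048]
        pooled = self.to_pooled(dec.mean(dim=1))            # [b, 1280]
        return prompt_embeds, pooled


adapter = SDXLAdapterSketch().eval()
with torch.no_grad():
    prompt_embeds, pooled = adapter(torch.randn(2, 5, 7680))
print(prompt_embeds.shape, pooled.shape)
```

The two outputs have the shapes SDXL expects for its prompt embeddings and pooled prompt embeddings, so an MSE loss can compare them directly against targets of the same shape.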

Pipeline

Emotional Text → Qwen3-4B (hooks on layers 9,18,27) → Adapter → SDXL + Milady LoRA → Avatar Image
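
The hook step in this pipeline can be sketched with PyTorch forward hooks. The example below uses a toy stack of `nn.Linear` layers as a stand-in so it runs without downloading Qwen3-4B; the helper `attach_taps` is hypothetical, and the assumption that the real decoder stack is reachable as `model.model.layers` should be checked against the loaded model. Whether the layer indices in this README are 0- or 1-based is not stated, so the sketch simply uses them as list indices.

```python
import torch
import torch.nn as nn

TAP_LAYERS = (9, 18, 27)


def attach_taps(layers, indices, store):
    """Register forward hooks that save each tapped layer's output."""
    handles = []
    for i in indices:
        def hook(module, inputs, output, i=i):
            # Some decoder layers return a tuple; the hidden state comes first.
            hidden = output[0] if isinstance(output, tuple) else output
            store[i] = hidden.detach()
        handles.append(layers[i].register_forward_hook(hook))
    return handles


# Toy stand-in for the decoder stack (width 16 instead of 2560).
toy_layers = nn.ModuleList([nn.Linear(16, 16) for _ in range(36)])
store = {}
handles = attach_taps(toy_layers, TAP_LAYERS, store)

x = torch.randn(1, 5, 16)  # [batch, seq, width]
with torch.no_grad():
    for layer in toy_layers:
        x = layer(x)

# Concatenate the tapped states on the feature axis: 3 * width per token.
features = torch.cat([store[i] for i in TAP_LAYERS], dim=-1)

# Remove the hooks once the activations have been captured.
for h in handles:
    h.remove()
```

With a width of 2560 per layer, the same concatenation gives the 7680-dimensional adapter input.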

Files

SDXL Version (Recommended)

  • sdxl/best_sdxl_adapter.pt - Trained adapter weights
  • sdxl/sdxl_adapter.py - Adapter architecture
  • sdxl/test_sdxl_pipeline.py - End-to-end inference script
  • sdxl/train_sdxl_adapter.py - Training script

Klein Version (Legacy)

  • adapters/ - Original FLUX.2-Klein adapter weights

Requirements

  • SDXL base model: stabilityai/stable-diffusion-xl-base-1.0
  • Milady LoRA: Milady SDXL LoRA (from CivitAI)
  • Qwen3-4B: Qwen/Qwen3-4B
  • Python packages: torch, transformers, diffusers, safetensors

Emotions Supported

20 emotions: happy, sad, angry, surprised, scared, disgusted, neutral, excited, calm, anxious, confident, shy, proud, loving, jealous, curious, bored, amused, thoughtful, determined