--- title: SyncAI emoji: 🎵 colorFrom: indigo colorTo: pink sdk: gradio sdk_version: "6.8.0" app_file: app.py pinned: false short_description: AI Music Ads Generator --- # SyncAI — AI Music Video Generator Generate beat-synced music video ads from a song clip. Upload ~15 seconds of audio, pick a visual style, and SyncAI produces a fully assembled vertical video with AI-generated visuals cut to the beat. ## How It Works ``` Song (audio file) ├─► Stem Separation (LALAL.AI) → Vocals + Drums ├─► Lyrics Extraction (WhisperX) → Word-level timestamps ├─► Beat Detection (madmom RNN + DBN) → Beat timestamps + drop detection ├─► Segmentation → Lyrics mapped to beat intervals ├─► Prompt Generation (Claude Sonnet 4.6) → Image + video motion prompts ├─► Image Generation (SDXL + Hyper-SD + style LoRA) → 768x1344 images ├─► Image-to-Video (Wan 2.1 14B) → Animated clips └─► Assembly (FFmpeg) → Beat-synced video with lyrics overlay ``` ## Visual Styles Each style applies a different LoRA to SDXL and sets a unique scene world for the LLM prompt generator. The Sunset Coastal Drive LoRA was custom-trained for this project; the others are community LoRAs from HuggingFace Hub: | Style | LoRA | Setting | |-------|------|---------| | **Sunset Coastal Drive** | Custom-trained (`samuelsattler/warm-sunset-lora`) | Car cruising a coastal highway at golden hour | | **Rainy City Night** | Film grain (`artificialguybr/filmgrain-redmond`) | Walking rain-soaked city streets after dark | | **Cyberpunk** | Cyberpunk 2077 (`jbilcke-hf/sdxl-cyberpunk-2077`) | Neon-drenched futuristic megacity at night | | **Watercolour Harbour** | Watercolor (`ostris/watercolor_style_lora_sdxl`) | Coastal fishing village during a storm | ## Assembly Features - **Dynamic pacing**: 4-beat cuts before the drop, 2-beat cuts after for energy - **Clip shuffling**: Each clip used twice (first/second half) in randomised order for visual variety - **Ken Burns**: Alternating zoom in/out on every cut - **Lyrics overlay**: Word-level timing with gap closing - **Cover art overlay**: Album art + Spotify badge appear from the drop onwards - **Reshuffle**: Re-run assembly with a new random clip order without regenerating ## Tech Stack | Component | Tool | |-----------|------| | Stem separation | LALAL.AI API (Andromeda) | | Lyrics (ASR) | WhisperX (large-v2 + wav2vec2) | | Beat detection | madmom (RNN + DBN) | | Prompt generation | Claude Sonnet 4.6 (Anthropic API) | | Image generation | SDXL + Hyper-SD 8-step + style LoRA | | Image-to-video | Wan 2.1 14B (ZeroGPU with FP8) | | Video assembly | FFmpeg | | UI | Gradio |