metadata
title: Music AI Tools
emoji: 🎵🎶
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 5.49.1
app_file: mcp_server.py
pinned: false
Music AI Tools - Fun Audio Processing Playground
A comprehensive demo project showcasing 25+ audio processing tools powered by cutting-edge AI models and traditional audio processing libraries. This playground provides both web-based and MCP (Model Context Protocol) interfaces for exploring audio manipulation, analysis, and creative possibilities.
What's Inside
AI-Powered Features
- Stem Separation using Demucs by Facebook Research
- Voice Replacement using Seed-VC on Hugging Face
- Music Understanding using Music-Flamingo by NVIDIA
Audio Processing Capabilities
- Audio Analysis with Librosa for feature extraction
- Audio Conversion with FFmpeg for format processing
- High Performance with GPU acceleration and parallel processing
Demo Features
Stem Processing Tools
- Stem Separation - Full 4-stem separation (vocals, drums, bass, other)
- Selective Stems - Extract only specific stems to save processing time
- Vocal/Instrumental - Separate vocals from instrumental components
- Karaoke Creation - One-click instrumental track generation
Audio Manipulation Tools
- Pitch Alignment - Shift audio pitch by semitones
- Key Estimation - Estimate musical key using harmonic analysis
- Shift to Key - Shift audio to a specific musical key
- Align Songs by Key - Harmonically align multiple tracks
- Time Stretching - Change tempo without affecting pitch
- BPM Alignment - Align two tracks to the same BPM
- Medley Creation - Fun vocal/instrumental mixing
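The pitch and tempo tools above come down to a few small ratios. The sketch below is not taken from the project's code: the function names are hypothetical, and it assumes 12-tone equal temperament and key names given as sharps (C, C#, D, ...).

```python
"""Illustrative helpers for pitch and tempo alignment (hypothetical names)."""

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a pitch shift, assuming 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


def shift_between_keys(source_key: str, target_key: str) -> int:
    """Smallest semitone shift (range -5..+6) from one tonic to another."""
    diff = (NOTE_NAMES.index(target_key) - NOTE_NAMES.index(source_key)) % 12
    # Prefer shifting down when that is the shorter path around the octave.
    return diff - 12 if diff > 6 else diff


def stretch_rate(source_bpm: float, target_bpm: float) -> float:
    """Playback-rate factor that brings source_bpm to target_bpm (>1 is faster)."""
    if source_bpm <= 0 or target_bpm <= 0:
        raise ValueError("BPM values must be positive")
    return target_bpm / source_bpm
```

If the tools are built on Librosa (an assumption; this README only lists it for analysis), these values would map naturally onto `librosa.effects.pitch_shift(y, sr=sr, n_steps=...)` and `librosa.effects.time_stretch(y, rate=...)`, where a rate above 1 speeds the audio up.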
Audio Editing Tools
- Audio Cutting - Extract segments between time points
- Mute Windows - Mute specific time ranges with smooth fades
- Extract Segments - Extract multiple segments with joining options
- Trim Audio - Trim from beginning/end with precision
- Insert Section - Insert audio sections at precise positions
- Replace Section - Replace audio segments with crossfades
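To make "mute with smooth fades" concrete, here is a minimal sketch on a plain list of samples. It is an illustration, not the project's implementation: `mute_window` and its `fade_s` parameter are hypothetical, the fades are linear, and they are assumed to sit inside the muted window.

```python
def mute_window(samples, sample_rate, start_s, end_s, fade_s=0.01):
    """Return a copy of samples silenced between start_s and end_s.

    The first `fade_s` seconds of the window fade out linearly and the last
    `fade_s` seconds fade back in, so the edit does not produce a click.
    """
    n = len(samples)
    start = max(0, min(n, round(start_s * sample_rate)))
    end = max(start, min(n, round(end_s * sample_rate)))
    # Never let the two fades overlap inside a short window.
    fade = min(round(fade_s * sample_rate), (end - start) // 2)

    out = list(samples)
    for i in range(start, end):
        if i < start + fade:
            gain = 1.0 - (i - start + 1) / fade  # fade out towards silence
        elif i >= end - fade:
            gain = (i - (end - fade)) / fade     # fade in from silence
        else:
            gain = 0.0                           # fully muted
        out[i] = samples[i] * gain
    return out
```

Samples outside the window are returned unchanged, and the input list is not modified.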
Analysis & Information Tools
- Audio Information - Get detailed file information
- Music Understanding - AI-powered music analysis
- Song Structure - Identify song sections (verse, chorus, bridge)
- Cutting Points - AI-suggested optimal edit points
- Genre Analysis - Detailed genre and style analysis
Special Features
- Voice Replacement - Replace voice using Seed-VC AI model
- Audio Cleaning - Remove noise (hiss, hum, background)
- YouTube Extraction - Extract audio from YouTube videos
Quick Start
Prerequisites
# Install dependencies
pip install -r requirements.txt
# For GPU acceleration (optional but recommended)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
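After installing, it can be useful to confirm whether the GPU build is actually being used. The probe below is a sketch rather than part of the project: `pick_device` is a hypothetical helper that falls back to CPU when PyTorch is missing or no CUDA device is visible.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # optional dependency; imported lazily on purpose
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Audio models would run on: {pick_device()}")
```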
Running the Demo
Web Interface (Recommended for Demo)
python mcp_server.py
Then open http://localhost:7860 in your browser to access the fun playground interface with 25+ tools!
MCP Server Mode
python mcp_server.py --mcp
Run as an MCP server for integration with AI assistants and other tools.
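How `mcp_server.py` handles the `--mcp` flag is not shown here. One plausible wiring, given as an assumption only, is to translate the flag into Gradio's `mcp_server` launch option. The function name below is hypothetical.

```python
import argparse


def parse_launch_options(argv=None):
    """Map command-line flags to keyword arguments for Gradio's launch().

    Assumption: --mcp simply toggles Gradio's built-in MCP server support.
    """
    parser = argparse.ArgumentParser(description="Music AI Tools launcher (sketch)")
    parser.add_argument("--mcp", action="store_true", help="expose tools over MCP")
    args = parser.parse_args(argv)
    # 7860 is the port mentioned in this README for the web interface.
    return {"server_port": 7860, "mcp_server": args.mcp}


# In a real app the result would be passed on, e.g. demo.launch(**options),
# where `demo` is the gr.Blocks / gr.Interface object built elsewhere.
```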
Using the Tools
Web Interface
- Upload Audio - Drag & drop or browse for audio files (WAV, MP3, FLAC, M4A)
- Select Tool - Choose from 25+ different audio processing tools
- Configure Settings - Adjust parameters for each tool
- Process & Download - Get results instantly with real-time progress
Supported Formats
- Input: WAV, MP3, FLAC, M4A
- Output: WAV, MP3 (configurable)
- URL Support: Direct processing from YouTube and other URLs
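A small guard like the following could reject unsupported inputs before any heavy processing starts. It is a sketch based only on the format list above: `classify_source` is a hypothetical helper, and it checks file extensions rather than inspecting file contents.

```python
from pathlib import Path
from urllib.parse import urlparse

# Input extensions listed in this README.
SUPPORTED_INPUT = {".wav", ".mp3", ".flac", ".m4a"}


def classify_source(source: str) -> str:
    """Return "url", "file", or "unsupported" for a user-supplied source."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    if Path(source).suffix.lower() in SUPPORTED_INPUT:
        return "file"
    return "unsupported"
```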
Project Structure
music-mcp/
├── mcp_server.py               # Main server with Gradio interface
├── requirements.txt            # Python dependencies
├── tools/                      # Audio processing modules
│   ├── stems_separation.py     # Demucs-based stem separation
│   ├── voice_replacement.py    # Seed-VC voice conversion
│   ├── music_understanding.py  # Music-Flamingo AI analysis
│   ├── pitch_alignment.py      # Key detection and pitch shifting
│   ├── time_strech.py          # BPM alignment and time stretching
│   ├── audio_cutting.py        # Audio editing and manipulation
│   ├── audio_cleaning.py       # Noise removal and cleaning
│   ├── combine_tracks.py       # Track mixing and medley creation
│   ├── audio_info.py           # File information and validation
│   └── youtube_extract.py      # YouTube audio extraction
├── examples/                   # Sample audio files for testing
├── output/                     # Generated audio files
└── youtube_downloads/          # Cached YouTube downloads
AI Model Details
AI Models Used
- Demucs (Facebook Research) - State-of-the-art source separation
- Seed-VC (Hugging Face) - High-quality voice conversion
- Music-Flamingo (NVIDIA) - Advanced music understanding and analysis
Audio Processing Libraries
- Librosa - Audio feature extraction and analysis
- FFmpeg - Audio format conversion and processing
- PyTorch - Deep learning framework for AI models
Customization
Adding New Tools
- Create a new function in the appropriate tools/ module
- Add a wrapper function with MCP compatibility
- Register it in the mcp_server.py interface creation
- Update the documentation
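The first two steps might look like the sketch below. Everything in it is hypothetical (the tool, the function names, the JSON return shape). It assumes the wrapper relies on type hints and a docstring with an `Args:` section, which Gradio's MCP support can use to describe a tool. Only the standard library is used so the example stays runnable on its own.

```python
import json
import wave


def get_wav_duration(path: str) -> float:
    """Core logic (step 1): duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def wav_duration_mcp(audio_path: str) -> str:
    """Report the duration of a WAV file (step 2: MCP-friendly wrapper).

    Args:
        audio_path: Path to a WAV file on disk.

    Returns:
        JSON string with the duration in seconds, or an error message.
    """
    try:
        duration = round(get_wav_duration(audio_path), 3)
        return json.dumps({"duration_seconds": duration})
    except (OSError, EOFError, wave.Error) as exc:
        # Return errors as data so a calling agent can react to them.
        return json.dumps({"error": str(exc)})
```

Keeping the core function free of any interface concerns means it can be unit tested directly, while the wrapper only handles string inputs, string outputs, and error reporting.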
Development
Code Quality
# Linting
ruff check .
# Formatting
ruff format .
# Type checking
mypy . --follow-untyped-imports
Dependencies
- Core: gradio, torch, librosa, soundfile
- AI Models: demucs, transformers
- Audio Processing: ffmpeg-python, numpy, scipy
- Web: yt-dlp, requests, gradio-client
Demo Use Cases
Music Production
- Create karaoke tracks by removing vocals
- Extract stems for remixing and sampling
- Align songs for seamless DJ mixes
- Generate medleys and mashups
Audio Editing
- Clean up noisy recordings
- Extract specific sections for clips
- Create ringtones and social media content
- Repair damaged audio files
AI Experimentation
- Voice conversion for creative projects
- Genre analysis and music understanding
- Intelligent cutting point suggestions
- Structure analysis for music theory
Have Fun!
This is a demo playground for exploring agent capabilities with audio processing.
Built with ❤️ using cutting-edge AI models and open-source audio processing libraries