title: Rvc Effects
sdk: gradio
emoji: π₯
colorFrom: indigo
colorTo: purple
AICoverGen
An autonomous pipeline that creates covers with any RVC v2 trained AI voice from a YouTube video or a local audio file. It is aimed at developers who want to add singing functionality to their AI assistant, chatbot, or VTuber, and at people who want to hear their favourite characters sing their favourite songs.
AICoverGen Enhanced
AI-Powered Voice Cover Generation with Advanced Audio Enhancement
AICoverGen Enhanced creates AI voice covers and adds audio post-processing on top of the original pipeline: AI noise reduction, EQ presets, dynamic range compression, harmonic enhancement, stereo widening, reverb, and gain control.
New Features
Advanced Audio Enhancement
- AI Noise Reduction: Remove background noise and artifacts
- Professional EQ: 5 EQ types (Balanced, Vocal Boost, Bass Boost, Treble Boost, Flat)
- Dynamic Range Compression: Improve loudness and consistency
- Harmonic Enhancement: Add richness and warmth to vocals
- Stereo Widening: Enhance spatial imaging for stereo tracks
- Reverb Control: Add depth and professional polish
- Gain Control: Fine-tune volume (-20 to +20 dB)
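A gain in decibels corresponds to a linear amplitude factor of 10^(dB/20). The sketch below shows that conversion for the documented -20 to +20 dB range. The function names and the clamping behaviour are illustrative assumptions and are not taken from the project's source.

```python
def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    # Clamp to the documented range (-20 to +20 dB); clamping is an assumption.
    gain_db = max(-20.0, min(20.0, gain_db))
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples: list[float], gain_db: float) -> list[float]:
    """Scale every sample by the linear factor for gain_db."""
    factor = db_to_linear(gain_db)
    return [s * factor for s in samples]
```

A gain of +6 dB roughly doubles the amplitude, and -6 dB roughly halves it.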
Enhancement Types
- Full: Balanced enhancement with all features
- Light: Subtle improvements for natural sound
- Aggressive: Maximum enhancement for impact
- Custom: Use your specific settings
System Requirements
Minimum Requirements
- OS: Windows 10/11, Linux, or macOS
- Python: 3.9+ (3.10+ recommended)
- RAM: 8GB minimum, 16GB recommended
- Storage: 10GB free space
- GPU: NVIDIA GPU with CUDA support (recommended)
Recommended Setup
- OS: Windows 11 or Ubuntu 20.04+
- Python: 3.10 or 3.11
- RAM: 16GB or more
- GPU: NVIDIA RTX 3060 or better
- CUDA: 11.8 or 12.0+
- cuDNN: 8.6 or 9.0+
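The Python and storage requirements above can be checked with the standard library alone. This is a minimal sketch; the constant and function names are hypothetical, and it does not check RAM, GPU, CUDA, or cuDNN.

```python
import shutil
import sys

MIN_PYTHON = (3, 9)   # minimum Python version listed above
MIN_FREE_GB = 10      # free storage listed above


def check_python(version_info=sys.version_info) -> bool:
    """True if the interpreter meets the minimum major.minor version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def free_disk_gb(path: str = ".") -> float:
    """Free space, in GiB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / 1024 ** 3


if __name__ == "__main__":
    print("Python version OK:", check_python())
    print("Free disk space OK:", free_disk_gb() >= MIN_FREE_GB)
```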
Installation Guide
Step 1: Clone the Repository
git clone https://github.com/SociallyIneptWeeb/AICoverGen.git
cd AICoverGen
Step 2: Create Virtual Environment
# Windows
python -m venv AICoverGen
AICoverGen\Scripts\activate
# Linux/macOS
python3 -m venv AICoverGen
source AICoverGen/bin/activate
Step 3: Install Dependencies
Option A: Automatic Installation (Recommended)
pip install -r requirements.txt
Option B: Manual Installation
# Core dependencies
pip install gradio==3.50.2 librosa==0.9.1 numpy==1.23.5 scipy==1.11.1 soundfile==0.12.1
pip install pedalboard==0.7.7 pydub==0.25.1 fairseq==0.12.2 faiss-cpu==1.7.3 pyworld==0.3.4
# Quote specifiers that contain ">" so the shell does not treat them as output redirection
pip install "praat-parselmouth>=0.4.2" "ffmpeg-python>=0.2.0" tqdm==4.65.0 "yt-dlp>=2025.9.23" sox==1.4.1
# AI Audio Enhancement dependencies
pip install noisereduce==3.0.3 scikit-learn==1.6.1
# PyTorch with CUDA support
pip install torch==2.0.1+cu118 --find-links https://download.pytorch.org/whl/torch_stable.html
pip install torchcrepe==0.0.20
# ONNX Runtime with CUDA support
pip install onnxruntime-gpu==1.18.0
Step 4: Download Models
python src/download_models.py
Step 5: Verify Installation
python src/audio_enhancer.py
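If the verification step fails, it can help to see which packages are not importable. The sketch below uses `importlib.util.find_spec`, which looks a module up without importing it. The module list is an assumption based on the packages installed in Step 3 (note that some import names differ from the pip names, for example `scikit-learn` is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names for a subset of the Step 3 packages (assumed list, adjust as needed).
REQUIRED_MODULES = [
    "gradio", "librosa", "numpy", "scipy", "soundfile", "pedalboard",
    "pydub", "noisereduce", "sklearn", "torch", "onnxruntime",
]


def missing_modules(names=REQUIRED_MODULES) -> list[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("All modules found" if not missing else f"Missing: {', '.join(missing)}")
```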
Usage
Quick Start
python src/webui.py
Then open your browser and go to: http://127.0.0.1:7860
- Upload a song (YouTube URL or audio file)
- Select a voice model from the dropdown
- Configure audio enhancement:
  - Expand "AI Audio Enhancement" section
  - Choose enhancement type (Full/Light/Aggressive/Custom)
  - Adjust EQ type (Balanced/Vocal Boost/Bass Boost/Treble Boost/Flat)
  - Set noise reduction strength (0–100%)
  - Adjust gain (-20 to +20 dB)
  - Set compression ratio (1–10)
  - Add reverb amount (0–100%)
- Click Generate and enjoy your enhanced AI cover!
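For developers who call the pipeline from their own code, the options and ranges listed in the steps above can be validated before a job is started. The class below is a hypothetical container written for this README. Its field names and default values are assumptions, and only the allowed choices and ranges come from the list above.

```python
from dataclasses import dataclass

ENHANCEMENT_TYPES = ("Full", "Light", "Aggressive", "Custom")
EQ_TYPES = ("Balanced", "Vocal Boost", "Bass Boost", "Treble Boost", "Flat")


@dataclass
class EnhancementSettings:
    # Defaults are placeholders for this sketch, not the project's defaults.
    enhancement_type: str = "Full"
    eq_type: str = "Balanced"
    noise_reduction_pct: float = 50.0   # 0 to 100
    gain_db: float = 0.0                # -20 to +20
    compression_ratio: float = 2.0      # 1 to 10
    reverb_pct: float = 10.0            # 0 to 100

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the settings are valid."""
        errors = []
        if self.enhancement_type not in ENHANCEMENT_TYPES:
            errors.append(f"unknown enhancement type: {self.enhancement_type}")
        if self.eq_type not in EQ_TYPES:
            errors.append(f"unknown EQ type: {self.eq_type}")
        ranges = {
            "noise_reduction_pct": (0.0, 100.0),
            "gain_db": (-20.0, 20.0),
            "compression_ratio": (1.0, 10.0),
            "reverb_pct": (0.0, 100.0),
        }
        for name, (low, high) in ranges.items():
            value = getattr(self, name)
            if not low <= value <= high:
                errors.append(f"{name}={value} is outside [{low}, {high}]")
        return errors
```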
Troubleshooting
CUDA Not Detected
# Check CUDA installation
nvidia-smi
# Verify PyTorch CUDA support
python -c "import torch; print(torch.cuda.is_available())"
# Check ONNX Runtime CUDA
python -c "import onnxruntime as ort; print('CUDAExecutionProvider' in ort.get_available_providers())"
Audio Enhancement Errors
# Test audio enhancer
python src/audio_enhancer.py
# Check dependencies (on Windows, replace the grep part with: findstr "noisereduce scikit-learn pedalboard")
pip list | grep -E "(noisereduce|scikit-learn|pedalboard)"
Memory Issues
- Reduce batch size in settings
- Use CPU-only mode for ONNX Runtime
- Close other applications to free RAM
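One way to switch ONNX Runtime to CPU-only mode is to pass an explicit provider list when a session is created. The helper below is a hypothetical sketch and is not part of the project. `CUDAExecutionProvider` and `CPUExecutionProvider` are the provider names ONNX Runtime reports through `get_available_providers()`.

```python
def choose_providers(available: list[str], force_cpu: bool = False) -> list[str]:
    """Pick ONNX Runtime execution providers, preferring CUDA unless force_cpu is set."""
    if not force_cpu and "CUDAExecutionProvider" in available:
        # Keep the CPU provider as a fallback for operators CUDA cannot run.
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


# Possible use (requires onnxruntime and a model file; the path is a placeholder):
# import onnxruntime as ort
# providers = choose_providers(ort.get_available_providers(), force_cpu=True)
# session = ort.InferenceSession("model.onnx", providers=providers)
```

CPU-only inference frees GPU memory but is usually much slower.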
Project Structure
AICoverGen_Enhanced/
├── src/
│   ├── webui.py           # Main web interface
│   ├── main.py            # Core pipeline with audio enhancement
│   ├── audio_enhancer.py  # AI audio enhancement module
│   ├── rvc.py             # RVC voice conversion
│   └── mdx.py             # Audio separation
├── rvc_models/            # Voice models
├── mdxnet_models/         # Audio separation models
├── song_output/           # Generated covers
├── requirements.txt       # Dependencies
└── README_Enhanced.md     # This file
Audio Enhancement Features
AI Noise Reduction
- Uses ML to identify and remove background noise
- Preserves vocal clarity while eliminating artifacts
- Adjustable strength (0–100%)
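A percentage strength has to be turned into whatever scale the noise reduction backend expects. The sketch below maps the 0 to 100% setting to a 0.0 to 1.0 proportion, which is the scale used by the `prop_decrease` parameter of `noisereduce.reduce_noise`. Whether the project wires the slider to that parameter is an assumption, and the function name is hypothetical.

```python
def strength_to_prop_decrease(strength_pct: float) -> float:
    """Map a 0-100% strength setting to a 0.0-1.0 reduction proportion."""
    clamped = max(0.0, min(100.0, strength_pct))
    return clamped / 100.0


# Possible use (requires noisereduce and a NumPy audio array named `audio`):
# import noisereduce as nr
# cleaned = nr.reduce_noise(y=audio, sr=sample_rate,
#                           prop_decrease=strength_to_prop_decrease(75))
```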
Professional EQ
- Balanced: Gentle mid boost for clarity
- Vocal Boost: Emphasizes 800–3000 Hz range
- Bass Boost: Enhances 60–250 Hz
- Treble Boost: Brightens 4–16 kHz
- Flat: Minimal processing with high-pass filter
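To illustrate how a band boost such as Vocal Boost can be realised, the sketch below builds a single peaking filter from the widely used Audio EQ Cookbook (RBJ) biquad formulas and evaluates its gain at a given frequency. The +3 dB gain, the Q of 0.7, the 44.1 kHz sample rate, and the choice of one peaking band centred on the geometric mean of 800 and 3000 Hz are all assumptions. The project's actual filter design may differ.

```python
import cmath
import math


def peaking_biquad(f0: float, gain_db: float, q: float, fs: float):
    """Return normalized (b, a) coefficients of an RBJ peaking EQ biquad."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    return [x / a[0] for x in b], [x / a[0] for x in a]


def gain_db_at(b, a, freq: float, fs: float) -> float:
    """Magnitude response of the biquad at freq, in dB."""
    z_inv = cmath.exp(-1j * 2.0 * math.pi * freq / fs)
    h = (b[0] + b[1] * z_inv + b[2] * z_inv ** 2) / (a[0] + a[1] * z_inv + a[2] * z_inv ** 2)
    return 20.0 * math.log10(abs(h))


# Centre of the 800-3000 Hz Vocal Boost band (geometric mean).
center = math.sqrt(800.0 * 3000.0)
b, a = peaking_biquad(center, gain_db=3.0, q=0.7, fs=44100.0)
```

The filter reaches its full gain at the centre frequency and falls back towards 0 dB away from it; a lower Q widens the boosted band.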
Dynamic Range Compression
- Improves loudness consistency
- Reduces dynamic range for streaming
- Configurable ratio (1–10)
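The ratio controls how strongly levels above a threshold are reduced: with a ratio of 4, every 4 dB of input above the threshold yields 1 dB of output above it. The sketch below shows this static gain curve. The -18 dB default threshold and the function name are assumptions, and attack and release smoothing are left out.

```python
def compress_level_db(level_db: float, threshold_db: float = -18.0,
                      ratio: float = 4.0) -> float:
    """Static compressor curve: map an input level (dB) to an output level (dB)."""
    if not 1.0 <= ratio <= 10.0:
        raise ValueError("ratio must be between 1 and 10")
    if level_db <= threshold_db:
        return level_db  # below the threshold the signal is left untouched
    # Above the threshold, the excess is divided by the ratio.
    return threshold_db + (level_db - threshold_db) / ratio
```

A ratio of 1 leaves the signal unchanged, while a ratio of 10 behaves close to a limiter.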
Harmonic Enhancement
- Adds warmth and richness
- Uses soft saturation for natural harmonics
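Soft saturation bends the waveform smoothly near its peaks, which adds harmonics without the harshness of hard clipping. A common way to do this is a tanh curve, sketched below. The use of tanh, the `drive` parameter, and the normalization are assumptions made for illustration and are not confirmed by the project's source.

```python
import math


def soft_saturate(sample: float, drive: float = 2.0) -> float:
    """tanh soft clipper, normalized so an input of 1.0 maps to an output of 1.0."""
    return math.tanh(drive * sample) / math.tanh(drive)


def saturate(samples: list[float], drive: float = 2.0) -> list[float]:
    """Apply the soft clipper to every sample."""
    return [soft_saturate(s, drive) for s in samples]
```

Higher `drive` values push more of the signal into the curved region and produce stronger harmonics.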
Stereo Widening
- Improves spatial imaging
- Enhances left-right separation
- Creates immersive experience
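Stereo widening is often implemented with mid/side processing: the signal is split into a mid part (what both channels share) and a side part (how they differ), and the side part is scaled. The sketch below assumes that approach; the function name and the default width of 1.5 are hypothetical.

```python
def widen(left: list[float], right: list[float], width: float = 1.5):
    """Scale the side (L-R) component; width=1.0 leaves the signal unchanged."""
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)
        side = 0.5 * (l - r)
        out_left.append(mid + width * side)
        out_right.append(mid - width * side)
    return out_left, out_right
```

A mono signal (identical channels) has no side component, so it passes through unchanged at any width.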
Reverb Control
- Adds subtle depth and space
- Professional room simulation
- Configurable wet/dry mix
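A wet/dry mix blends the unprocessed (dry) signal with the reverberated (wet) one. The sketch below uses a linear crossfade driven by the 0 to 100% reverb amount. The linear blend and the function name are assumptions, and the reverb itself is not modelled here.

```python
def mix_wet_dry(dry: list[float], wet: list[float], reverb_pct: float) -> list[float]:
    """Linear crossfade: 0% returns the dry signal, 100% returns the wet signal."""
    mix = max(0.0, min(100.0, reverb_pct)) / 100.0
    return [(1.0 - mix) * d + mix * w for d, w in zip(dry, wet)]
```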
Contributing
We welcome contributions! Please see our Contributing Guidelines for details.
License
This project is licensed under the MIT License; see the LICENSE file for details.
Acknowledgments
- Original AICoverGen by SociallyIneptWeeb
- RVC (Retrieval-based Voice Conversion) framework
- MDXNet for audio separation
- All the amazing open-source audio processing libraries