---
title: Rvc Effects
sdk: gradio
emoji: π₯
colorFrom: indigo
colorTo: purple
---
# AICoverGen
An autonomous pipeline that creates covers with any RVC v2 trained AI voice from YouTube videos or a local audio file. It is aimed at developers who want to add singing functionality to their AI assistant, chatbot, or VTuber, and at people who want to hear their favourite characters sing their favourite songs.

<img width="1574" height="740" alt="image" src="https://github.com/user-attachments/assets/931189d8-e2e2-4240-84d6-52d7a13ac7f8" />

# AICoverGen Enhanced
**AI-Powered Voice Cover Generation with Advanced Audio Enhancement**

AICoverGen Enhanced is a tool for creating AI voice covers with built-in audio enhancement. This version adds noise reduction, EQ presets, dynamic range compression, harmonic enhancement, stereo widening, reverb, and gain control.

---

## New Features

### Advanced Audio Enhancement
- **AI Noise Reduction**: Remove background noise and artifacts
- **Professional EQ**: 5 EQ types: Balanced, Vocal Boost, Bass Boost, Treble Boost, Flat
- **Dynamic Range Compression**: Improve loudness and consistency
- **Harmonic Enhancement**: Add richness and warmth to vocals
- **Stereo Widening**: Enhance spatial imaging for stereo tracks
- **Reverb Control**: Add depth and professional polish
- **Gain Control**: Fine-tune volume (-20 to +20 dB)
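
The gain range is expressed in decibels, which correspond to a linear amplitude factor of 10^(dB/20). A minimal sketch of that conversion follows; the function name `db_to_linear` is illustrative and not taken from the project's code.

```python
def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude multiplier."""
    # Clamp to the documented -20 to +20 dB range before converting.
    gain_db = max(-20.0, min(20.0, gain_db))
    return 10.0 ** (gain_db / 20.0)
```

At the ends of the range, +20 dB multiplies the amplitude by 10 and -20 dB divides it by 10.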

### Enhancement Types
- **Full**: Balanced enhancement with all features
- **Light**: Subtle improvements for natural sound
- **Aggressive**: Maximum enhancement for impact
- **Custom**: Use your specific settings

---

## System Requirements

### Minimum Requirements
- **OS:** Windows 10/11, Linux, or macOS
- **Python:** 3.9+ (3.10+ recommended)
- **RAM:** 8GB minimum, 16GB recommended
- **Storage:** 10GB free space
- **GPU:** NVIDIA GPU with CUDA support (recommended)

### Recommended Setup
- **OS:** Windows 11 or Ubuntu 20.04+
- **Python:** 3.10 or 3.11
- **RAM:** 16GB or more
- **GPU:** NVIDIA RTX 3060 or better
- **CUDA:** 11.8 or 12.0+
- **cuDNN:** 8.6 or 9.0+

---

## Installation Guide

### Step 1: Clone the Repository
```bash
git clone https://github.com/SociallyIneptWeeb/AICoverGen.git
cd AICoverGen
```

### Step 2: Create Virtual Environment
```bash
# Windows
python -m venv AICoverGen
AICoverGen\Scripts\activate

# Linux/macOS
python3 -m venv AICoverGen
source AICoverGen/bin/activate
```

### Step 3: Install Dependencies

#### Option A: Automatic Installation (Recommended)
```bash
pip install -r requirements.txt
```

#### Option B: Manual Installation
```bash
# Core dependencies
pip install gradio==3.50.2 librosa==0.9.1 numpy==1.23.5 scipy==1.11.1 soundfile==0.12.1
pip install pedalboard==0.7.7 pydub==0.25.1 fairseq==0.12.2 faiss-cpu==1.7.3 pyworld==0.3.4
# Quote any requirement containing ">=", otherwise the shell treats ">" as an output redirect
pip install "praat-parselmouth>=0.4.2" "ffmpeg-python>=0.2.0" tqdm==4.65.0 "yt-dlp>=2025.9.23" sox==1.4.1

# AI Audio Enhancement dependencies
pip install noisereduce==3.0.3 scikit-learn==1.6.1

# PyTorch with CUDA support
pip install torch==2.0.1+cu118 --find-links https://download.pytorch.org/whl/torch_stable.html
pip install torchcrepe==0.0.20

# ONNX Runtime with CUDA support
pip install onnxruntime-gpu==1.18.0
```

### Step 4: Download Models
```bash
python src/download_models.py
```

### Step 5: Verify Installation
```bash
python src/audio_enhancer.py
```
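
As a complement to the project's own check, installed package versions can be listed from Python without importing the heavy libraries. The sketch below uses only the standard library; the `REQUIRED` list is a hand-picked subset of the manual-installation packages and the function names are illustrative, not part of the project.

```python
from importlib import metadata

# Subset of the package names from the manual installation list (an assumption;
# extend it to match your requirements.txt).
REQUIRED = [
    "gradio", "librosa", "numpy", "scipy", "soundfile",
    "pedalboard", "noisereduce", "scikit-learn", "torch", "onnxruntime-gpu",
]


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def missing_packages(packages):
    """Return the sorted names of packages that are not installed."""
    found = installed_versions(packages)
    return sorted(name for name, version in found.items() if version is None)
```

Calling `missing_packages(REQUIRED)` inside the activated virtual environment returns an empty list when every listed package is present.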

---

## Usage

### Quick Start
```bash
python src/webui.py
```
Then open your browser and go to: [http://127.0.0.1:7860](http://127.0.0.1:7860)

1. Provide a song (YouTube URL or audio file)
2. Select a voice model from the dropdown
3. Configure audio enhancement:
   - Expand the "AI Audio Enhancement" section
   - Choose enhancement type (Full/Light/Aggressive/Custom)
   - Choose EQ type (Balanced/Vocal Boost/Bass Boost/Treble Boost/Flat)
   - Set noise reduction strength (0–100%)
   - Adjust gain (-20 to +20 dB)
   - Set compression ratio (1–10)
   - Set reverb amount (0–100%)
4. Click **Generate** and enjoy your enhanced AI cover!
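
The ranges listed in step 3 can be captured in a small settings object that rejects unknown presets and clamps numeric values. This is a sketch only: the class name, field names, and default values are hypothetical, and the web UI may validate its inputs differently.

```python
from dataclasses import dataclass

ENHANCEMENT_TYPES = ("Full", "Light", "Aggressive", "Custom")
EQ_TYPES = ("Balanced", "Vocal Boost", "Bass Boost", "Treble Boost", "Flat")


def _clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass
class EnhancementSettings:
    # Defaults are placeholders chosen for illustration, not the project's defaults.
    enhancement_type: str = "Full"
    eq_type: str = "Balanced"
    noise_reduction_pct: float = 50.0
    gain_db: float = 0.0
    compression_ratio: float = 4.0
    reverb_pct: float = 10.0

    def __post_init__(self):
        if self.enhancement_type not in ENHANCEMENT_TYPES:
            raise ValueError(f"unknown enhancement type: {self.enhancement_type!r}")
        if self.eq_type not in EQ_TYPES:
            raise ValueError(f"unknown EQ type: {self.eq_type!r}")
        # Clamp numeric controls to the ranges documented in the steps above.
        self.noise_reduction_pct = _clamp(self.noise_reduction_pct, 0.0, 100.0)
        self.gain_db = _clamp(self.gain_db, -20.0, 20.0)
        self.compression_ratio = _clamp(self.compression_ratio, 1.0, 10.0)
        self.reverb_pct = _clamp(self.reverb_pct, 0.0, 100.0)
```

For example, `EnhancementSettings(gain_db=35)` ends up with `gain_db` set to 20.0, while an unknown `eq_type` raises `ValueError`.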

---

## Troubleshooting

### CUDA Not Detected
```bash
# Check that the NVIDIA driver can see the GPU
nvidia-smi

# Verify PyTorch CUDA support
python -c "import torch; print(torch.cuda.is_available())"

# Check ONNX Runtime CUDA (the provider is listed as 'CUDAExecutionProvider')
python -c "import onnxruntime as ort; print('CUDAExecutionProvider' in ort.get_available_providers())"
```
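
The two Python checks can also be gathered into one script that tolerates a missing library. This is a sketch under the assumption that only PyTorch and ONNX Runtime need checking; the function name `cuda_report` is illustrative.

```python
import importlib


def cuda_report():
    """Report CUDA availability per library; None means the library is not installed."""
    report = {"torch_cuda": None, "onnxruntime_cuda": None}
    try:
        torch = importlib.import_module("torch")
        report["torch_cuda"] = bool(torch.cuda.is_available())
    except ImportError:
        pass  # PyTorch is not installed in this environment
    try:
        ort = importlib.import_module("onnxruntime")
        # ONNX Runtime lists its CUDA backend under this exact provider name.
        providers = ort.get_available_providers()
        report["onnxruntime_cuda"] = "CUDAExecutionProvider" in providers
    except ImportError:
        pass  # ONNX Runtime is not installed in this environment
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

A value of `False` for a library that is installed usually points to a CPU-only build or a driver/CUDA version mismatch.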

### Audio Enhancement Errors
```bash
# Test audio enhancer
python src/audio_enhancer.py

# Check dependencies
pip list | grep -E "(noisereduce|scikit-learn|pedalboard)"
```

### Memory Issues
- Reduce batch size in settings
- Use CPU-only mode for ONNX Runtime
- Close other applications to free RAM
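
In ONNX Runtime, CPU-only mode amounts to passing only `CPUExecutionProvider` when a session is created. How this project exposes that choice is not documented here, so the helper below is a hypothetical sketch of the selection logic rather than the project's own code.

```python
def select_providers(available, force_cpu=False):
    """Choose ONNX Runtime execution providers, preferring CUDA unless CPU is forced."""
    if not force_cpu and "CUDAExecutionProvider" in available:
        # Keep the CPU provider as a fallback for operators CUDA cannot run.
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


# Usage sketch ("model.onnx" is a placeholder path, not a file shipped with the project):
#   import onnxruntime as ort
#   providers = select_providers(ort.get_available_providers(), force_cpu=True)
#   session = ort.InferenceSession("model.onnx", providers=providers)
```

Forcing the CPU provider trades speed for lower GPU memory use, which can help when the voice conversion model and the separation models compete for VRAM.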

---

## Project Structure
```
AICoverGen_Enhanced/
├── src/
│   ├── webui.py              # Main web interface
│   ├── main.py               # Core pipeline with audio enhancement
│   ├── audio_enhancer.py     # AI audio enhancement module
│   ├── rvc.py                # RVC voice conversion
│   └── mdx.py                # Audio separation
├── rvc_models/               # Voice models
├── mdxnet_models/            # Audio separation models
├── song_output/              # Generated covers
├── requirements.txt          # Dependencies
└── README_Enhanced.md        # This file
```

---

## Audio Enhancement Features

### AI Noise Reduction
- Uses ML to identify and remove background noise
- Preserves vocal clarity while eliminating artifacts
- Adjustable strength (0–100%)
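
The dependency list includes `noisereduce`, whose `reduce_noise` function takes a `prop_decrease` argument between 0 and 1. One plausible wiring, and it is only an assumption about how the strength slider is used, is to scale the percentage into that argument. The function names below are illustrative.

```python
def strength_to_prop_decrease(strength_pct: float) -> float:
    """Map a 0-100% strength setting to noisereduce's 0.0-1.0 prop_decrease."""
    return max(0.0, min(100.0, strength_pct)) / 100.0


def denoise(audio, sample_rate, strength_pct=50.0):
    """Apply noisereduce to a NumPy audio array (assumed wiring, not the project's code)."""
    # Imported lazily so the mapping helper above works without the library installed.
    import noisereduce as nr

    return nr.reduce_noise(
        y=audio,
        sr=sample_rate,
        prop_decrease=strength_to_prop_decrease(strength_pct),
    )
```

With this mapping, 0% leaves the signal untouched and 100% applies the full estimated noise reduction.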

### Professional EQ
- **Balanced**: Gentle mid boost for clarity
- **Vocal Boost**: Emphasizes 800–3000 Hz range
- **Bass Boost**: Enhances 60–250 Hz
- **Treble Boost**: Brightens 4–16 kHz
- **Flat**: Minimal processing with high-pass filter
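
A frequency band like those above is often turned into a single peaking filter described by a center frequency and a Q factor. The sketch below uses the common convention of a geometric-mean center and Q = center / bandwidth; whether the project uses peaking filters, shelves, or something else is not stated, so treat the mapping and the names as assumptions.

```python
import math

# Band edges in Hz, copied from the preset descriptions above.
EQ_BANDS_HZ = {
    "Vocal Boost": (800.0, 3000.0),
    "Bass Boost": (60.0, 250.0),
    "Treble Boost": (4000.0, 16000.0),
}


def peak_filter_params(low_hz: float, high_hz: float):
    """Return (center frequency in Hz, Q) for a peaking filter spanning the band."""
    if not 0.0 < low_hz < high_hz:
        raise ValueError("expected 0 < low_hz < high_hz")
    center = math.sqrt(low_hz * high_hz)  # geometric mean of the band edges
    q = center / (high_hz - low_hz)       # Q = center frequency / bandwidth
    return center, q


if __name__ == "__main__":
    for preset, (low, high) in EQ_BANDS_HZ.items():
        center, q = peak_filter_params(low, high)
        print(f"{preset}: center = {center:.0f} Hz, Q = {q:.2f}")
```

For the Treble Boost band this gives a center of 8000 Hz and a Q of about 0.67, which describes a fairly broad boost.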

### Dynamic Range Compression
- Improves loudness consistency
- Reduces dynamic range for streaming
- Configurable ratio (1–10)
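
The ratio describes how strongly levels above a threshold are reduced: with a ratio of 4, a signal 12 dB over the threshold comes out only 3 dB over it. The following sketch shows that static curve for a hard-knee compressor. The threshold default is a hypothetical value, and attack/release smoothing, which any practical compressor needs, is left out.

```python
def compress_level_db(level_db: float, threshold_db: float = -18.0, ratio: float = 4.0) -> float:
    """Static hard-knee compressor curve: input level in dB -> output level in dB."""
    if ratio < 1.0:
        raise ValueError("ratio must be >= 1")
    if level_db <= threshold_db:
        return level_db  # below the threshold the signal passes unchanged
    # Above the threshold, each `ratio` dB of input yields 1 dB of output.
    return threshold_db + (level_db - threshold_db) / ratio


def gain_reduction_db(level_db: float, threshold_db: float = -18.0, ratio: float = 4.0) -> float:
    """Amount of attenuation (in dB) the compressor applies at this input level."""
    return level_db - compress_level_db(level_db, threshold_db, ratio)
```

A ratio of 1 leaves the signal unchanged, and higher ratios approach limiting.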

### Harmonic Enhancement
- Adds warmth and richness
- Uses soft saturation for natural harmonics
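
Soft saturation is commonly implemented as a smooth waveshaper such as `tanh`, which rounds off peaks and adds odd harmonics. The curve the project actually uses is not specified, so the `tanh` shaper and the `drive` parameter below are assumptions. The example is plain Python for clarity; real audio code would vectorize this with NumPy.

```python
import math


def soft_saturate(samples, drive: float = 2.0):
    """Apply a tanh waveshaper, normalized so a full-scale input of 1.0 stays at 1.0."""
    if drive <= 0.0:
        raise ValueError("drive must be positive")
    norm = math.tanh(drive)
    # Higher drive bends the curve more, which adds stronger harmonics.
    return [math.tanh(drive * s) / norm for s in samples]
```

Mid-level samples are lifted toward full scale while peaks are held at or below it, which is where the perceived warmth comes from.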

### Stereo Widening
- Improves spatial imaging
- Enhances left-right separation
- Creates immersive experience
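
A standard way to widen a stereo signal is mid/side processing: split the channels into their sum (mid) and difference (side), scale the side component, and recombine. The sketch below assumes that technique; the project's actual method and the `width` parameter name are not confirmed by this README.

```python
def widen_stereo(left, right, width: float = 1.5):
    """Mid/side widening: width 1.0 is unchanged, 0.0 is mono, above 1.0 is wider."""
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = (l + r) / 2.0           # content common to both channels
        side = (l - r) / 2.0 * width  # scaled difference between the channels
        out_left.append(mid + side)
        out_right.append(mid - side)
    return out_left, out_right
```

Widths well above 1.0 can push samples past full scale, so a gain trim or limiter after widening is a sensible precaution.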

### Reverb Control
- Adds subtle depth and space
- Professional room simulation
- Configurable wet/dry mix
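
The wet/dry mix is a linear blend between the unprocessed (dry) signal and the reverberated (wet) one. In the sketch below, a single feedback comb filter stands in for the room simulation, which is far simpler than a real reverb; the function names, delay length, and feedback value are all illustrative assumptions.

```python
def comb_reverb(dry, delay_samples: int = 4, feedback: float = 0.5):
    """Feedback comb filter: each output adds a decayed copy of the output delay_samples ago."""
    if delay_samples < 1:
        raise ValueError("delay_samples must be at least 1")
    wet = [0.0] * len(dry)
    for n, x in enumerate(dry):
        delayed = wet[n - delay_samples] if n >= delay_samples else 0.0
        wet[n] = x + feedback * delayed
    return wet


def mix_wet_dry(dry, wet, reverb_pct: float):
    """Blend dry and wet signals; 0% is fully dry and 100% is fully wet."""
    mix = max(0.0, min(100.0, reverb_pct)) / 100.0
    return [(1.0 - mix) * d + mix * w for d, w in zip(dry, wet)]
```

Feeding in a single impulse produces echoes that halve in level every `delay_samples` samples, and the reverb amount controls how much of that tail reaches the output.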

---

## Contributing
We welcome contributions! Please see our **Contributing Guidelines** for details.

---

## License
This project is licensed under the **MIT License**; see the LICENSE file for details.

---

## Acknowledgments
- Original AICoverGen by **SociallyIneptWeeb**
- RVC (Retrieval-based Voice Conversion) framework
- MDXNet for audio separation
- All the amazing open-source audio processing libraries