Quevedo Voice Model (so-vits-svc-fork)
This repository contains the voice model of the Spanish singer Quevedo, trained for use with the so-vits-svc-fork library (version 3.10.3+ / 4.0.0+).
Table of Contents
- Model Specifications
- Repository Structure
- Quick Installation
- CLI Usage
- Python API Usage
- Gradio WebUI Interface
- Hugging Face Spaces Deployment
- Optimization & Tuning Tips
- Ethical Disclaimer
Model Specifications
| Feature | Value |
|---|---|
| Speaker ID | quevedo (Index: 0) |
| Sampling Rate | 44100 Hz (44.1 kHz) |
| Base Architecture | VITS with SoftVC content encoder (HuBERT) |
| Fork Target Version | so-vits-svc-fork v3.x / v4.x |
| Pipeline Tag | Audio-to-Audio (Singing/Speech Voice Conversion) |
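The values in the table can be cross-checked against config.json before running inference. The sketch below assumes the usual so-vits-svc layout, with a `data.sampling_rate` entry and a top-level `spk` mapping of speaker name to index; the helper name `summarize_config` is illustrative and not part of the repository.

```python
import json
from pathlib import Path


def summarize_config(config_path):
    """Return (sampling_rate, speakers) read from a so-vits-svc style config.json.

    Assumption: the file has a "data" section holding "sampling_rate" and a
    top-level "spk" mapping of speaker name -> index. Missing keys come back
    as None and {} respectively instead of raising.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    sampling_rate = config.get("data", {}).get("sampling_rate")
    speakers = config.get("spk", {})
    return sampling_rate, speakers


if __name__ == "__main__":
    path = Path("config.json")
    if path.exists():
        rate, speakers = summarize_config(path)
        print(f"sampling rate: {rate} Hz, speakers: {speakers}")
```

If the printed sampling rate or speaker name differs from the table, the `-s` value and any resampling of the input should follow config.json rather than this page.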
Repository Structure
- `G_777.pth`: Generator model weight file (Git LFS).
- `config.json`: Model configuration file detailing training hyperparameters and speaker metadata.
- `app.py`: Custom-themed interactive graphical interface built with Gradio.
- `requirements.txt`: Package requirements to run the inference and the Web UI.
- `assets/banner.png`: Cover image representing the model repository.
Quick Installation
To run this model on your local machine, set up a Python environment first (Python 3.10 or 3.11 is recommended):
# 1. Clone the repository
git clone https://huggingface.co/lagosproject/quevedo
cd quevedo
# 2. Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows use: venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
FFmpeg must be installed on your system for audio file processing. On Ubuntu/Debian, run `sudo apt install ffmpeg`. On macOS or Windows, install it with your preferred package manager (e.g. `brew install ffmpeg` or `choco install ffmpeg`).
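A quick way to confirm that FFmpeg is reachable from the environment you just created is to look it up on `PATH`. This is a minimal sketch using only the standard library; `find_ffmpeg` is an illustrative helper, not something shipped with the repository.

```python
import shutil


def find_ffmpeg(executable="ffmpeg"):
    """Return the full path of the FFmpeg binary, or None if it is not on PATH."""
    return shutil.which(executable)


if __name__ == "__main__":
    location = find_ffmpeg()
    if location is None:
        print("FFmpeg not found on PATH; install it before running inference.")
    else:
        print(f"FFmpeg found at {location}")
```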
CLI Usage
Perform voice conversions directly from your terminal using the svc console script:
# Basic inference
svc infer path/to/input.wav -m G_777.pth -c config.json -s quevedo -o output.wav
# Transposed inference (+3 semitones for high pitch shifts)
svc infer path/to/input.wav -m G_777.pth -c config.json -s quevedo -t 3 -fm crepe -o output.wav
Useful CLI arguments:
- `-m` / `--model-path`: Path to the generator checkpoint (`G_777.pth`).
- `-c` / `--config-path`: Path to the configuration file (`config.json`).
- `-s` / `--speaker`: Speaker name (`quevedo`).
- `-t` / `--transpose`: Pitch shift in semitones (negative numbers shift pitch down, positive numbers shift pitch up).
- `-fm` / `--f0-method`: Pitch tracking algorithm. Recommended choices: `crepe` (highest accuracy) or `dio` (fastest).
Python API Usage
To run voice conversion programmatically inside a custom Python script:
from pathlib import Path
from so_vits_svc_fork.inference.main import infer
# Configure paths
input_audio = Path("vocals_input.wav")
output_audio = Path("quevedo_output.wav")
model_path = Path("G_777.pth")
config_path = Path("config.json")
# Execute inference
infer(
    input_path=input_audio,
    output_path=output_audio,
    model_path=model_path,
    config_path=config_path,
    recursive=False,
    speaker="quevedo",
    transpose=0,  # Adjust if input vocals are in a different octave
    auto_predict_f0=False,  # Keep False for singing (preserves melody), True for speaking
    f0_method="crepe",  # Crepe offers the highest quality pitch extraction
    noise_scale=0.4,
)
print(f"Conversion complete: {output_audio}")
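The same call can be wrapped in a loop to convert a folder of vocals. This is a sketch under the assumption that each file is converted independently with the parameters used in the example above; `plan_outputs` and `convert_all` are illustrative names, and the `_quevedo` suffix is an arbitrary choice.

```python
from pathlib import Path


def plan_outputs(input_dir, output_dir, suffix="_quevedo"):
    """Pair every WAV in input_dir with an output path in output_dir."""
    output_dir = Path(output_dir)
    return [
        (wav, output_dir / f"{wav.stem}{suffix}.wav")
        for wav in sorted(Path(input_dir).glob("*.wav"))
    ]


def convert_all(input_dir, output_dir, transpose=0):
    """Convert each planned file with the same settings as the single-file example."""
    # Imported lazily so plan_outputs works without the package installed.
    from so_vits_svc_fork.inference.main import infer

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for source, target in plan_outputs(input_dir, output_dir):
        infer(
            input_path=source,
            output_path=target,
            model_path=Path("G_777.pth"),
            config_path=Path("config.json"),
            speaker="quevedo",
            transpose=transpose,
            auto_predict_f0=False,
            f0_method="crepe",
            noise_scale=0.4,
        )
```

Calling `infer` once per file is simple but probably reloads the checkpoint each time, so it suits small batches better than large ones.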
Gradio WebUI Interface
The repository includes a modern web interface built with Gradio. To run it locally:
python app.py
Once it starts, navigate to http://localhost:7860 in your web browser.
UI Highlights:
- Drag & Drop Upload: Easily upload any WAV/MP3 files or record directly from your microphone.
- Visual Parameter Controls: Adjust Pitch Shift, F0 Predictor (`crepe`, `dio`, `harvest`), and Noise Scale interactively.
- Responsive Layout: Designed with a clean glassmorphism dark-mode theme using customized indigo and purple gradients.
Hugging Face Spaces Deployment
To make the model usable online without requiring a local installation:
- Create a new Space on your Hugging Face account.
- Select Gradio as the Space SDK.
- Choose your hardware (a free CPU basic instance is fine, but GPU hardware speeds up inference considerably).
- Upload all files from this repository to the Space (including `app.py`, `requirements.txt`, `config.json`, `G_777.pth`, and the `assets/` folder).
- The Space will build and deploy the WebUI automatically.
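The steps above can also be scripted with the `huggingface_hub` client instead of the web interface. This is a sketch, not the author's deployment procedure: it assumes `huggingface_hub` is installed, that you are logged in with a write token, and that `space_id` is a placeholder such as `your-username/quevedo-demo`. The helper names are illustrative.

```python
from pathlib import Path

# Files this README lists as needed by the Space (the assets/ folder is optional here).
REQUIRED_FILES = ["app.py", "requirements.txt", "config.json", "G_777.pth"]


def missing_files(folder):
    """List the required files that are absent from folder."""
    folder = Path(folder)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


def deploy_space(folder, space_id):
    """Create a Gradio Space (if needed) and upload the folder to it."""
    missing = missing_files(folder)
    if missing:
        raise FileNotFoundError(f"Missing before upload: {', '.join(missing)}")

    # Imported lazily; requires `pip install huggingface_hub` and a saved token.
    from huggingface_hub import HfApi

    api = HfApi()
    api.create_repo(repo_id=space_id, repo_type="space",
                    space_sdk="gradio", exist_ok=True)
    api.upload_folder(
        folder_path=str(folder),
        repo_id=space_id,
        repo_type="space",
        ignore_patterns=["venv/*"],  # skip a local virtual environment, if any
    )
```

Checking for missing files first avoids a Space that builds but fails at startup because the checkpoint or configuration was never uploaded.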
Optimization & Tuning Tips
Follow these guidelines to get the best vocal quality from the model:
- Pitch Adjustments: Quevedo has a deep, resonant baritone singing range.
  - If the source vocals are from a female singer, apply a negative pitch shift (typically -8 to -12 semitones).
  - If the source vocals are from a male tenor, shift down by 3 to 6 semitones (a transpose value of -3 to -6).
  - If the source vocals are already in a deep baritone range, keep the transposition at 0.
- Singing vs. Speech:
  - For songs, disable Auto Predict F0 so the output keeps the exact notes of the original track.
  - For speech or voice acting, enable Auto Predict F0 so the model generates natural speech intonation.
- Vocal Preparation:
  - Input audio files must be clean, dry acapellas. Background instruments, beats, reverb, noise, or echo will distort the output audio.
  - For long inputs (more than 45 seconds), slice the audio into smaller files to avoid running out of memory (OOM).
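Two of the tips above reduce to simple arithmetic. A shift of n semitones multiplies every frequency by 2^(n/12), so -12 semitones lowers the voice by exactly one octave, and a long recording can be cut into consecutive spans of at most 45 seconds. The sketch below illustrates both; the function names are hypothetical, and the hard cuts it computes ignore where silences fall, which a real slicer would prefer.

```python
def semitone_ratio(semitones):
    """Frequency ratio for a shift of `semitones` in 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


def chunk_bounds(total_seconds, max_seconds=45.0):
    """Split [0, total_seconds] into consecutive (start, end) spans.

    Each span is at most max_seconds long; the last one may be shorter.
    """
    total = float(total_seconds)
    if total <= 0 or max_seconds <= 0:
        return []
    bounds = []
    start = 0.0
    while start < total:
        end = min(start + max_seconds, total)
        bounds.append((start, end))
        start = end
    return bounds


if __name__ == "__main__":
    # A 440 Hz note shifted down 12 semitones lands one octave lower.
    print(440.0 * semitone_ratio(-12))
    # A 100-second vocal take split into chunks of at most 45 seconds.
    print(chunk_bounds(100))
```

Each span can then be exported as its own file and converted separately, with the results concatenated in order afterwards.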
Ethical Disclaimer
This model is intended for artistic, research, and educational purposes. It should not be used to impersonate individuals for fraudulent, misleading, or defamatory purposes.
- If you share covers or musical works created using this model, please label them clearly as AI-generated (e.g., "AI Cover").
- Respect local regulations and the moral rights of the original artist. The author of this repository is not responsible for malicious usage by third parties.