Speech-to-Speech Model: so-vits-svc

Overview

This repository contains a speech-to-speech model built with so-vits-svc and trained to mimic the voice of Chamber, a character from the game Valorant. The model is intended for voice conversion and speech spoofing applications. It favors accuracy over speed: the RVC model is faster but less precise.

Model Choice

The so-vits-svc model was chosen over the RVC model for its higher accuracy, despite RVC's speed advantage. Future plans include training the model to convert a wider variety of voices, for example transforming songs into my own voice.

Model Details: so-vits-svc Model

The so-vits-svc model (SoftVC VITS Singing Voice Conversion) is a singing voice conversion model built on VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech). It is well suited to high-quality voice conversion tasks where accuracy matters more than speed. Here's a brief overview of its components and functionality:

VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech): An end-to-end text-to-speech architecture that combines variational inference with adversarial training. so-vits-svc reuses it for voice conversion, which allows more nuanced and accurate results.
Singing Voice Conversion: Originally designed for converting singing voices, the model also adapts to ordinary speech, which makes it usable for character voice mimicking and real-time voice conversion.

Training Details

Dataset

Character: Chamber from Valorant
Voice Lines: Approximately 500 voice lines
Source: Downloaded from a website providing mp3 files of Valorant character voices
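
The preprocessing applied to the downloaded mp3 files is not described here. As a minimal sketch, assuming the clips are converted to mono WAV at 44.1 kHz before training (a common choice for so-vits-svc, but an assumption for this model), the helper below builds one ffmpeg command per mp3 file. The directory names `voice_lines_mp3` and `dataset_raw/chamber` and both function names are hypothetical. The script only prints the commands and does not run ffmpeg.

```python
from pathlib import Path

# Assumption: 44.1 kHz mono WAV; adjust to whatever the training config expects.
TARGET_SAMPLE_RATE = 44100


def build_ffmpeg_command(src: Path, dst_dir: Path,
                         sample_rate: int = TARGET_SAMPLE_RATE) -> list[str]:
    """Return the ffmpeg argument list that converts one mp3 to a mono WAV."""
    dst = dst_dir / (src.stem + ".wav")
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(src),           # input mp3
        "-ac", "1",               # downmix to a single (mono) channel
        "-ar", str(sample_rate),  # resample to the target rate
        str(dst),
    ]


def plan_conversion(mp3_dir: Path, wav_dir: Path) -> list[list[str]]:
    """Build one command per mp3 file in mp3_dir, in sorted order."""
    return [build_ffmpeg_command(p, wav_dir) for p in sorted(mp3_dir.glob("*.mp3"))]


if __name__ == "__main__":
    # Hypothetical folder names, used for illustration only.
    for cmd in plan_conversion(Path("voice_lines_mp3"), Path("dataset_raw/chamber")):
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the printed commands look right.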

Training Process

Epochs: 2000
Duration: Approximately 24 hours (including a 2-hour break)
Hardware: RTX 3070 GPU with 8GB VRAM
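
For a rough sense of throughput, the figures above can be turned into an average time per epoch. This is a back-of-the-envelope sketch that assumes training was fully paused during the 2-hour break and treats the approximate 24-hour duration as exact. The function name is illustrative.

```python
EPOCHS = 2000
WALL_CLOCK_HOURS = 24.0  # approximate total duration, including the break
BREAK_HOURS = 2.0        # assumption: no training happened during the break


def seconds_per_epoch(epochs: int, wall_clock_hours: float, break_hours: float) -> float:
    """Average active training time per epoch, in seconds."""
    active_seconds = (wall_clock_hours - break_hours) * 3600
    return active_seconds / epochs


if __name__ == "__main__":
    avg = seconds_per_epoch(EPOCHS, WALL_CLOCK_HOURS, BREAK_HOURS)
    print(f"{avg:.1f} s per epoch")  # prints "39.6 s per epoch"
```

With roughly 500 short voice lines, that works out to about 22 hours of active training at just under 40 seconds per epoch on the RTX 3070.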

Future Work

The model will be trained further and tested on a more diverse set of voices. One possible application is converting songs into a specified voice.

Web Application

Plans are underway to develop a web application that lets users convert their voice into different character voices in real time. The goal is a fun way to play games and prank friends.

Acknowledgments

Valorant for providing the character and voice lines.
Online resources for the mp3 files.

Contact

Made with passion by Anirudh Sai Lanka.
For any queries or contributions, please contact me at anirudh2002sai1234@gmail.com.

License: MIT
