
Speech-to-Speech Model: so-vits-svc

Overview

This repository contains a so-vits-svc speech-to-speech model trained to mimic the voice of Chamber, a character from the game Valorant. The model is intended for voice conversion and speech spoofing applications. It favors conversion accuracy over speed: models such as RVC run faster but are less precise.

Model Choice

The so-vits-svc model was chosen over the RVC model for its superior accuracy, despite RVC's speed advantage. Future plans include training the model on a wider variety of voices, for example to convert songs into my own voice.

Model Details: so-vits-svc Model

The so-vits-svc model (SoftVC VITS Singing Voice Conversion) is a singing voice conversion model built on VITS. It is particularly suited to high-quality voice conversion, which makes it a good fit for applications where accuracy is crucial. Here is a brief overview of its components and functionality:

  • VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech): A text-to-speech architecture that combines variational inference with adversarial training, allowing for more nuanced and accurate voice conversions.
  • Singing Voice Conversion: Originally designed for converting singing voices, the model also adapts to spoken speech, which makes it usable for character voice mimicking and real-time voice conversion.

Training Details

Dataset

  • Character: Chamber from Valorant
  • Voice Lines: Approximately 500 voice lines
  • Source: Downloaded from a website providing mp3 files of Valorant character voices
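
Voice lines downloaded as mp3 usually need converting to WAV before training. The following is a minimal sketch of that step, not the procedure actually used for this model. It assumes ffmpeg is installed and that 44.1 kHz mono WAV is the target format; the helper name and the directory names are hypothetical. The function only builds the ffmpeg commands, so they can be inspected before anything is run.

```python
from pathlib import Path


def build_ffmpeg_commands(src_dir, dst_dir, sample_rate=44100):
    """Return one ffmpeg argument list per .mp3 file found in src_dir.

    Hypothetical helper: the 44.1 kHz mono target is an assumption,
    not a setting confirmed by this repository.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    commands = []
    for mp3 in sorted(src.glob("*.mp3")):
        wav = dst / (mp3.stem + ".wav")
        commands.append([
            "ffmpeg", "-y",            # overwrite existing output files
            "-i", str(mp3),            # input mp3 voice line
            "-ar", str(sample_rate),   # resample to the target rate
            "-ac", "1",                # downmix to mono
            str(wav),
        ])
    return commands
```

Each returned command can then be executed with `subprocess.run(cmd, check=True)`, for example over `build_ffmpeg_commands("voice_lines_mp3", "dataset_raw/chamber")`, once the output directory exists.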

Training Process

  • Epochs: 2000
  • Duration: Approximately 24 hours (including a 2-hour break)
  • Hardware: RTX 3070 GPU with 8GB VRAM
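
For a rough sense of training speed, the figures above can be turned into an average time per epoch. This back-of-the-envelope sketch assumes the 2-hour break is counted inside the 24 hours and that training was paused during it. Both inputs are approximate, so the result is too.

```python
total_hours = 24   # reported wall-clock duration, including the break
break_hours = 2    # assumed to be time when training was paused
epochs = 2000

training_seconds = (total_hours - break_hours) * 3600
seconds_per_epoch = training_seconds / epochs

print(f"{seconds_per_epoch:.1f} s per epoch")  # prints "39.6 s per epoch"
```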

Future Work

  • Further training and experiments are planned to cover more diverse voices. One possible application is converting songs to a specified voice.
  • Web Application
    • A web application is planned that lets users convert their voice to different characters in real time. It aims to provide a fun way to play games and prank friends by speaking in various character voices.

Acknowledgments

  • Valorant for providing the character and voice lines.
  • Online resources for the mp3 files.

Contact


license: mit
