SeoulStreamingStation
/

BlackTone_RVC_Pretrained


	---
	license: apache-2.0
	tags:
	- voice-changer
	- rvc
	- spin
	- f0-categorized
	- non-speech
	- pretrained-model
	datasets:
	- custom
	language:
	- multilingual
	---

	# 🔊 KLM BlackTone Large (Pretrained Voice Changer Model)

	KLM BlackTone Large is a pretrained model for real-time voice conversion, built using the Spin Embedder. It was trained on more than 11,000 hours of audio from 595 speakers, categorized by F0 (fundamental frequency) range.

	---

	## 🚀 Requirements

	To use this model without errors, use Applio v3.2.9 or later, and make sure the build was updated after July 15.
	Even if your Applio version is labeled 3.2.9, a build updated before July 15 may still produce `mismatch` errors.

	---

	## 🌟 Key Features

	- ✅ 595 F0-Categorized Speakers
	- ✅ 11,000+ hours of curated training data
	- ✅ Spin Embedder for flexible and generalizable voice transfer
	- ✅ Real-time inference support
	- ✅ Extensive support for non-verbal sounds such as:
	  - Coughs
	  - Laughter
	  - Whispers
	  - Other expressive human vocal behaviors

	The non-verbal sound support comes from the large proportion of non-speech data included in the training set.

	---

	## 📊 Recommended Usage

	To make full use of BlackTone’s capabilities, especially non-speech inference, include 10–20% non-speech data in your fine-tuning dataset.
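
The 10–20% guideline can be turned into a quick calculation. The sketch below is illustrative only: the function names are hypothetical, and measuring the share by audio duration (rather than by clip count) is an assumption that the guideline itself does not state.

```python
def non_speech_share(speech_hours: float, non_speech_hours: float) -> float:
    """Fraction of the total dataset duration that is non-speech."""
    total = speech_hours + non_speech_hours
    if total <= 0:
        raise ValueError("dataset is empty")
    return non_speech_hours / total


def non_speech_needed(speech_hours: float, target_share: float) -> float:
    """Hours of non-speech audio to add so it makes up target_share of the total.

    Solves n / (speech_hours + n) = target_share for n.
    """
    if not 0 <= target_share < 1:
        raise ValueError("target_share must be in [0, 1)")
    return target_share * speech_hours / (1 - target_share)


# Example: 2 hours of speech, aiming for the 10-20% range.
low = non_speech_needed(2.0, 0.10)   # about 0.22 hours (roughly 13 minutes)
high = non_speech_needed(2.0, 0.20)  # 0.5 hours (30 minutes)
```

Under these assumptions, a 2-hour speech dataset would need somewhere between roughly 13 and 30 minutes of coughs, laughter, whispers, and similar material.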

	If you do not have such data, you can try setting the Feature Index to `0`, which may enable limited inference of non-verbal sounds. However, non-speech training data is highly recommended for best results.
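
In RVC-style pipelines, the index setting usually controls how strongly features retrieved from the training index are blended into the features extracted from the input audio. The sketch below assumes that the Feature Index setting mentioned above corresponds to that blend ratio. The function name and the plain-list representation are hypothetical and are not Applio's actual code.

```python
def blend_features(input_feats, retrieved_feats, index_ratio):
    """Linear blend: ratio * retrieved + (1 - ratio) * input, element-wise."""
    if not 0.0 <= index_ratio <= 1.0:
        raise ValueError("index_ratio must be between 0 and 1")
    if len(input_feats) != len(retrieved_feats):
        raise ValueError("feature vectors must have the same length")
    return [
        index_ratio * r + (1.0 - index_ratio) * x
        for x, r in zip(input_feats, retrieved_feats)
    ]


input_feats = [0.2, -0.5, 1.0]      # features from the input audio (made-up values)
retrieved_feats = [0.8, 0.1, 0.4]   # features looked up in the index (made-up values)

unchanged = blend_features(input_feats, retrieved_feats, 0.0)  # equals input_feats
replaced = blend_features(input_feats, retrieved_feats, 1.0)   # equals retrieved_feats
```

With a ratio of `0`, the retrieved features have no effect and the input features pass through unchanged. If the fine-tuning index contains no non-speech examples, this may be why disabling it helps non-verbal sounds in the input carry over.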

	---

	## 📎 License

	Apache 2.0

	---

	## 📫 Contact

	For usage reports, contributions, or collaborations, please open an issue or contact the model maintainer.