Voice Detection Model Trainer
This sub-project is dedicated to fine-tuning a custom AI Voice Detection model tailored to your specific audio samples and languages (Tamil, English, Hindi, Malayalam, Telugu).
Architecture
- Base Model: facebook/wav2vec2-large-xlsr-53 (Multilingual)
- Task: Audio Classification (Binary: HUMAN vs AI_GENERATED)
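As a rough illustration of the setup above, the base checkpoint can be loaded with a new two-class head through the Transformers auto classes. This is a minimal sketch, not the contents of train.py: the function name `load_base_model` and the integer ids assigned to HUMAN and AI_GENERATED are assumptions made for the example.

```python
# Label names come from the task description; the integer ids are an assumption.
ID2LABEL = {0: "HUMAN", 1: "AI_GENERATED"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}

BASE_MODEL = "facebook/wav2vec2-large-xlsr-53"


def load_base_model():
    """Load the base checkpoint with a freshly initialised two-class head."""
    # Imported lazily so the label maps can be used without transformers installed.
    from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

    feature_extractor = AutoFeatureExtractor.from_pretrained(BASE_MODEL)
    model = AutoModelForAudioClassification.from_pretrained(
        BASE_MODEL,
        num_labels=len(ID2LABEL),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return feature_extractor, model
```

Calling `load_base_model()` downloads the checkpoint on first use. The classification head starts untrained, so its predictions are meaningless until fine-tuning has run.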
Directory Structure
- data/: Put your training audio files here.
  - real/: Human voice samples.
  - fake/: AI-generated voice samples.
- output/: Fine-tuned model checkpoints will be saved here.
- train.py: Main fine-tuning script.
- prepare_data.py: Script to convert audio folders into Hugging Face datasets.
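The folder-to-dataset conversion described above might look roughly like the sketch below. It is not the actual prepare_data.py: the helper names `collect_samples` and `to_dataset`, the accepted file extensions, and the folder-to-label mapping are all assumptions. The 16 kHz default matches the sampling rate wav2vec2 models expect.

```python
from pathlib import Path

# Assumed mapping from folder name to class label; the extensions are also an assumption.
FOLDER_TO_LABEL = {"real": "HUMAN", "fake": "AI_GENERATED"}
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}


def collect_samples(data_dir="data"):
    """Return (path, label) pairs for audio files under real/ and fake/."""
    samples = []
    for folder, label in FOLDER_TO_LABEL.items():
        folder_path = Path(data_dir) / folder
        if not folder_path.is_dir():
            continue  # a missing folder simply contributes no samples
        for path in sorted(folder_path.rglob("*")):
            if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
                samples.append((str(path), label))
    return samples


def to_dataset(samples, sampling_rate=16000):
    """Wrap the pairs in a Hugging Face Dataset whose audio column decodes on access."""
    from datasets import Audio, Dataset  # lazy import: only needed for this step

    dataset = Dataset.from_dict(
        {
            "audio": [path for path, _ in samples],
            "label": [label for _, label in samples],
        }
    )
    # Audio files are decoded and resampled to `sampling_rate` when rows are read.
    return dataset.cast_column("audio", Audio(sampling_rate=sampling_rate))
```

A typical call would be `to_dataset(collect_samples("data"))`. Keeping the file scan separate from the dataset construction makes it possible to inspect the file list before any audio is decoded.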
Getting Started
- Collect Data: The more data you have, the better the accuracy. Aim for at least 100-500 samples per category per language.
- Setup Environment:
    pip install transformers datasets torch torchaudio accelerate
- Run Training:
    python train.py
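Before starting a training run, it can help to check the collected data against the lower bound of 100 samples per category per language suggested above. The sketch below assumes you have already counted samples per (language, category) pair yourself; the function name `underfilled_groups` and the example counts are hypothetical.

```python
def underfilled_groups(counts, minimum=100):
    """Return the (language, category) groups with fewer than `minimum` samples."""
    return sorted(group for group, n in counts.items() if n < minimum)


# Hypothetical counts, keyed by (language, category).
example_counts = {
    ("Tamil", "real"): 240,
    ("Tamil", "fake"): 80,
    ("Hindi", "real"): 150,
    ("Hindi", "fake"): 120,
}

print(underfilled_groups(example_counts))  # [('Tamil', 'fake')]
```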
Why a Custom Model?
Public models (mo-thecreator, etc.) are trained on general-purpose datasets. A custom model fine-tuned on the specific AI voices you need to detect (e.g., from the TTS engines you use) is likely to be more accurate for your use case.