vineetshukla.work@gmail.com

final commit

c5c9261 4 months ago

🎙️ Voice Detection Model Trainer

This sub-project is dedicated to fine-tuning a custom AI Voice Detection model tailored to your specific audio samples and languages (Tamil, English, Hindi, Malayalam, Telugu).

🏗️ Architecture

- Base Model: facebook/wav2vec2-large-xlsr-53 (Multilingual)
- Task: Audio Classification (Binary: HUMAN vs AI_GENERATED)
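
As a rough sketch of how this architecture could be wired up with the `transformers` library: the two class names are mapped to integer ids and handed to `AutoModelForAudioClassification`, which attaches a fresh classification head to the pretrained encoder. The label order and the `load_base_model` helper are assumptions for illustration; the actual `train.py` may do this differently.

```python
# Hypothetical label order; the real train.py may assign ids differently.
LABELS = ["HUMAN", "AI_GENERATED"]
LABEL2ID = {name: idx for idx, name in enumerate(LABELS)}
ID2LABEL = {idx: name for name, idx in LABEL2ID.items()}

BASE_MODEL = "facebook/wav2vec2-large-xlsr-53"


def load_base_model(model_name: str = BASE_MODEL):
    """Load the pretrained encoder with a new 2-class classification head.

    transformers is imported lazily so the label mapping above can be reused
    without the heavy dependency. Calling this downloads the checkpoint.
    """
    from transformers import AutoModelForAudioClassification

    return AutoModelForAudioClassification.from_pretrained(
        model_name,
        num_labels=len(LABELS),
        label2id=LABEL2ID,
        id2label=ID2LABEL,
    )
```

Storing the mapping in the model config means saved checkpoints report `HUMAN` or `AI_GENERATED` instead of bare class indices.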

📁 Directory Structure

- data/: Put your training audio files here.
  - real/: Human voice samples.
  - fake/: AI-generated voice samples.
- output/: Fine-tuned model checkpoints are saved here.
- train.py: Main fine-tuning script.
- prepare_data.py: Script that converts the audio folders into Hugging Face datasets.
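
The layout above can be turned into labeled records with a few lines of standard-library Python. This is only a sketch of the kind of step `prepare_data.py` might perform, not its actual contents: the folder-to-label mapping, the list of audio extensions, and the `build_manifest` name are all assumptions.

```python
from pathlib import Path

# Assumed mapping from folder name to class label; not confirmed by the source.
FOLDER_TO_LABEL = {"real": "HUMAN", "fake": "AI_GENERATED"}
# Assumed set of audio extensions to pick up.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}


def build_manifest(data_dir):
    """Return {"path", "label"} records for the audio files under data_dir."""
    records = []
    for folder, label in FOLDER_TO_LABEL.items():
        class_dir = Path(data_dir) / folder
        if not class_dir.is_dir():
            continue  # a missing folder simply contributes no samples
        for path in sorted(class_dir.rglob("*")):
            if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
                records.append({"path": str(path), "label": label})
    return records


if __name__ == "__main__":
    manifest = build_manifest("data")
    print(f"Found {len(manifest)} labeled audio files")
```

A list of records like this could then be handed to the `datasets` library, for example through `Dataset.from_list`, before feature extraction.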

🚀 Getting Started

Collect Data: Accuracy generally improves with more data. Aim for roughly 100 to 500 samples per category (real and fake) for each language.
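
A quick way to check the sample guideline before training is to count the files in each category folder. The sketch below assumes the `data/real` and `data/fake` layout and a hypothetical list of audio extensions. It counts per category only, because the layout does not say how languages are separated; a per-language check would need language subfolders or metadata.

```python
from pathlib import Path

AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}  # assumed formats
MIN_SAMPLES = 100  # lower end of the suggested sample range


def count_samples(data_dir, categories=("real", "fake")):
    """Count audio files in each category folder under data_dir."""
    counts = {}
    for category in categories:
        class_dir = Path(data_dir) / category
        files = class_dir.rglob("*") if class_dir.is_dir() else []
        counts[category] = sum(
            1 for p in files if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
        )
    return counts


def underfilled(counts, minimum=MIN_SAMPLES):
    """Return the categories that have fewer than `minimum` samples."""
    return sorted(name for name, n in counts.items() if n < minimum)


if __name__ == "__main__":
    counts = count_samples("data")
    print(counts)
    for name in underfilled(counts):
        print(f"Warning: '{name}' has fewer than {MIN_SAMPLES} samples")
```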

Setup Environment:

pip install transformers datasets torch torchaudio accelerate
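
To confirm the install worked before starting a long training run, a small check like the following can help. It is a sketch using only the standard library; the `missing_packages` helper is hypothetical and is not part of this repository.

```python
from importlib.util import find_spec

# Packages named in the pip install command above.
REQUIRED_PACKAGES = ["transformers", "datasets", "torch", "torchaudio", "accelerate"]


def missing_packages(packages=REQUIRED_PACKAGES):
    """Return the packages that cannot be found in the current environment."""
    return [name for name in packages if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```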

Run Training:
```
python train.py
```

🔧 Why a Custom Model?

Public models (mo-thecreator, etc.) are trained on general-purpose datasets. A custom model fine-tuned on the specific AI voices you encounter (e.g., from the particular TTS engines you use) is likely to be more accurate for your use case.