Wav2Vec2 Accent Classifier
Overview
This model is a fine-tuned version of facebook/wav2vec2-base for accent classification. It was trained on the Speech Accent Archive for 14 epochs and reaches ~97% accuracy on the validation set. It classifies speech into 11 accents and is reported to generalize well to unseen data.
Model Repository
Repository: vrund1346/wav2vec2_accent_classification_v2
Developer: Vrund Dobariya
Supported Accents
- Arabic, Dutch, English, French, German, Korean, Mandarin, Portuguese, Russian, Spanish, Turkish
Label to Accent Mappings
| Label | Accent |
|---|---|
| 0 | Arabic |
| 1 | Dutch |
| 2 | English |
| 3 | French |
| 4 | German |
| 5 | Korean |
| 6 | Mandarin |
| 7 | Portuguese |
| 8 | Russian |
| 9 | Spanish |
| 10 | Turkish |
Usage
You can use this model to classify accents in spoken audio. Given an input speech waveform, it outputs confidence scores for each accent along with the most probable accent prediction.
Performance
- Training Data: Speech Accent Archive
- Epochs: 14
- Accuracy: ~98%
- Model Architecture: Wav2Vec2
- Generalization: Works well on unseen data
Installation & Inference
To use this model, install the necessary dependencies:
pip install torch librosa transformers
Then, run inference using transformers:
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2ForSequenceClassification
import torch
import librosa

# Load the fine-tuned classifier and the base model's feature extractor.
# A feature extractor is enough for classification; no tokenizer is needed.
model_name = "vrund1346/wav2vec2_accent_classification_v2"
processor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2ForSequenceClassification.from_pretrained(model_name)

# Load audio as mono, resampled to the 16 kHz rate Wav2Vec2 expects
audio, sr = librosa.load("audio.wav", sr=16000)
input_values = processor(audio, return_tensors="pt", sampling_rate=sr).input_values

# Predict: the argmax over the logits is a label index (0-10, see the mapping table)
with torch.no_grad():
    logits = model(input_values).logits
predicted_accent = torch.argmax(logits, dim=-1).item()
print("Predicted Accent:", predicted_accent)
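The inference snippet above prints only a numeric label index. A minimal sketch of turning the logits into per-accent confidence scores and an accent name might look like the following. It uses only the standard library and assumes the logits have been converted to a plain list, for example with `logits[0].tolist()`. The names `ID2ACCENT`, `softmax`, and `accent_scores` are hypothetical helpers, not part of the model repository, and the example logits are made up for illustration.

```python
import math

# Label ids as listed in the "Label to Accent Mappings" table.
ID2ACCENT = {
    0: "Arabic", 1: "Dutch", 2: "English", 3: "French", 4: "German",
    5: "Korean", 6: "Mandarin", 7: "Portuguese", 8: "Russian",
    9: "Spanish", 10: "Turkish",
}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def accent_scores(logits):
    """Return (accent, probability) pairs, most likely accent first."""
    if len(logits) != len(ID2ACCENT):
        raise ValueError(f"expected {len(ID2ACCENT)} logits, got {len(logits)}")
    probs = softmax(logits)
    pairs = [(ID2ACCENT[i], p) for i, p in enumerate(probs)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Made-up logits for illustration; in practice pass logits[0].tolist()
# from the inference snippet.
example_logits = [0.1, -1.2, 3.4, 0.0, 0.5, -0.3, 1.1, -2.0, 0.2, 0.9, -0.7]
ranked = accent_scores(example_logits)

print("Predicted accent:", ranked[0][0])
for accent, prob in ranked[:3]:
    print(f"{accent}: {prob:.3f}")
```

Sorting the pairs makes it easy to show the top few candidates instead of a single guess, which can be useful when two accents receive similar scores.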
Applications
- Speech Analysis: Identify accents for linguistic studies
- Personalization: Adapt speech systems to user accents
License
This project is open-source and available under the Apache License 2.0.