WavLM Large - English Accent Classification
This model is a fine-tuned version of microsoft/wavlm-large for classifying English accents from audio. Only the classification head was trained; the WavLM backbone weights are frozen.
Model Details
- Base model: microsoft/wavlm-large
- Task: Audio Classification
- Accents: 8 classes
- Training: Classification head only (backbone frozen)
- Framework: HuggingFace Transformers
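Training only the head while the backbone stays frozen can be sketched as follows. This is a minimal illustration and not the training script used for this model: `TinyAccentClassifier`, its layer sizes, the random inputs, and the learning rate are all hypothetical stand-ins so the example runs without downloading weights. In Transformers, `WavLMForSequenceClassification` provides a `freeze_base_model()` method with the same effect on the real backbone.

```python
import torch
from torch import nn


class TinyAccentClassifier(nn.Module):
    """Hypothetical stand-in: a small 'backbone' plus a linear classification head."""

    def __init__(self, hidden_size=16, num_labels=8):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(1, hidden_size), nn.GELU())
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, waveform):
        # (batch, samples) -> (batch, samples, hidden) -> mean pool over time
        features = self.backbone(waveform.unsqueeze(-1))
        return self.classifier(features.mean(dim=1))


torch.manual_seed(0)
model = TinyAccentClassifier()

# Freeze the backbone: its parameters no longer receive gradients.
for param in model.backbone.parameters():
    param.requires_grad = False

# Only parameters that still require gradients are handed to the optimizer.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)

# One training step on random data (assumption: made-up inputs and labels).
backbone_before = [p.detach().clone() for p in model.backbone.parameters()]
waveforms = torch.randn(4, 160)
labels = torch.tensor([0, 2, 4, 7])
loss = nn.functional.cross_entropy(model(waveforms), labels)
loss.backward()
optimizer.step()

print("Trainable parameters:", trainable)
```

After the step, the backbone weights are unchanged and only the `classifier` parameters have been updated.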
Supported Accents
| ID | Label |
|---|---|
| 0 | american |
| 1 | australian |
| 2 | british |
| 3 | canadian |
| 4 | indian |
| 5 | irish |
| 6 | jamaican |
| 7 | scottish |
How to Use
Basic Inference
import torch
import torchaudio
from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

# Load model and feature extractor
model_id = "shuyuncci/wavlm-accent"
feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
model = AutoModelForAudioClassification.from_pretrained(model_id)
model.eval()

# Load audio file (the model expects 16 kHz mono; other inputs are converted below)
waveform, sample_rate = torchaudio.load("your_audio.wav")

# Resample if needed
if sample_rate != 16000:
    resampler = torchaudio.transforms.Resample(orig_freq=sample_rate, new_freq=16000)
    waveform = resampler(waveform)

# Convert to mono if the file has more than one channel
if waveform.shape[0] > 1:
    waveform = waveform.mean(dim=0, keepdim=True)

# Prepare input: the feature extractor takes a 1-D array of samples
inputs = feature_extractor(
    waveform.squeeze().numpy(),
    sampling_rate=16000,
    return_tensors="pt"
)

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits

# Get predicted label
predicted_id = logits.argmax(dim=-1).item()
predicted_label = model.config.id2label[predicted_id]
print(f"Predicted accent: {predicted_label}")
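If a ranked list with probabilities is more useful than a single label, the logits can be passed through a softmax. The sketch below is framework-independent and uses only the standard library; `softmax`, `top_k_accents`, and the example logits are hypothetical names and made-up values, and `ID2LABEL` is copied from the Supported Accents table rather than read from the model config. With the model loaded, `logits[0].tolist()` would supply the real input.

```python
import math

# Label mapping copied from the "Supported Accents" table.
ID2LABEL = {
    0: "american", 1: "australian", 2: "british", 3: "canadian",
    4: "indian", 5: "irish", 6: "jamaican", 7: "scottish",
}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_accents(logits, k=3, id2label=ID2LABEL):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Made-up logits for illustration only.
example_logits = [0.1, -1.2, 2.3, 0.4, -0.5, 1.1, -2.0, 0.9]
for label, prob in top_k_accents(example_logits):
    print(f"{label}: {prob:.3f}")
```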
Important Notes
- Input format: 16 kHz mono WAV audio
- Audio length: The model was trained on clips up to 20 seconds. For longer audio, consider splitting into chunks and aggregating predictions.
- Language: This model is designed for English speech only. Performance on non-English audio is not guaranteed.
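The chunk-and-aggregate suggestion for long audio could look like the sketch below. It assumes non-overlapping 20-second windows and a length-weighted average of per-chunk probabilities; both are illustrative choices, not something this model card prescribes. `chunk_bounds`, `aggregate`, and the probability values are hypothetical. In practice each chunk `waveform[start:end]` would go through the feature extractor and model to produce its probabilities.

```python
SAMPLE_RATE = 16000  # Hz, the input rate the model expects
MAX_SECONDS = 20     # clip length the model was trained on


def chunk_bounds(num_samples, sample_rate=SAMPLE_RATE, max_seconds=MAX_SECONDS):
    """Return (start, end) sample indices for consecutive non-overlapping chunks."""
    chunk_len = sample_rate * max_seconds
    return [
        (start, min(start + chunk_len, num_samples))
        for start in range(0, num_samples, chunk_len)
    ]


def aggregate(per_chunk_probs, bounds):
    """Average per-chunk probability vectors, weighting each chunk by its length."""
    total = sum(end - start for start, end in bounds)
    averaged = [0.0] * len(per_chunk_probs[0])
    for probs, (start, end) in zip(per_chunk_probs, bounds):
        weight = (end - start) / total
        for i, p in enumerate(probs):
            averaged[i] += weight * p
    return averaged


# A 50-second clip at 16 kHz splits into two full chunks and one 10-second tail.
bounds = chunk_bounds(50 * SAMPLE_RATE)

# Made-up per-chunk probabilities over the 8 accent classes, for illustration only.
per_chunk_probs = [
    [0.1, 0.0, 0.7, 0.0, 0.0, 0.1, 0.0, 0.1],
    [0.2, 0.0, 0.5, 0.1, 0.0, 0.1, 0.0, 0.1],
    [0.5, 0.0, 0.3, 0.1, 0.0, 0.0, 0.0, 0.1],
]
averaged = aggregate(per_chunk_probs, bounds)
predicted_id = max(range(len(averaged)), key=averaged.__getitem__)
print(f"Predicted class ID for the full clip: {predicted_id}")
```

Weighting by chunk length keeps a short trailing chunk from counting as much as a full 20-second window; a plain mean or majority vote are simpler alternatives.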