W2V-BERT 2.0 ASR Adapters

This repository contains per-language bottleneck adapters for automatic speech recognition (ASR) trained on top of facebook/w2v-bert-2.0.

Model Description

  • Base Model: facebook/w2v-bert-2.0 (600M parameters, frozen)
  • Adapter Architecture: MMS-style bottleneck adapters (dim=64)
  • Decoder: Lightweight transformer decoder (1 layer)
  • Training: CTC loss with extended vocabulary for double vowels
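Since decoding is CTC-based, the model's per-frame logits are collapsed with the standard CTC rule (merge repeats, then drop blanks). A minimal greedy-decoding sketch, where the blank id and toy vocabulary are illustrative assumptions rather than the repository's actual setup:

```python
import torch

# Hypothetical greedy CTC decode; blank_id=0 is an assumption for illustration.
def ctc_greedy_decode(logits, blank_id=0):
    """Collapse repeated ids, then drop blanks (standard CTC rule)."""
    ids = logits.argmax(dim=-1).tolist()
    out, prev = [], None
    for i in ids:
        if i != prev and i != blank_id:
            out.append(i)
        prev = i
    return out

# Toy per-frame logits over a 4-token vocab: [blank, 'a', 'b', 'c']
logits = torch.tensor([
    [0.1, 0.9, 0.0, 0.0],  # 'a'
    [0.1, 0.9, 0.0, 0.0],  # 'a' (repeat, collapsed)
    [0.9, 0.0, 0.0, 0.0],  # blank
    [0.0, 0.0, 0.8, 0.2],  # 'b'
])
print(ctc_greedy_decode(logits))  # -> [1, 2]
```

In practice the vocabulary comes from each adapter's vocab.json, including the extra tokens added for double vowels.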

Trained Adapters

Training in progress...

Adapter | Language | WER | Train Samples

Architecture

The model uses:

  1. Frozen w2v-bert-2.0 encoder - Extracts audio representations
  2. Bottleneck adapter - Language-specific adaptation (trainable)
  3. Lightweight decoder - Transformer decoder block (trainable)
  4. LM head - Per-language vocabulary projection (trainable)
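The trainable bottleneck adapter can be pictured as a small residual down-/up-projection inserted over the frozen encoder's hidden states. This is a sketch of that idea only; the class name, layer names, and exact placement are assumptions, not this repository's code (w2v-bert-2.0's hidden size of 1024 and the dim=64 bottleneck come from the description above):

```python
import torch
import torch.nn as nn

# Hypothetical MMS-style bottleneck adapter; names and placement are
# illustrative assumptions, not the repository's actual implementation.
class BottleneckAdapter(nn.Module):
    def __init__(self, hidden_size=1024, adapter_dim=64):
        super().__init__()
        self.norm = nn.LayerNorm(hidden_size)
        self.down = nn.Linear(hidden_size, adapter_dim)  # project down to dim=64
        self.act = nn.ReLU()
        self.up = nn.Linear(adapter_dim, hidden_size)    # project back up

    def forward(self, hidden_states):
        # Residual connection keeps the frozen encoder's features intact.
        residual = hidden_states
        x = self.norm(hidden_states)
        x = self.up(self.act(self.down(x)))
        return residual + x

adapter = BottleneckAdapter()
out = adapter(torch.randn(2, 50, 1024))  # (batch, frames, hidden)
```

The residual form means a freshly initialized adapter perturbs the encoder's output only slightly, which is what makes training it on top of frozen features stable.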

Usage

Each adapter folder contains:

  • adapter_weights.pt - Bottleneck adapter weights
  • decoder_weights.pt - Decoder block weights
  • lm_head_weights.pt - Language model head weights
  • final_norm_weights.pt - Final layer norm weights
  • vocab.json - Language-specific vocabulary
  • adapter_config.json - Adapter configuration
  • metrics.json - Training metrics

Loading an Adapter

import torch
from huggingface_hub import hf_hub_download
from transformers import Wav2Vec2BertProcessor

# Load processor for specific language
processor = Wav2Vec2BertProcessor.from_pretrained(
    "mutisya/w2v-bert-adapters-3lang-e10-25_52-v5",
    subfolder="<adapter_id>"
)

# Load adapter weights
adapter_weights = torch.load(
    hf_hub_download("mutisya/w2v-bert-adapters-3lang-e10-25_52-v5", "<adapter_id>/adapter_weights.pt")
)
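Each *_weights.pt file is a plain PyTorch state dict, so every component can be restored independently into a module of matching shape. An offline sketch of that round trip; the Linear shape used here is an illustrative assumption (the real dimensions are recorded in adapter_config.json):

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical module standing in for one saved component; the 1024 -> 64
# shape is an assumption for illustration only.
adapter = nn.Linear(1024, 64)
path = os.path.join(tempfile.mkdtemp(), "adapter_weights.pt")
torch.save(adapter.state_dict(), path)

# A freshly constructed module with the same shape restores the weights.
restored = nn.Linear(1024, 64)
restored.load_state_dict(torch.load(path))
```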

Training Configuration

  • Epochs: 10
  • Learning Rate: 0.0005
  • Batch Size: 24 × 2 (effective: 48)
  • Extended Vocabulary: True
  • Adapter Dimension: 64
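The 24 × 2 batch size above suggests gradient accumulation: a per-step batch of 24 with gradients accumulated over 2 steps before each optimizer update. A toy sketch of that pattern, where the model and data are placeholders, not the actual training loop:

```python
import torch

# Hypothetical gradient-accumulation loop matching 24 x 2 = 48 effective
# batch size; the tiny model and random data are illustrative only.
accum_steps = 2
model = torch.nn.Linear(4, 3)
opt = torch.optim.AdamW(model.parameters(), lr=5e-4)
w0 = model.weight.detach().clone()

batches = [torch.randn(24, 4) for _ in range(4)]  # per-step batch of 24
for step, x in enumerate(batches):
    loss = model(x).pow(2).mean()
    (loss / accum_steps).backward()        # scale so gradients average
    if (step + 1) % accum_steps == 0:      # update every 2 micro-batches
        opt.step()
        opt.zero_grad()
```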

License

Apache 2.0

Citation

@misc{w2vbert-asr-adapters,
  author = {Mutisya},
  title = {W2V-BERT 2.0 ASR Adapters for African Languages},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/mutisya/w2v-bert-adapters-3lang-e10-25_52-v5}
}