mHubert Basque Discrete Units (k=1000, L9)

Model Summary

This repository provides a fine-tuned mHuBERT (Multilingual HuBERT) model optimized for the Basque language. It transforms raw audio signals into sequences of discrete units, which serve as a compact, symbolic representation of speech.

The model extracts high-level acoustic and phonetic features from the 9th transformer layer (Layer 9), and these features are then quantized with a KMeans model of 1000 clusters. This representation is widely used in generative speech research, including unit-based vocoders.
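
The quantization step simply maps each frame's feature vector to the index of its nearest KMeans centroid. A minimal toy sketch of the idea (using 2-D features and k=4 for illustration, versus the real model's 768-dimensional mHuBERT features and k=1000):

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy stand-in for frame-level features: 200 frames of 2-D vectors
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 2))

# Fit a small codebook (the released model ships a pre-fitted k=1000 codebook)
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(features)

# Each frame becomes the ID of its nearest centroid
units = kmeans.predict(features)
print(units.shape, units.min(), units.max())  # shape (200,), IDs in 0..3
```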

Key Features

  • Base Model: mHuBERT (Multilingual HuBERT) fine-tuned for Basque.
  • Quantization: KMeans with $k=1000$ clusters.
  • Extraction Layer: Layer 9 (L9).
  • Input: 16 kHz Basque speech audio.
  • Output: 1D sequence of discrete unit IDs (indices 0–999).
  • Primary Use Case: Speech discretization for generative modeling and unit-based synthesis.

Technical Specifications

| Feature | Detail |
| --- | --- |
| Sampling Rate | 16,000 Hz |
| Transformer Layers | 12 |
| Feature Layer | 9 |
| Vocabulary Size | 1000 units |
| Parameters | ~94.4M (F32) |
| Language | Basque (Euskara) |
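
mHuBERT's convolutional front-end has a total stride of 320 samples, so at 16 kHz the model emits one feature frame (and hence one unit ID) roughly every 20 ms, about 50 units per second. A quick sanity check of the expected output length (a sketch; the exact count can differ by a few frames at the edges):

```python
def expected_num_units(num_samples: int) -> int:
    """Approximate unit count for HuBERT-style models: one frame per
    320 input samples (20 ms at 16 kHz). Assumes the standard HuBERT
    conv stride; the true count may differ slightly due to padding."""
    return num_samples // 320

# Three seconds of 16 kHz audio -> roughly 150 units
print(expected_num_units(3 * 16000))  # 150
```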

How to Use

To extract discrete units from an audio file, you will need torch, torchaudio, transformers, joblib, and huggingface_hub.

Installation

```bash
pip install torch torchaudio transformers joblib huggingface_hub
```

Inference

```python
import torch
import joblib
import torchaudio
from huggingface_hub import hf_hub_download
from transformers import Wav2Vec2Processor, HubertModel

# Download the pre-fitted KMeans quantizer from the Hub
kmeans_path = hf_hub_download(
    repo_id="Ansu/mHubert-basque-k1000-L9",
    filename="kmeans/basque_hubert_k1000_L9.pt",
)
kmeans = joblib.load(kmeans_path)

# Load the fine-tuned mHuBERT model and its feature processor
model_name = "Ansu/mHubert-basque-k1000-L9"
processor = Wav2Vec2Processor.from_pretrained(model_name)
model = HubertModel.from_pretrained(model_name)
model.eval()

# Load the audio and resample to 16 kHz if necessary
audio, sr = torchaudio.load("path/to/audio")
if sr != 16000:
    audio = torchaudio.functional.resample(audio, sr, 16000)
audio = audio.squeeze(0)  # mono waveform, shape (num_samples,)

inputs = processor(audio, sampling_rate=16000, return_tensors="pt", padding=True)

# Extract Layer 9 features; hidden_states[0] is the conv-encoder output,
# so hidden_states[9] is the output of the 9th transformer layer
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)
features = out.hidden_states[9].squeeze(0).numpy()

# Map each frame to its nearest KMeans centroid -> unit IDs in 0-999
units = kmeans.predict(features)
```
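
Note that the model yields one unit per 20 ms frame, so steady sounds produce long runs of repeated IDs. Many unit-based synthesis pipelines collapse these runs before downstream modeling; a small helper for that (check whether your downstream vocoder expects raw or deduplicated sequences):

```python
from itertools import groupby

def deduplicate(units):
    """Collapse runs of repeated unit IDs,
    e.g. [5, 5, 7, 7, 7, 2] -> [5, 7, 2]."""
    return [u for u, _ in groupby(units)]

print(deduplicate([5, 5, 7, 7, 7, 2]))  # [5, 7, 2]
```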