Hubert-Base

Model Details

Model Description

Model type: Hubert-Base
Language(s) (NLP): English
License: MIT

How to Get Started with the Model

Set up the environment with the shell commands below, then run the Python snippet to encode a waveform into syllabic units.

sudo apt install git-lfs  # for UTMOS

conda create -y -n py310 -c pytorch -c nvidia -c conda-forge python=3.10.18 pip=24.0 faiss-gpu=1.12.0
conda activate py310
pip install -r requirements/requirements.txt

sh scripts/setup.sh

import torchaudio

from src.s5hubert import S5HubertForSyllableDiscovery

wav_path = "/path/to/wav"

# download pretrained models from hugging face hub
encoder = S5HubertForSyllableDiscovery.from_pretrained("ryota-komatsu/hubert", device_map="cuda")

# load a waveform
waveform, sr = torchaudio.load(wav_path)
waveform = torchaudio.functional.resample(waveform, sr, 16000)

# encode a waveform into syllabic units
outputs = encoder(waveform.to(encoder.device))
units = outputs[0]["units"]  # [3950, 67, ..., 503]
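
Downstream steps often need the unit sequence as plain Python data. The following is a minimal sketch, assuming `units` can be converted to a flat list of integer IDs (for example with `units.tolist()` if it is a tensor, which the snippet above does not state). The helper names `units_to_text` and `unit_histogram` are hypothetical and not part of the repository.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def units_to_text(units: Iterable[int]) -> str:
    """Join unit IDs into a space-separated string, e.g. as input to a unit language model."""
    return " ".join(str(int(u)) for u in units)


def unit_histogram(units: Iterable[int], top_k: int = 5) -> List[Tuple[int, int]]:
    """Return the top_k most frequent unit IDs together with their counts."""
    return Counter(int(u) for u in units).most_common(top_k)


# Hypothetical unit IDs standing in for the encoder output (assumption, not real model output)
example_units = [3950, 67, 67, 503]
print(units_to_text(example_units))
print(unit_histogram(example_units, top_k=2))
```

A space-separated string is a convenient interchange format for text-based tooling, while the histogram gives a quick check of how varied the discovered units are for a given utterance.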

Model size: 0.1B params (Safetensors)
Tensor types: I64, F32

Base model: facebook/hubert-base-ls960 (this model is a fine-tuned version)
