---
license: mit
language:
- en
datasets:
- librispeech_asr
metrics:
- abx
- wer
- ued
pipeline_tag: audio-classification
tags:
- speech
- discrete-units
- quantization
- hubert
- clustering
base_model:
- facebook/hubert-base-ls960
---
# Robust Quantizer from HuBERT Base (Layer 6)
This model checkpoint contains a **Robust Quantizer** trained on top of the 6th layer of the `hubert-base-ls960` model. It was developed as part of a reproduction and evaluation study on creating robust discrete speech units, originally proposed in *Augmentation Invariant Discrete Representation for Generative Spoken Language Modeling (Gat et al., 2023)*.
## Model Details
This quantizer was trained to provide discrete pseudo-labels that are resilient to acoustic perturbations. By applying data augmentations during quantizer training, the resulting discrete units, and by extension the downstream acoustic models built on them, become more robust to noise and varying acoustic conditions.
- **Base Model:** [facebook/hubert-base-ls960](https://huggingface.co/facebook/hubert-base-ls960)
- **Layer:** 6
- **Vocabulary Size (Clusters):** 100, 200, 500
- **Algorithm:** K-Means
- **Dataset:** [LibriSpeech](https://huggingface.co/datasets/librispeech_asr) (`train-clean-100`)
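At inference time, k-means quantization maps each frame-level feature vector to the index of its nearest centroid. The sketch below shows that assignment step with synthetic tensors standing in for real HuBERT layer-6 activations; the 768-dimensional hidden size and 500-cluster vocabulary match this card, but the `quantize` helper is illustrative, not part of the released code.

```python
import torch

def quantize(features: torch.Tensor, centroids: torch.Tensor) -> torch.Tensor:
    """Map each frame (T, D) to the index of its nearest centroid (K, D)."""
    # Pairwise Euclidean distances between frames and centroids: (T, K)
    dists = torch.cdist(features, centroids)
    # Discrete unit IDs in [0, K), one per frame
    return dists.argmin(dim=-1)

# Synthetic stand-ins: 50 frames of 768-dim features, 500 centroids
features = torch.randn(50, 768)
centroids = torch.randn(500, 768)
units = quantize(features, centroids)
print(units.shape)  # torch.Size([50])
```

The resulting unit sequence can then serve as pseudo-labels or as input to a unit-based language model.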
## Usage
### Download the Model
```python
from huggingface_hub import hf_hub_download
model_path = hf_hub_download(
    repo_id="iliasslasri/robust_speech_quantizer",
    filename="500_vocab_size/round_1/E1_best.pt",
    force_download=True,
)
config_path = hf_hub_download(
    repo_id="iliasslasri/robust_speech_quantizer",
    filename="500_vocab_size/config.yaml",
    force_download=True,
)
```
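The internal layout of the checkpoint is not documented on this card, so a cautious first step is to load it on CPU and inspect its keys before assuming a structure. The sketch below writes a dummy file so it runs end to end; with the real model, replace `"dummy.pt"` with the `model_path` returned above. The `"centroids"` key is hypothetical.

```python
import torch

# Stand-in checkpoint so this sketch is self-contained; the "centroids"
# key is an assumption -- inspect the real file's keys before relying on it.
torch.save({"centroids": torch.randn(500, 768)}, "dummy.pt")

ckpt = torch.load("dummy.pt", map_location="cpu")
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))  # reveals the actual layout
else:
    print(type(ckpt))
```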
## Relevant Links
- Original Paper: [Augmentation Invariant Discrete Representation for Generative Spoken Language Modeling (Gat et al., 2023)](https://aclanthology.org/2023.iwslt-1.46/)
- Project Repository: [iliasslasri/snlp_project](https://github.com/iliasslasri/snlp_project)