A variable-frame-rate 16 kHz speech codec based on FocalCodec.
This repository contains the checkpoint trained on LibriTTS 960, as described in the preprint.
The IVF index file, index.faiss, contains continuous latents at 50 Hz extracted from LibriSpeech train-clean-100, dev-clean, and test-clean.
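To make the two rates above concrete, here is a minimal sketch of the arithmetic relating 16 kHz audio to latents at 50 Hz. The function names are hypothetical and are not part of the dycast package. It also assumes that the latents are evenly spaced and that a trailing partial frame is dropped, which this card does not state. The calculation applies only to the fixed-rate continuous latents stored in the index, not to the codec's variable-rate tokens.

```python
# Illustrative arithmetic only; none of these helpers come from the dycast codebase.

SAMPLE_RATE_HZ = 16_000  # audio sample rate stated in this card
LATENT_RATE_HZ = 50      # rate of the continuous latents stored in index.faiss


def samples_per_latent(sample_rate_hz: int = SAMPLE_RATE_HZ,
                       latent_rate_hz: int = LATENT_RATE_HZ) -> int:
    """Number of audio samples covered by one latent (the hop size)."""
    if sample_rate_hz % latent_rate_hz != 0:
        raise ValueError("sample rate must be a multiple of the latent rate")
    return sample_rate_hz // latent_rate_hz


def num_latents(num_samples: int) -> int:
    """Latent count for a waveform.

    Assumption: no padding, and a trailing partial frame is dropped.
    """
    return num_samples // samples_per_latent()


def latent_time_span(latent_index: int) -> tuple[float, float]:
    """Start and end time in seconds of the audio covered by one latent."""
    hop = samples_per_latent()
    start = latent_index * hop / SAMPLE_RATE_HZ
    end = (latent_index + 1) * hop / SAMPLE_RATE_HZ
    return start, end


if __name__ == "__main__":
    print(samples_per_latent())   # samples per latent at 16 kHz / 50 Hz
    print(num_latents(40_000))    # latents for a 2.5 s waveform
    print(latent_time_span(50))   # time span of the latent at index 50
```

Under these assumptions, each latent spans 320 samples (20 ms), so a 2.5 s utterance yields 125 latents.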
GitHub: https://github.com/lucadellalib/dycast
See the README at https://github.com/lucadellalib/dycast
@article{dellalibera2026dycast,
title = {Beyond Fixed Frames: Dynamic Character-Aligned Speech Tokenization},
author = {Luca {Della Libera} and Cem Subakan and Mirco Ravanelli},
journal = {arXiv preprint arXiv:2601.23174},
year = {2026},
}
@article{dellalibera2025focalcodecstream,
title = {{FocalCodec-Stream}: Streaming Low-Bitrate Speech Coding via Causal Distillation},
author = {Luca {Della Libera} and Cem Subakan and Mirco Ravanelli},
journal = {arXiv preprint arXiv:2509.16195},
year = {2025},
}
@inproceedings{dellalibera2025focalcodec,
title = {{FocalCodec}: Low-Bitrate Speech Coding via Focal Modulation Networks},
author = {Luca {Della Libera} and Francesco Paissan and Cem Subakan and Mirco Ravanelli},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
}
Base model: facebook/mms-1b-all