Update README.md
README.md
CHANGED
@@ -19,12 +19,12 @@ datasets:
 NeuCodec is a Finite Scalar Quantisation (FSQ) based 0.8kbps audio codec for speech tokenization.
 It takes advantage of the following features:
 
-* It uses both audio ([BigCodec](https://arxiv.org/pdf/2409.05377)) and semantic ([Wav2Vec2-BERT](https://huggingface.co/facebook/w2v-bert-2.0)) encoders.
 * We make use of Finite Scalar Quantisation (FSQ) resulting in a single vector for the quantised output, which makes it ideal for downstream modeling with Speech Language Models.
 * At 50 tokens/sec and 16 bits per token, the overall bit-rate is 0.8kbps.
 * The codec takes in 16kHz input and outputs 24kHz using an upsampling decoder.
+* The FSQ encoding scheme allows for bit-level error resistance suitable for unreliable and noisy channels.
 
-
+NeuCodec is largely based on extending the work of [X-Codec2.0](https://huggingface.co/HKUSTAudio/xcodec2).
 
 - **Developed by:** Neuphonic
 - **Model type:** Neural Audio Codec
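
The bit-rate figure and the bit-level error resistance claim in the README text can be illustrated with a small sketch. The 50 tokens/sec and 16 bits per token come from the README; everything else is an assumption for illustration. In particular, the level layout (8 dimensions with 4 levels each, so that 4^8 = 2^16 codes fit one 16-bit token) is hypothetical and is not stated as NeuCodec's actual configuration, and the quantiser below is a simplified FSQ variant, not the codec's implementation.

```python
import math

TOKENS_PER_SEC = 50   # from the README
BITS_PER_TOKEN = 16   # from the README

# Assumed FSQ layout (NOT stated in the README): 8 dimensions, 4 levels each.
# Each dimension then occupies exactly 2 bits of the 16-bit token.
LEVELS = [4] * 8


def bitrate_kbps(tokens_per_sec: int, bits_per_token: int) -> float:
    """Bit-rate in kbps of a single-codebook token stream."""
    return tokens_per_sec * bits_per_token / 1000


def fsq_quantise(latent: list[float]) -> list[int]:
    """Simplified FSQ: bound each value with tanh, then round to a level index."""
    digits = []
    for value, n_levels in zip(latent, LEVELS):
        scaled = (math.tanh(value) + 1) / 2 * (n_levels - 1)  # in [0, n_levels - 1]
        digits.append(int(round(scaled)))
    return digits


def pack(digits: list[int]) -> int:
    """Combine per-dimension level indices into one integer token (mixed radix)."""
    token = 0
    for digit, n_levels in zip(reversed(digits), reversed(LEVELS)):
        token = token * n_levels + digit
    return token


def unpack(token: int) -> list[int]:
    """Inverse of pack: recover the per-dimension level indices."""
    digits = []
    for n_levels in LEVELS:
        digits.append(token % n_levels)
        token //= n_levels
    return digits


# Hypothetical 8-dimensional latent for one frame.
latent = [0.9, -1.2, 0.1, 2.0, -0.3, 0.0, 1.1, -2.5]
digits = fsq_quantise(latent)
token = pack(digits)

# Simulate a single bit error on the channel: bit 5 lies in dimension 2 (bits 4-5).
corrupted = token ^ (1 << 5)
changed = [i for i, (a, b) in enumerate(zip(digits, unpack(corrupted))) if a != b]

print(f"bit-rate: {bitrate_kbps(TOKENS_PER_SEC, BITS_PER_TOKEN)} kbps")
print(f"token: {token}, dimensions changed by one bit flip: {changed}")
```

Under this assumed layout, a single flipped bit alters only one quantised dimension, so the decoded vector stays close to the original. With a learned codebook, the same flip would select an unrelated code vector. This is one plausible reading of the error-resistance claim, not a description of how NeuCodec was evaluated.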