---
license: apache-2.0
tags:
- audio
- speech
- language-model
- auristream
- discrete-diffusion
library_name: transformers
---
# AuriStreamParallel100M_Group16_BigAudioDataset_150k

**AuriStream Parallel** is a discrete diffusion speech language model by **Greta Tuckute** and **Klemen Kotar**.
## Model Details

| Parameter | Value |
|-----------|-------|
| Parameters | ~0.20B |
| Layers | 12 |
| Hidden Size | 768 |
| Attention Heads | 12 |
| Vocab Size | 8193 |
| Group Size | 16 |
| Mask Schedule | linear_text_prime |
## Architecture

- Bidirectional transformer attention
- Grouped token latent projection
- Parallel token heads for group-wise prediction
- Partial masking diffusion objective
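The partial-masking objective can be illustrated with a small sketch: whole groups of tokens are replaced by a mask token, and the model is trained to reconstruct the masked groups in parallel. Note that the mask token id (taken here as 8192, the last index of the 8193-entry vocab) and the exact masking procedure are assumptions for illustration, not the model's actual implementation.

```python
import random

MASK_ID = 8192    # assumed mask token id (last index of the 8193-token vocab)
GROUP_SIZE = 16   # matches the model's group size

def mask_groups(tokens, mask_ratio, seed=0):
    """Partially mask whole token groups, as in one discrete diffusion step.

    `tokens` is a flat list whose length is a multiple of GROUP_SIZE.
    A fraction `mask_ratio` of the groups is replaced by MASK_ID; the
    model's parallel heads then predict all masked positions at once.
    Returns the corrupted sequence and the set of masked group indices.
    """
    rng = random.Random(seed)
    n_groups = len(tokens) // GROUP_SIZE
    n_masked = round(mask_ratio * n_groups)
    masked_groups = set(rng.sample(range(n_groups), n_masked))
    out = list(tokens)
    for g in masked_groups:
        for i in range(g * GROUP_SIZE, (g + 1) * GROUP_SIZE):
            out[i] = MASK_ID
    return out, masked_groups
```

Because attention is bidirectional, every masked group can condition on all unmasked groups on both sides.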
## Usage

```python
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "TuKoResearch/AuriStreamParallel100M_Group16_BigAudioDataset_150k",
    trust_remote_code=True,
)
```
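At inference time, discrete diffusion models typically generate by iterative unmasking: start from a fully masked sequence and commit predictions for a growing share of groups at each step. The loop below is a minimal sketch of that procedure with a stub in place of the real model; the generation API actually exposed by the remote code may differ.

```python
import random

MASK_ID = 8192    # assumed mask token id
GROUP_SIZE = 16   # matches the model's group size

def sample_parallel(predict, seq_len, n_steps, seed=0):
    """Iterative-unmasking sketch of discrete diffusion sampling.

    `predict` stands in for the model's parallel token heads: it maps a
    (partially masked) token list to a same-length list of predictions.
    At each step a linearly growing share of the still-masked groups is
    committed, until no masked groups remain.
    """
    rng = random.Random(seed)
    n_groups = seq_len // GROUP_SIZE
    tokens = [MASK_ID] * seq_len
    masked = list(range(n_groups))
    for step in range(n_steps):
        preds = predict(tokens)
        # reveal an equal share of the remaining groups at each step
        n_reveal = max(1, len(masked) // (n_steps - step))
        for g in rng.sample(masked, n_reveal):
            for i in range(g * GROUP_SIZE, (g + 1) * GROUP_SIZE):
                tokens[i] = preds[i]
            masked.remove(g)
    return tokens
```

With `n_steps` equal to the number of groups this reduces to one-group-at-a-time decoding; fewer steps trade quality for speed by committing more groups in parallel.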
## Base Model Code

This checkpoint uses shared model code from [TuKoResearch/AuriStreamParallel-base](https://huggingface.co/TuKoResearch/AuriStreamParallel-base).
## Tokenizer

This model operates on cochlear tokens, e.g. those produced by [WavCochCausalV8192](https://huggingface.co/TuKoResearch/WavCochCausalV8192).