# Vui TTS (100M base)
This repository hosts the Vui 100M base checkpoint and the Fluac tokenizer used by the `fluxions/vui` project.

Contents:

- `vui-100m-base.pt`: Vui TTS checkpoint (100M parameters).
- `fluac-22hz-22khz.pt`: Fluac codec checkpoint.
- `LICENSE`: MIT license from the upstream project.
## Quick usage (Python)

```python
import torch
from huggingface_hub import hf_hub_download
from vui.inference import render
from vui.model import Vui

# Download checkpoints from this repo (returns local file paths)
ckpt = hf_hub_download("Endy2001/vui-tts", "vui-100m-base.pt")
codec_ckpt = hf_hub_download("Endy2001/vui-tts", "fluac-22hz-22khz.pt")

# Load the model (passing the codec checkpoint avoids a fetch from upstream)
model = Vui.from_pretrained_inf(ckpt, codec_checkpoint=codec_ckpt).to("cuda")

text = "Hello! This is Vui speaking from Hugging Face."
with torch.inference_mode():
    audio = render(model, text)[0].cpu().numpy()

# `audio` is a mono waveform at 22.05 kHz (the Fluac codec's sample rate)
```
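To listen to the result, the float waveform can be written out as a 16-bit PCM WAV file using Python's standard-library `wave` module. The snippet below is a minimal, self-contained sketch: the sine wave merely stands in for the `audio` array returned by `render`, and the 22050 Hz sample rate is an assumption based on the codec filename.

```python
import wave
import numpy as np

# Stand-in for the `audio` array from render(): a 0.5 s, 440 Hz sine wave.
# Assumed sample rate of 22050 Hz, inferred from the codec filename.
sample_rate = 22050
t = np.linspace(0, 0.5, int(sample_rate * 0.5), endpoint=False)
audio = 0.5 * np.sin(2 * np.pi * 440.0 * t).astype(np.float32)

# Convert the float32 waveform in [-1, 1] to 16-bit PCM and write a mono WAV.
pcm = (np.clip(audio, -1.0, 1.0) * 32767).astype(np.int16)
with wave.open("output.wav", "wb") as f:
    f.setnchannels(1)          # mono
    f.setsampwidth(2)          # 16-bit samples
    f.setframerate(sample_rate)
    f.writeframes(pcm.tobytes())
```

The resulting `output.wav` plays in any standard audio player; swap the synthetic sine wave for the real `audio` array when running the full pipeline.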
## Notes

- This is a TTS model with a custom architecture; it is **not** a standard CausalLM.
- `vllm serve` supports only text-generation transformer architectures, so this checkpoint cannot be served directly via `vllm serve Endy2001/vui-tts`. Use the Python API above or the scripts in the upstream repo instead.
- Upstream code: https://github.com/fluxions-ai/vui