# Vui TTS (100M base)

This repository hosts the Vui 100M base checkpoint and the Fluac audio codec checkpoint used by the `fluxions-ai/vui` project.

Contents:
- `vui-100m-base.pt`: Vui TTS checkpoint (100M parameters).
- `fluac-22hz-22khz.pt`: Fluac codec checkpoint.
- `LICENSE`: MIT license from the upstream project.
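If you prefer to mirror the whole repo rather than fetch files one by one, `snapshot_download` from `huggingface_hub` can do so in one call (a minimal sketch; the repo id is the same one used in the usage example below):

```python
from huggingface_hub import snapshot_download

# Downloads every file in the repo into the local Hugging Face cache
# and returns the local directory path.
local_dir = snapshot_download("Endy2001/vui-tts")
print(local_dir)  # contains vui-100m-base.pt and fluac-22hz-22khz.pt
```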

## Quick usage (Python)
```python
import torch
from huggingface_hub import hf_hub_download
from vui.inference import render
from vui.model import Vui

# Download checkpoints from this repo (returns local file paths)
ckpt = hf_hub_download("Endy2001/vui-tts", "vui-100m-base.pt")
codec_ckpt = hf_hub_download("Endy2001/vui-tts", "fluac-22hz-22khz.pt")

# Load model (pass codec checkpoint so it doesn't fetch from upstream)
model = Vui.from_pretrained_inf(ckpt, codec_checkpoint=codec_ckpt).to("cuda")

text = "Hello! This is Vui speaking from Hugging Face."
with torch.inference_mode():
    audio = render(model, text)[0].cpu().numpy()
# `audio` is a mono waveform (sampled at 22.05 kHz, per the codec checkpoint name)
```
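To write the generated waveform to disk you can use, for example, the `soundfile` package (an illustrative follow-up to the snippet above; the 22,050 Hz rate is inferred from the codec checkpoint name, so verify it against your setup):

```python
import soundfile as sf

# Assumes `audio` from the snippet above; sample rate inferred from
# the fluac-22hz-22khz.pt checkpoint name.
sf.write("vui_output.wav", audio, samplerate=22050)
```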

## Notes
- This is a TTS model with a custom architecture; it is **not** a standard CausalLM.
- `vllm serve` only supports text-generation transformer architectures, so this checkpoint cannot be served directly via `vllm serve Endy2001/vui-tts`. Use the Python API above or the scripts in the upstream repo instead (see the serving sketch after these notes).
- Upstream code: https://github.com/fluxions-ai/vui
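
If you do need an HTTP endpoint, one option is to wrap the Python API above in a small web app; the following is a hedged sketch under the same assumptions as the usage example (the FastAPI framework, the `/tts` route, and the 22,050 Hz sample rate are illustrative choices, not part of the upstream project):

```python
# Hypothetical serving sketch: wraps the quick-usage example in FastAPI.
# Run with: uvicorn app:app --host 0.0.0.0 --port 8000
import io

import soundfile as sf
import torch
from fastapi import FastAPI
from fastapi.responses import Response
from huggingface_hub import hf_hub_download
from vui.inference import render
from vui.model import Vui

app = FastAPI()

# Load checkpoints once at startup, exactly as in the quick-usage example.
ckpt = hf_hub_download("Endy2001/vui-tts", "vui-100m-base.pt")
codec_ckpt = hf_hub_download("Endy2001/vui-tts", "fluac-22hz-22khz.pt")
model = Vui.from_pretrained_inf(ckpt, codec_checkpoint=codec_ckpt).to("cuda")


@app.get("/tts")
def tts(text: str):
    # Synthesize speech and return it as a WAV payload.
    with torch.inference_mode():
        audio = render(model, text)[0].cpu().numpy()
    buf = io.BytesIO()
    sf.write(buf, audio, samplerate=22050, format="WAV")
    return Response(content=buf.getvalue(), media_type="audio/wav")
```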