---
license: cc-by-nc-4.0
language:
- en
pipeline_tag: text-generation
tags:
- byte-level
- conversational
- rahuketu
- voxyne
- int4
- edge
- cpu
---
# Voxyne
A tiny (**128.8M-param**) **byte-level** conversational language model with the
RahuKetu `sigma_K` control channel. It runs **offline on CPU**; quantized to
**int4**, it stays coherent in **~80 MB**, which is the intended default.
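
To make "byte-level" concrete: a byte-level model typically consumes the UTF-8 bytes of the text directly, so there is no learned tokenizer vocabulary. The sketch below shows only that generic mapping. The helper names are hypothetical, and Voxyne's actual encoder (special tokens, id offsets) may differ, so use the `voxyne` package to build real inputs.

```python
def text_to_byte_ids(text: str) -> list[int]:
    """Map text to its UTF-8 byte values (each in 0..255)."""
    return list(text.encode("utf-8"))


def byte_ids_to_text(ids: list[int]) -> str:
    """Inverse mapping; invalid byte sequences become U+FFFD instead of raising."""
    return bytes(ids).decode("utf-8", errors="replace")


ids = text_to_byte_ids("who are you?")
print(len(ids))               # 12 bytes for 12 ASCII characters
print(byte_ids_to_text(ids))  # round-trips back to the original string
```

Non-ASCII characters expand to several bytes (for example, `é` is two bytes in UTF-8), so sequence length is measured in bytes, not characters.
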

**Scope: please read.** Voxyne is built to *converse*, not to *know*. It has an
identity and conversational ability; it is **not a knowledge base** and *will*
make up facts if asked. Knowledge is meant to come from external tools/retrieval
(a separate system). Judge it on **coherence**, not factual recall.
## Files: **int4 is the default**
| file | size | use |
|---|---|---|
| **`voxyne-int4.onnx`** | **81 MB** | **DEFAULT**: CPU/edge deployment (ONNX Runtime) |
| `voxyne-int8.onnx` | 129 MB | int8 ONNX |
| `voxyne-fp32.onnx` | 511 MB | full-precision ONNX |
| `voxyne-v0.1.pt` | 258 MB | bf16 PyTorch weights (for the `voxyne` package / fine-tuning) |
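
As a rough sanity check on these sizes, raw weight storage is about parameter count times bits per weight. The sketch below does only that arithmetic for the 128.8M parameters this card states; the function name is illustrative. Real files deviate from these figures because they also carry things the estimate ignores (for example quantization scales, any tensors kept at higher precision, and graph metadata), and this card does not say how the parameters split between the graph and the parts that run outside it.

```python
N_PARAMS = 128.8e6  # parameter count stated in this card


def raw_weight_mb(n_params: float, bits_per_weight: int) -> float:
    """Raw weight storage in MB (10**6 bytes), ignoring all overhead."""
    return n_params * bits_per_weight / 8 / 1e6


for name, bits in [("fp32", 32), ("bf16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{name}: ~{raw_weight_mb(N_PARAMS, bits):.1f} MB")
```

The bf16 and int8 estimates (about 258 MB and 129 MB) line up with the listed files, and the fp32 estimate is within about 1% of the listed 511 MB. The int4 file (81 MB) is larger than the raw 64.4 MB estimate, which is consistent with the overhead the estimate leaves out.
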

Each ONNX graph implements a **single-token decode step** with an explicit KV cache
(inputs `h`, `pk`, `pv`; outputs `logits`, `npk`, `npv`). The byte embedding and
the `sigma_K` encoder run outside the graph. See the `voxyne` package for the
decode protocol.
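
The shape of a loop around such a single-token step can be sketched as below. This is a minimal illustration, not the `voxyne` decode protocol: `greedy_decode`, `embed`, and `toy_step` are hypothetical names, greedy argmax is an assumed sampling choice, and `toy_step` is a stand-in for the real graph so the example runs without any model file. Here `embed` stands for whatever produces `h` outside the graph (byte embedding plus the `sigma_K` encoder).

```python
from typing import Any, Callable, List, Optional, Tuple

# One decode step: (h, pk, pv) -> (logits, npk, npv), mirroring the graph's I/O names.
Step = Callable[[Any, Any, Any], Tuple[List[float], Any, Any]]


def greedy_decode(step: Step, embed: Callable[[int], Any], prompt_ids: List[int],
                  pk: Any, pv: Any, max_new: int = 32,
                  stop_id: Optional[int] = None) -> List[int]:
    """Feed the prompt one token at a time, then greedily extend it."""
    if not prompt_ids:
        raise ValueError("prompt_ids must not be empty")
    logits: List[float] = []
    # Prefill: the step handles one token per call, so the prompt is consumed
    # token by token while the cache (pk, pv) is threaded through each call.
    for tok in prompt_ids:
        logits, pk, pv = step(embed(tok), pk, pv)
    out: List[int] = []
    for _ in range(max_new):
        nxt = max(range(len(logits)), key=logits.__getitem__)  # argmax
        if nxt == stop_id:
            break
        out.append(nxt)
        logits, pk, pv = step(embed(nxt), pk, pv)
    return out


def toy_step(h: int, pk: list, pv: list) -> Tuple[List[float], list, list]:
    """Stand-in for the ONNX graph: always predicts (token + 1) mod 256."""
    logits = [0.0] * 256
    logits[(h + 1) % 256] = 1.0
    return logits, pk + [h], pv + [h]  # toy "cache" grows by one entry per step


print(greedy_decode(toy_step, lambda t: t, [65], [], [], max_new=3))  # [66, 67, 68]
```

With ONNX Runtime, `step` could wrap a call such as `session.run(["logits", "npk", "npv"], {"h": h, "pk": pk, "pv": pv})` and flatten `logits` to a 1-D sequence. Tensor shapes, dtypes, the initial empty cache, and how `sigma_K` enters `h` are defined by the `voxyne` package and are not shown here.
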
## Quick start (PyTorch)
```bash
pip install voxyne # then download voxyne-v0.1.pt from this repo
```
```python
from voxyne import VoxyneConfig, build, load_weights, generate
model, enc = build(VoxyneConfig())
load_weights(model, "voxyne-v0.1.pt")
print(generate(model, enc, "who are you?", device="cpu"))
# -> "I'm Voxyne, an AI assistant created by Ramakrishnan."
```
## Training-data provenance (why the license)
Voxyne's weights are trained on a mix that includes **non-commercial** sources, so
the **weights are released under CC BY-NC 4.0** (non-commercial). Free for research,
education, and personal use.
| stage | sources (examples) | license note |
|---|---|---|
| pretrain | FineWeb-Edu, Cosmopedia, TinyStories | permissive / synthetic |
| grammar | WordNet, FrameNet, GoEmotions | permissive (attribution) |
| commonsense | ConceptNet, ATOMIC | ConceptNet = CC BY-SA |
| dialogue | SODA, UltraChat, OASST2, daily_dialog, empathetic_dialogues, WildChat | **daily_dialog / empathetic = CC BY-NC; WildChat = AI2 ImpACT** |
| identity | author-written | original |

The **code** (`voxyne` package) is Apache-2.0; only the **weights** are NC. A future
clean-data retrain (`kalki`) will carry Apache-licensed weights.
## AI-assistance disclosure
Built by Ramakrishnan (ORCID `0009-0006-0905-7275`). AI tools assisted with the
training/quantization automation and tooling; the model design, direction, and
all decisions are the author's.
## License
Weights: **CC BY-NC 4.0** (non-commercial). Code: Apache-2.0.