Vini Pico — 49.9M Edge Personal Assistant

A 49.9M-parameter personal assistant trained from scratch using BitNet b1.58 ternary weights. Designed for everyday tasks, IoT control, and tool use on low-cost edge hardware.

What It Does

Everyday tasks: reminders, timers, Q&A, summaries
Smart home / IoT: sensors, lights, thermostat, locks, device control
Tool routing: calling the right tool instead of hallucinating
Friendly conversation: natural, helpful responses
Safety: refuses unsafe requests, admits when it doesn't know
On-device: runs locally on low-cost edge hardware, no cloud needed

Model Overview

Property	Value
Parameters	49.9M (ternary: -1, 0, +1)
Architecture	BitNet b1.58
Layers	16
Dimension	512
Attention heads	8 (2 KV, GQA)
FFN	ReLU² with SubLN
Context length	2,048
Vocabulary	32K BPE
Quantization	1.58-bit (packed ternary)
Target hardware	Raspberry Pi 4, old phones, edge devices
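
The "packed ternary" row can be illustrated with a small sketch. The actual on-disk format used by Pico is not described here, so the layout below is an assumption: five ternary weights are stored per byte as a base-3 number (3^5 = 243 fits in one byte), which costs 1.6 bits per weight, close to the 1.58-bit ideal. `pack_ternary` and `unpack_ternary` are hypothetical helper names.

```python
# Illustrative only: Pico's real packing layout is not specified in this card.
TRITS_PER_BYTE = 5  # 3**5 = 243 <= 255, so five ternary digits fit in one byte


def pack_ternary(weights):
    """Pack a sequence of -1/0/+1 weights into bytes, five weights per byte."""
    packed = bytearray()
    for start in range(0, len(weights), TRITS_PER_BYTE):
        group = weights[start:start + TRITS_PER_BYTE]
        value = 0
        # Encode the group as a base-3 number, first weight in the lowest digit.
        for position, w in enumerate(group):
            if w not in (-1, 0, 1):
                raise ValueError(f"not a ternary weight: {w!r}")
            value += (w + 1) * 3 ** position
        packed.append(value)
    return bytes(packed)


def unpack_ternary(packed, count):
    """Recover `count` ternary weights from bytes produced by pack_ternary."""
    weights = []
    for byte in packed:
        for _ in range(TRITS_PER_BYTE):
            byte, digit = divmod(byte, 3)
            weights.append(digit - 1)
    # Drop the padding digits that filled out the final byte.
    return weights[:count]
```

A round trip such as `unpack_ternary(pack_ternary(w), len(w)) == w` should hold for any list `w` of ternary values. A simpler 2-bits-per-weight layout would also work, trading a little space for cheaper decoding.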

Training Status

Phase	Status	Details
Pretrain	Done	7,972 steps, 1.045B tokens, best loss 4.48 at step 7500
SFT iter 1–4	Done	17 bugs fixed, code/data issues resolved
SFT iter 5	Pending	Fixed SFT dataset is ready; rerun is blocked on the planned credit refresh
DPO	Conditional	Only if SFT quality gates pass

Dataset State

The current canonical dataset for Pico is the content-verified merged snapshot under:

vini/models/pico/data/generated/runs/run_20260524_1322/merged/dataset.jsonl

Snapshot properties:

5,965 records
chat-style JSONL
one top-level key per record: `messages`
verified metadata stored beside the dataset file
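
The record shape listed above can be checked with a short script. This is a minimal sketch, not the project's own verification tooling: `check_records` is a hypothetical helper, and treating `messages` as a non-empty list is an assumption based on the "chat-style" description. The inner message fields are not inspected because they are not specified here.

```python
import json


def check_records(lines):
    """Return (valid_count, problems) for an iterable of JSONL lines."""
    valid = 0
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        # Each record should carry exactly one top-level key: "messages".
        if not isinstance(record, dict) or list(record) != ["messages"]:
            problems.append((number, "expected exactly one top-level key: messages"))
        # Assumption: a chat-style record holds a non-empty list of messages.
        elif not isinstance(record["messages"], list) or not record["messages"]:
            problems.append((number, "messages should be a non-empty list"))
        else:
            valid += 1
    return valid, problems
```

Passing an open file handle for `dataset.jsonl` as `lines` should work, since file objects iterate line by line. On the canonical snapshot, the valid count would be expected to match the 5,965 records stated above.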

This corpus is a small-assistant curriculum, not a tool-only corpus and not a general coding corpus.

Repository Contents

Model artifacts: configurations, checkpoints, tokenizer, and exports.

Path	Contents
`pico_config.yaml`	Pico architecture config
`shared/`	Training configs (pretrain, SFT, DPO), tokenizer
`checkpoints/`	Future: model weights (`.pt`, `.safetensors`)
`exports/`	Future: `.gguf`, `.onnx`, `.mlpackage`

Companion Repositories

Repo	Platform	Contents
jayptl-me/vini	GitHub	Full source code, docs, scripts
jayptl-rq/vini-pico-dataset	HF Datasets	Training data

Architecture

BitNet b1.58: Ternary weights (-1, 0, +1) for efficient inference on edge devices
ReLU² FFN: Strong activation sparsity and a good fit for ternary weights
SubLN: Prevents dead layers during training
GQA: Grouped-Query Attention (8 query heads sharing 2 KV heads)
RoPE: Rotary Position Embeddings
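
To make the ternary weights and the ReLU² activation concrete, here is a minimal pure-Python sketch. It follows the absmean scheme described for BitNet b1.58 (scale by the mean absolute weight, round, clip to [-1, 1]); whether Pico's training code matches it in every detail is an assumption. The function names are illustrative, and the straight-through estimator used during training is omitted.

```python
def absmean_ternarize(weights, eps=1e-5):
    """Quantize a flat list of floats to {-1, 0, +1} plus one shared scale."""
    scale = sum(abs(w) for w in weights) / len(weights)  # gamma = mean |W|
    # Divide by the scale, round to the nearest integer, clip to [-1, 1].
    ternary = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return ternary, scale


def relu_squared(x):
    """ReLU² activation: zero for negative inputs, x squared otherwise."""
    return max(0.0, x) ** 2
```

At inference time the ternary values stand in for the full-precision weights, and the shared scale is applied once to the output of the matrix multiply. Small weights collapse to 0, which is where the sparsity of the ternary representation comes from.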

Training Configuration

Optimizer: AdamW (β1=0.9, β2=0.95), BF16 mixed precision
LR Schedule: Two-stage cosine (4e-3 → 4e-4 cooldown; min_lr handling fixed in the training code)
Batch: 64 sequences effective (micro-batch size BS=4 × gradient accumulation GA=16)
Framework: PyTorch with Straight-Through Estimator (STE)
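
The batch settings can be cross-checked against the pretraining token count reported under Training Status. The sketch below assumes every sequence is filled to the full 2,048-token context length; `pretrain_tokens` is an illustrative helper, not part of the training code.

```python
def pretrain_tokens(micro_batch, grad_accum, context_length, steps):
    """Tokens seen in pretraining, assuming fully packed sequences."""
    effective_batch = micro_batch * grad_accum           # sequences per optimizer step
    tokens_per_step = effective_batch * context_length   # tokens per optimizer step
    return tokens_per_step * steps


# BS=4, GA=16, context 2,048, 7,972 pretraining steps
total = pretrain_tokens(micro_batch=4, grad_accum=16, context_length=2048, steps=7972)
print(f"{total:,}")  # 1,044,905,984
```

Under that assumption the result rounds to the 1.045B tokens listed for the pretrain phase, at 131,072 tokens per optimizer step.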

Model Ladder

Model	Params	Target Hardware	Status
Pico	49.9M	Raspberry Pi 4	Active
Nano	~100M	Raspberry Pi 5	Planned
Micro	~500M	Modern phones	Planned
Mini	~1B	Desktop	Planned

License

Apache 2.0
