Pipeowl-1.8.3-jp-Whitebox (Geometric Embedding)

A transformer-free semantic retrieval engine.

PipeOwl performs deterministic vocabulary scoring over a static embedding field:

score = α⋅base + (1 - α⋅base)⋅Δfield

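where:

  • base = cosine similarity in embedding space
  • Δfield = static scalar field bias
  • α = scalar weight on base (the sample output under Quickstart is consistent with α = 1)

Evaluation metric units:

  • BPB (bits per byte): measured per byte
  • token NLL: measured per token

token NLL: 12.943284891453972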
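The scoring rule score = α⋅base + (1 - α⋅base)⋅Δfield can be sketched in a few lines of Python. This is a minimal illustration, not the code in engine.py: the function name `pipeowl_score` is hypothetical, and the default α = 1 is an assumption chosen because it reproduces the `final` column of the sample output shown under Quickstart.

```python
def pipeowl_score(base: float, delta: float, alpha: float = 1.0) -> float:
    """score = alpha*base + (1 - alpha*base) * delta, as stated in this README.

    alpha defaults to 1.0, which is an assumption (not documented here).
    """
    scaled = alpha * base
    return scaled + (1.0 - scaled) * delta


# (word, base, delta) rows copied from the sample output for the query 東京.
rows = [("東京", 1.000, 0.478), ("は", -0.294, 0.907), ("大阪", 0.679, 0.346)]

for word, base, delta in rows:
    print(f"{word} | final={pipeowl_score(base, delta):.3f}")
```

With α = 1 the three rows print 1.000, 0.880 and 0.790, matching the sample output. The second row also shows why a word with negative cosine similarity can still rank highly: a large Δfield closes most of the gap between base and 1.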

Features:

  • O(n) over vocabulary.
  • No attention.
  • No transformer weights.
  • CPU-friendly (<16MB model)

Architecture

  • Static embedding table (V × D)
  • Aligned vocabulary index
  • Optional scalar bias field (Δfield)
  • Linear scoring
  • Pluggable decoder stage
  • Target: CPU environments and low-latency systems (e.g. IME, input method editors)
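To make the components listed under Architecture concrete, here is a toy sketch of a full-vocabulary scan with NumPy. It is an illustration under assumptions, not the contents of engine.py: the function `top_k`, the placeholder vocabulary, the random embeddings, the all-zero bias field and α = 1 are all hypothetical.

```python
import numpy as np


def top_k(query, vocab, index, emb, delta, k=3, alpha=1.0):
    """Score every vocabulary entry against `query` and return the k best."""
    # Normalise rows so that a dot product equals cosine similarity.
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    base = unit @ unit[index[query]]          # one pass over the V x D table
    scaled = alpha * base
    score = scaled + (1.0 - scaled) * delta   # linear scoring rule from this README
    best = np.argsort(-score)[:k]             # highest scores first
    return [(vocab[i], float(score[i])) for i in best]


# Toy stand-ins for the real files (hypothetical data, not the shipped model).
rng = np.random.default_rng(0)
vocab = ["w0", "w1", "w2", "w3", "w4"]
index = {word: i for i, word in enumerate(vocab)}   # aligned vocabulary index
emb = rng.normal(size=(len(vocab), 8)).astype(np.float32)
delta = np.zeros(len(vocab), dtype=np.float32)      # optional bias field, off here

results = top_k("w0", vocab, index, emb, delta, k=2)
print(results)
```

Because the bias field is zero in this toy setup, the score reduces to cosine similarity and the query word ranks first with a score of about 1. The cost is a single matrix-vector product, which is linear in the vocabulary size.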

Model Specs

| Item | Value |
|---|---|
| Vocab size | 26155 |
| Embedding dim | 256 |
| Storage format | safetensors (FP16) |
| Model size | ~13.2 MB |
| Languages | Japanese |
| Startup time | <1 s |
| Query latency | ~1 ms (CPU, full vocabulary scan) |
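As a back-of-envelope check on the specs above, the size of the embedding table alone can be computed from the vocabulary size, the embedding dimension and 2 bytes per FP16 value. The sketch ignores the safetensors header and any additional tensors (such as the bias field), so it is only an estimate.

```python
VOCAB_SIZE = 26155      # from Model Specs
EMBEDDING_DIM = 256     # from Model Specs
BYTES_PER_FP16 = 2      # FP16 stores each value in 2 bytes

table_bytes = VOCAB_SIZE * EMBEDDING_DIM * BYTES_PER_FP16

print(table_bytes)                    # 13391360
print(round(table_bytes / 1e6, 2))    # 13.39 (decimal MB)
print(round(table_bytes / 2**20, 2))  # 12.77 (MiB)
```

Both figures are in the same range as the listed ~13.2 MB and below the 16 MB bound given under Features.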

Quickstart

git clone https://huggingface.co/WangKaiLin/Pipeowl-1.8.3-jp-Whitebox
cd Pipeowl-1.8.3-jp-Whitebox

pip install numpy safetensors

python debug.py

Example semantic retrieval results:

Please enter words: 東京

Top-K Debug:
1 東京 | base=1.000 | delta=0.478 | final=1.000
2 は | base=-0.294 | delta=0.907 | final=0.880
3 大阪 | base=0.679 | delta=0.346 | final=0.790
4 パリ | base=0.597 | delta=0.419 | final=0.766
5 名古屋 | base=0.646 | delta=0.284 | final=0.747

Please enter words: 大阪

Top-K Debug:
1 大阪 | base=1.000 | delta=0.346 | final=1.000
2 は | base=-0.200 | delta=0.907 | final=0.889
3 東京 | base=0.679 | delta=0.478 | final=0.832
4 関西 | base=0.756 | delta=0.252 | final=0.817
5 尼崎 | base=0.710 | delta=0.367 | final=0.816

Repository Structure

Pipeowl-1.8.3-jp-Whitebox/
 ├ README.md
 ├ config.json
 ├ DATA_SOURCES.md
 ├ debug.py
 ├ LICENSE
 ├ quickstart.py
 ├ engine.py
 ├ vocabulary.json
 └ pipeowl_fp16.safetensors

LICENSE

MIT
