PipeOwl-1.11-bilingual (Geometric Embedding)

A transformer-free semantic retrieval engine.

Features:

O(n) scoring over the vocabulary (n = vocabulary size).
No attention.
No transformer weights.

Architecture

Static embedding table (V × D)
Aligned vocabulary index
Linear scoring
Pluggable decoder stage
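
The table lookup and linear scoring stages can be illustrated with a minimal sketch. It assumes mean pooling of the query's token vectors and cosine similarity as the score; neither is confirmed by this README, and the function name top_k_tokens and the toy data are hypothetical, not taken from engine.py.

```python
import numpy as np

def top_k_tokens(query_ids, table, vocab, k=5):
    """Score every vocabulary row against a pooled query vector (illustrative only)."""
    # Pool the query's token vectors into one vector (mean pooling is an assumption).
    q = table[query_ids].astype(np.float32).mean(axis=0)
    q /= np.linalg.norm(q) + 1e-12
    # Normalise rows so the dot product below is cosine similarity.
    t = table.astype(np.float32)
    t /= np.linalg.norm(t, axis=1, keepdims=True) + 1e-12
    # One dot product per vocabulary row: cost grows linearly with V.
    scores = t @ q
    top = np.argsort(-scores)[:k]
    return [(float(scores[i]), vocab[i]) for i in top]

# Toy 4-token vocabulary with 2-dimensional FP16 vectors (made-up values).
vocab = ["sleep", "sleeping", "want", "hello"]
table = np.array(
    [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]], dtype=np.float16
)
results = top_k_tokens([0], table, vocab, k=2)
for score, token in results:
    print(f"{score:.3f} | {token}")
```

Since there is no attention and no per-layer computation, the work per query is a single matrix-vector product followed by a top-k selection.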

Model Specs

item	value
vocabulary size (tokens)	524190
embedding dim	256
storage format	safetensors (FP16)
data size	~267 MB
languages	bilingual
startup time	~478 ms
query latency	~25-34 ms
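
As a rough sanity check on the figures above, the size of a V × D table in FP16 can be computed directly. This sketch assumes the embedding table is the only large tensor in the file and reads the listed token count as the number of table rows.

```python
vocab_size = 524190      # rows in the embedding table (from the specs above)
embedding_dim = 256      # columns in the embedding table
bytes_per_value = 2      # FP16 stores each value in 2 bytes

table_bytes = vocab_size * embedding_dim * bytes_per_value
print(table_bytes)                   # 268385280
print(round(table_bytes / 1e6, 1))   # size in megabytes (10**6 bytes)
```

The result, about 268 MB, is close to the reported ~267 MB.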

Quickstart

git clone https://huggingface.co/WangKaiLin/PipeOwl-1.11-bilingual
cd PipeOwl-1.11-bilingual

pip install numpy safetensors

python quickstart.py
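
The tensor names stored in pipeowl.safetensors are not listed in this README. Below is a minimal sketch for inspecting them before writing custom loading code. It uses only the standard library and relies on the safetensors file layout: an 8-byte little-endian header length followed by a JSON header. The helper names are hypothetical and are not part of engine.py.

```python
import json
import struct

def parse_header(prefix: bytes) -> dict:
    """Parse the JSON header from the leading bytes of a safetensors file."""
    # First 8 bytes: unsigned 64-bit little-endian length of the JSON header.
    (header_len,) = struct.unpack("<Q", prefix[:8])
    return json.loads(prefix[8:8 + header_len])

def read_header(path: str) -> dict:
    """Read only the header from disk, without loading tensor data."""
    with open(path, "rb") as f:
        size_bytes = f.read(8)
        (header_len,) = struct.unpack("<Q", size_bytes)
        return parse_header(size_bytes + f.read(header_len))

def tensor_summary(header: dict) -> dict:
    """Map each tensor name to its (dtype, shape), skipping file metadata."""
    return {
        name: (info["dtype"], info["shape"])
        for name, info in header.items()
        if name != "__metadata__"
    }

# Example (run inside the cloned repository):
# print(tensor_summary(read_header("pipeowl.safetensors")))
```

If the specs above hold, one of the listed tensors should have dtype F16 and a shape of 524190 by 256.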

Example semantic retrieval results (the prompt 請輸入句子 means "Enter a sentence"):

請輸入句子： 確實

Top-K Tokens:
1.000 | 確實
0.877 | 的確
0.839 | 确实
0.772 | 的确
0.757 | 事實上

請輸入句子： 今天好想睡覺

Top-K Tokens:
0.761 | 今天
0.747 | 今天的
0.694 | 睡觉
0.693 | 刚才
0.685 | 现在

請輸入句子： i want to sleep

Top-K Tokens:
0.719 | sleep
0.663 | schlafen
0.638 | want
0.616 | sleeping
0.616 | tidur

請輸入句子： 哈囉你好阿

Top-K Tokens:
0.825 | 哈囉
0.818 | 你好
0.769 | 嘿
0.759 | 嗨
0.750 | Kaixo

Repository Structure

PipeOwl-1.11-bilingual/
 ├ README.md
 ├ config.json
 ├ LICENSE
 ├ quickstart.py
 ├ engine.py
 ├ tokenizer.json
 └ pipeowl.safetensors

🔌 Optional: RAG Integration

PipeOwl can be combined with the RAG pipeline from PipeOwl-1.10.2-tw-wiki-rag.

You can reuse the wiki retrieval layer and plug it directly on top of PipeOwl-1.11 embeddings.

PipeOwl provides fast semantic token retrieval.

The RAG layer provides document-level grounding.

Integration Approach

Replace the embedding backend with PipeOwl-1.11

Keep the existing:

wiki index
entity layer
merge retriever

Use PipeOwl output tokens for query expansion.
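
The query expansion step can be sketched as follows. This is an illustration under stated assumptions, not the integration code from PipeOwl-1.10.2-tw-wiki-rag: the function expand_query, the score threshold, and the term limit are all hypothetical. The input pairs are copied from the first example in this README.

```python
def expand_query(query, top_k, min_score=0.8, max_extra=3):
    """Append high-scoring PipeOwl tokens to the original query terms.

    top_k is a list of (score, token) pairs as printed by the quickstart.
    min_score and max_extra are hypothetical tuning knobs.
    """
    terms = [query]
    ranked = sorted(top_k, key=lambda pair: pair[0], reverse=True)
    for score, token in ranked:
        if len(terms) - 1 >= max_extra:
            break  # enough expansion terms collected
        if score >= min_score and token not in terms:
            terms.append(token)
    return terms

# (score, token) pairs from the first example above.
top_k = [
    (1.000, "確實"),
    (0.877, "的確"),
    (0.839, "确实"),
    (0.772, "的确"),
    (0.757, "事實上"),
]
expanded = expand_query("確實", top_k)
print(expanded)  # ['確實', '的確', '确实']
```

The expanded term list would then be passed to the existing wiki index in place of the single original query; how the retriever merges results per term is left to the RAG layer.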

Result

⚡ Faster retrieval (edge-ready)
📚 Same wiki grounding capability
🧠 Better semantic recall with bilingual support

LICENSE

MIT
