vec2slug-v1-openai-small

Generate URL slugs directly from text embeddings, without re-feeding source text through a language model. Designed to piggyback on embeddings a system already has for search or deduplication.

| Property | Value |
| --- | --- |
| Parameters | 11.5M |
| Architecture | Transformer decoder, 4L, d=384 |
| Input | OpenAI text-embedding-3-small (1536d) |
| Vocab | BPE, 5000 subwords |
| Token F1 | 0.298 |
| ONNX size | 44.3 MiB |
| Inference (CPU) | ~21ms (M-series), ~89ms (budget VPS) |

The model is 14 to 19× faster and approximately 85× cheaper than a Haiku-class LLM call for the same task, including the cost of computing a fresh embedding. With existing embeddings (the intended use case), it is approximately 2,000× cheaper.

This is the smaller of two variants. It is recommended for most deployments: the larger model improves Token F1 by only 0.008 at 2× the inference cost.

See also: vec2slug-v1-openai-large

Quickstart

# install dependencies
pip install onnxruntime numpy

# or run directly with uv
uv run inference.py . --input embeddings.npy

ONNX inference from Python:

from inference import OnnxPredictor
import numpy as np

predictor = OnnxPredictor.from_dir(".")

# embeddings: [N, 1536] float32 from OpenAI text-embedding-3-small
embeddings = np.load("embeddings.npy").astype(np.float32)
slugs = predictor.predict(embeddings)
# ["how-neural-networks-learn", "climate-change-solutions", ...]

PyTorch inference (requires torch):

from inference import PyTorchPredictor

predictor = PyTorchPredictor.from_dir(".")
slugs = predictor.predict(embeddings)
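
If no embeddings are on disk yet, they can be computed with the OpenAI Python SDK and saved in the embeddings.npy layout used above. This is a minimal sketch, assuming the openai package (v1 client) is installed and OPENAI_API_KEY is set. The helper names as_embedding_matrix, embed_texts, and save_embeddings are hypothetical and not part of inference.py.

```python
import numpy as np

EMBED_MODEL = "text-embedding-3-small"
EMBED_DIM = 1536  # default output size of text-embedding-3-small


def as_embedding_matrix(vectors):
    """Stack per-text vectors into a float32 [N, 1536] array and check the shape."""
    matrix = np.asarray(vectors, dtype=np.float32)
    if matrix.ndim != 2 or matrix.shape[1] != EMBED_DIM:
        raise ValueError(f"expected shape [N, {EMBED_DIM}], got {matrix.shape}")
    return matrix


def embed_texts(texts):
    """Call the OpenAI embeddings endpoint (needs network access and an API key)."""
    from openai import OpenAI  # imported lazily so the shape check works offline

    client = OpenAI()
    response = client.embeddings.create(model=EMBED_MODEL, input=list(texts))
    return as_embedding_matrix([item.embedding for item in response.data])


def save_embeddings(texts, path="embeddings.npy"):
    """Embed the texts and write them where the quickstart commands expect them."""
    np.save(path, embed_texts(texts))
```

Calling save_embeddings(["How neural networks learn from data"]) would write a [1, 1536] array. The shape check exists because vectors from a different embedding model, or a shortened dimension setting, would not match the input the model was trained on.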

Examples

Predictions on held-out test samples (beam search, width 4). The model sees only the 1536-dim embedding, never the source text.

| Source text | Reference slug | Predicted slug |
| --- | --- | --- |
| Children's book about astronomy and living on Mars | can-we-live-on-mars | can-we-live-on-mars |
| Teaching resources for Martin Luther King Jr. Day | celebrating-martin-luther-king-jr-day | celebrating-martin-luther-king-jr-day |
| Article about Waldorf education practices | 12-things-may-not-know-waldorf-education | 10-things-you-didnt-know-about-waldorf-education |

The third example illustrates the typical case: the model captures the topic correctly but diverges in specific wording. The common failure mode is overgeneralization rather than incoherence.

How it works

The model is a prefix-conditioned transformer decoder. A precomputed text embedding is linearly projected into the decoder's hidden space and placed at position 0 as a prefix token. The decoder then autoregressively generates BPE subword tokens that form a kebab-case URL slug.
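
The prefix conditioning described above can be illustrated with a shape-level NumPy sketch. It is not the model's actual code: the weights are random stand-ins, positional encodings and the transformer layers are omitted, and the names W_proj, b_proj, token_table, build_decoder_input, and causal_mask are hypothetical. Only the sizes (1536, 384, 5000) come from this card.

```python
import numpy as np

EMBED_DIM = 1536   # text-embedding-3-small output size (from the model card)
HIDDEN_DIM = 384   # decoder width d (from the model card)
VOCAB_SIZE = 5000  # BPE vocabulary size (from the model card)

rng = np.random.default_rng(0)
# Random stand-ins for learned parameters; the real weights live in model.pt / model.onnx.
W_proj = rng.normal(scale=0.02, size=(EMBED_DIM, HIDDEN_DIM)).astype(np.float32)
b_proj = np.zeros(HIDDEN_DIM, dtype=np.float32)
token_table = rng.normal(scale=0.02, size=(VOCAB_SIZE, HIDDEN_DIM)).astype(np.float32)


def build_decoder_input(embedding, token_ids):
    """Return a [1 + T, HIDDEN_DIM] sequence: projected embedding first, then token vectors."""
    prefix = embedding @ W_proj + b_proj  # [HIDDEN_DIM], occupies position 0
    tokens = token_table[np.asarray(token_ids, dtype=np.int64)]  # [T, HIDDEN_DIM]
    return np.concatenate([prefix[None, :], tokens], axis=0)


def causal_mask(length):
    """Lower-triangular mask: position i attends to positions 0..i, so every token sees the prefix."""
    return np.tril(np.ones((length, length), dtype=bool))
```

Because the prefix sits at position 0 and the mask is causal, every generated token can attend to the embedding while the embedding itself attends to nothing later in the sequence.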

Beam search uses a bounded additive length reward with score-based optimal stopping (Huang et al. 2017). All decoding parameters are stored in model.json.
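
As an illustration of the decoding idea, here is a simplified toy beam search with a capped length reward and an early-stopping bound. It is one reading of the cited technique, not the decoder shipped in inference.py: step_fn, toy_step, and the parameter names reward, max_len, and hard_limit are hypothetical, and the real values are whatever model.json stores.

```python
import math

EOS = "<eos>"


def beam_search(step_fn, width, reward, max_len, hard_limit=32):
    """Return (score, tokens) for the best finished hypothesis.

    step_fn(tokens) -> {next_token: log_prob}; a hypothetical stand-in for the decoder.
    Finished score = sum of log-probs + reward * min(length, max_len).
    """
    beams = [(0.0, [])]  # unfinished hypotheses: (log-prob so far, tokens)
    best_done = None     # best finished hypothesis: (score, tokens)
    for _ in range(hard_limit):
        candidates = []
        for logp, tokens in beams:
            for token, token_logp in step_fn(tokens).items():
                new_logp = logp + token_logp
                if token == EOS:
                    score = new_logp + reward * min(len(tokens), max_len)
                    if best_done is None or score > best_done[0]:
                        best_done = (score, tokens)
                else:
                    candidates.append((new_logp, tokens + [token]))
        # All candidates have the same length, so ranking by log-prob is enough.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:width]
        if not beams:
            break
        if best_done is not None:
            # Log-probs only decrease and the reward is capped at reward * max_len,
            # so no unfinished hypothesis can ever finish above this bound.
            bound = beams[0][0] + reward * max_len
            if best_done[0] >= bound:
                break
    return best_done


def toy_step(tokens):
    """Toy next-token distribution used only to exercise the search."""
    if len(tokens) < 2:
        return {"a": math.log(0.6), "b": math.log(0.3), EOS: math.log(0.1)}
    return {"a": math.log(0.1), EOS: math.log(0.9)}


score, tokens = beam_search(toy_step, width=4, reward=0.5, max_len=3)
print(tokens)  # ['a', 'a']
```

The cap on the reward is what makes the stopping test sound: without it, a longer hypothesis could always collect more reward, and no finished hypothesis could be declared final.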

Files

| File | Description |
| --- | --- |
| model.onnx | ONNX model (forward pass only) |
| model.json | Sidecar: vocabulary, beam search config, stopwords |
| model.pt | PyTorch weights (state_dict) |
| tokenizer.json | BPE tokenizer (HuggingFace tokenizers format) |
| inference.py | Standalone inference script (uv run compatible) |
| manifest.train.json | Training configuration and results |
| manifest.onnx.json | Export verification (tolerance, argmax agreement) |
| history.train.jsonl | Training loss/metric curves |
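
The two checks named for manifest.onnx.json (tolerance and argmax agreement) can be reproduced on any pair of logit arrays. The sketch below is an assumption about what such a check looks like: compare_logits, its atol default, and the returned key names are hypothetical, and the actual fields of manifest.onnx.json are not documented here.

```python
import numpy as np


def compare_logits(reference, exported, atol=1e-4):
    """Compare reference (e.g. PyTorch) logits with exported (e.g. ONNX) logits."""
    reference = np.asarray(reference, dtype=np.float64)
    exported = np.asarray(exported, dtype=np.float64)
    if reference.shape != exported.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {exported.shape}")
    max_abs_diff = float(np.max(np.abs(reference - exported)))
    # Fraction of positions where both backends pick the same top token.
    same_top = reference.argmax(axis=-1) == exported.argmax(axis=-1)
    return {
        "max_abs_diff": max_abs_diff,
        "within_tolerance": max_abs_diff <= atol,
        "argmax_agreement": float(np.mean(same_top)),
    }
```

Argmax agreement matters more than the raw tolerance for generation: small numeric drift is harmless as long as both backends rank the same token first.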

Training

Trained on 2.3M documents from FineWeb-Edu with slugs extracted from source URLs. The extraction pipeline filters on language, slug format, Gopher repetition, and token count.
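
To make the extraction step concrete, here is a rough sketch of pulling a slug from a URL and applying the slug-format and token-count filters. It is not the actual pipeline: extract_slug, the regular expression, the stripped file extensions, and the MIN_TOKENS and MAX_TOKENS bounds are all hypothetical, and the language and Gopher repetition filters are left out.

```python
import re
from urllib.parse import urlparse

SLUG_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")  # kebab-case: words joined by single hyphens
MIN_TOKENS, MAX_TOKENS = 2, 12  # hypothetical bounds, not the values used in training


def extract_slug(url):
    """Return the last path segment as a kebab-case slug, or None if it fails the filters."""
    path = urlparse(url).path.rstrip("/")
    segment = path.rsplit("/", 1)[-1].lower()
    segment = re.sub(r"\.(html?|php|aspx?)$", "", segment)  # drop common file extensions
    if not SLUG_RE.match(segment):
        return None  # slug-format filter
    n_tokens = len(segment.split("-"))
    if not MIN_TOKENS <= n_tokens <= MAX_TOKENS:
        return None  # token-count filter
    return segment


print(extract_slug("https://example.com/blog/how-neural-networks-learn/"))
# how-neural-networks-learn
```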

The BPE vocabulary has 5,000 subwords, with the hyphen (-) as a special token. The model was trained for 30 epochs with label smoothing (0.1) and position-aware EOS loss weighting. The best checkpoint was at step 53,430.
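
For readers unfamiliar with label smoothing, the sketch below computes a smoothed cross-entropy for a single position. It assumes one common convention, in which the smoothing mass is spread uniformly over all classes. Whether training used exactly this variant is not stated here, and smoothed_cross_entropy is a hypothetical name. The position-aware EOS weighting is not sketched because its definition is not given.

```python
import math


def smoothed_cross_entropy(logits, target, smoothing=0.1):
    """Cross-entropy against a target that mixes the one-hot label with a uniform distribution."""
    peak = max(logits)  # subtract the max for a numerically stable log-softmax
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    log_probs = [x - log_z for x in logits]
    n_classes = len(logits)
    loss = 0.0
    for i, log_p in enumerate(log_probs):
        # Every class gets smoothing / n_classes; the true class also gets 1 - smoothing.
        weight = smoothing / n_classes + ((1.0 - smoothing) if i == target else 0.0)
        loss -= weight * log_p
    return loss
```

With smoothing set to 0 this reduces to ordinary cross-entropy. With 0.1, a confident and correct prediction is still penalized slightly, which discourages overconfident token distributions.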

Evaluation

Evaluated on 5,000 held-out test samples using the full beam search decoding pipeline.

| Metric | Value |
| --- | --- |
| Token F1 (macro) | 0.298 |
| Exact match | 1.9% |
| ROUGE-L | 0.277 |
| BERTScore F1 | 0.869 |
| Validity | 100% |
| Vocab diversity | 97.3% |

Token F1 splits both slugs on hyphens and computes set-overlap F1 (order ignored). ROUGE-L measures the longest common subsequence and penalizes misordered words. BERTScore computes contextual embedding similarity via roberta-large; the floor is high (~0.82) because short English slugs are not widely separated in that embedding space.
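
The Token F1 definition above is small enough to write out. This sketch follows the description (split on hyphens, set overlap, order ignored) and assumes that "macro" means the unweighted mean of per-sample scores. The function names are hypothetical and the evaluation script itself may differ in edge-case handling.

```python
def token_f1(reference, predicted):
    """Set-overlap F1 between two slugs after splitting on hyphens (order ignored)."""
    ref_tokens = set(reference.split("-"))
    pred_tokens = set(predicted.split("-"))
    overlap = len(ref_tokens & pred_tokens)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def macro_token_f1(pairs):
    """Unweighted mean of per-sample Token F1 over (reference, predicted) pairs."""
    scores = [token_f1(ref, pred) for ref, pred in pairs]
    return sum(scores) / len(scores)


waldorf = token_f1(
    "12-things-may-not-know-waldorf-education",
    "10-things-you-didnt-know-about-waldorf-education",
)
print(round(waldorf, 3))  # 0.533
```

For the Waldorf example, 4 tokens overlap (things, know, waldorf, education) out of 7 reference and 8 predicted tokens, giving precision 4/8, recall 4/7, and F1 8/15 ≈ 0.533 even though the slug is topically correct. This is why the macro score can be low while BERTScore stays high.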

Limitations

  • Requires precomputed embeddings from OpenAI text-embedding-3-small. Other embedding models will produce poor results.
  • Trained on English web content. Non-English or domain-specific text may produce generic or inaccurate slugs.
  • Slugs reflect patterns in the training URLs, which include SEO-influenced and editorially inconsistent sources.
  • The primary failure mode is overgeneralization: the model captures the topic but may miss specific angles or proper nouns (for example, asm instead of wasm for a WebAssembly article).

Citation

@misc{vec2slug2026,
  title={vec2slug: URL Slug Generation from Text Embeddings},
  author={Mahmoud, Bilal and {HASH}},
  year={2026},
  url={https://github.com/hashintel/labs}
}