OmniEM-EN v1

Recursive kernel-native sentence embedder. Components: GOAT-V (no Q/K) with per-head b/eps, Yat-MLP, a geodesic-momentum skip (exp-map + parallel transport on the unit hypersphere), PonderNet halting, and an anytime contrastive objective (MNRL, i.e. multiple negatives ranking loss, at every recursion step). A single tied recursive block (K=1, LMAX=6) is trained from scratch (no teacher) on English all-NLI (557k pairs); embeddings are warm-started from intfloat/multilingual-e5-small (multilingual XLM-R tokenizer). d=384.
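
The two geometric primitives named above, the exponential map and parallel transport on the unit hypersphere, can be sketched in NumPy as follows. This is a minimal illustration, not the code in omniem_model.py: the function names, the `beta` coefficient, and the way `momentum_step` combines velocity and update are assumptions made for the example.

```python
import numpy as np

def project_to_tangent(x, g):
    """Remove the component of g along unit vector x, leaving a tangent vector at x."""
    return g - np.dot(x, g) * x

def exp_map(x, v, eps=1e-12):
    """Move from unit vector x along tangent vector v by geodesic length ||v||."""
    theta = np.linalg.norm(v)
    if theta < eps:
        return x
    return np.cos(theta) * x + np.sin(theta) * (v / theta)

def parallel_transport(x, y, w):
    """Transport tangent vector w from x to y along their geodesic (requires y != -x)."""
    return w - (np.dot(y, w) / (1.0 + np.dot(x, y))) * (x + y)

def momentum_step(x, velocity, update, beta=0.9):
    """Hypothetical skip step: blend the carried velocity with a new update in the
    tangent space at x, move along the geodesic, then carry the velocity to the new point."""
    v = beta * velocity + project_to_tangent(x, update)
    y = exp_map(x, v)
    return y, parallel_transport(x, y, v)

rng = np.random.default_rng(0)
x = rng.normal(size=384)
x /= np.linalg.norm(x)                    # start on the unit hypersphere
update = 0.01 * rng.normal(size=384)      # stand-in for a block output
y, v_new = momentum_step(x, np.zeros(384), update)
```

The new point `y` stays on the sphere and the transported velocity `v_new` is tangent at `y`, so the state never needs an explicit renormalization between recursion steps.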

Key property: adaptive compute is ~free. The learned halting exits at E[depth]≈1.05, and depth 1 is the strongest exit; deeper recursion does not help (K=1 optimal).
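
For reference, an expected depth of this kind follows from the per-step halting probabilities in the usual PonderNet formulation: the chance of halting at step n is the step's conditional halting probability times the chance of not having halted earlier, with the last step absorbing any remaining mass. The sketch below assumes that formulation; the `lambdas` values are invented for illustration and are not taken from the trained model.

```python
import numpy as np

def halting_distribution(lambdas):
    """Turn conditional halting probabilities into a distribution over exit steps."""
    lambdas = np.asarray(lambdas, dtype=float)
    # survive[n] = probability of not having halted before step n
    survive = np.concatenate(([1.0], np.cumprod(1.0 - lambdas[:-1])))
    p = lambdas * survive
    p[-1] = survive[-1]  # the final step is a forced halt and takes the remaining mass
    return p

def expected_depth(lambdas):
    """Expected exit step, counting steps from 1."""
    p = halting_distribution(lambdas)
    steps = np.arange(1, len(p) + 1)
    return float(np.sum(steps * p))

# Hypothetical halting probabilities for LMAX=6 recursion steps.
lambdas = [0.96, 0.5, 0.5, 0.5, 0.5, 0.5]
p = halting_distribution(lambdas)
depth = expected_depth(lambdas)
```

When the first step's halting probability is close to 1, almost all probability mass sits on depth 1 and the expected depth stays just above 1, which is the regime the reported E[depth] describes.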

Benchmarks (Spearman correlation for STSB, SICK-R, and BIOSSES; nDCG@10 for SciFact; accuracy for Banking77)

| model | STSB | SICK-R | BIOSSES | SciFact nDCG@10 | Banking77 acc |
|---|---|---|---|---|---|
| OmniEM(depth1) | 0.7253 | 0.655 | 0.6865 | 0.3468 | 0.8521 |
| OmniEM(hard-exit) | 0.7165 | 0.6422 | 0.6865 | 0.3468 | 0.8521 |
| multilingual-e5-small | 0.8359 | 0.7863 | 0.8438 | 0.6694 | 0.8339 |
| all-MiniLM-L6-v2 | 0.8203 | 0.7758 | 0.8164 | 0.6451 | 0.9083 |
| bge-small-en-v1.5 | 0.8586 | 0.7941 | 0.8375 | 0.72 | 0.9064 |

Usage

The model classes (Student, GOATV, YatMLP) are defined in omniem_model.py. Load the checkpoint omniem_en_best.pt, a dict of the form {"model":state_dict,"config":...}; take the tokenizer and warm-start embeddings from intfloat/multilingual-e5-small; mean-pool the depth-1 (or hard-exit) output over tokens; then L2-normalize.
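
The pooling and normalization steps can be sketched independently of the model code. The example below uses NumPy with random arrays standing in for the depth-1 token states and the tokenizer's attention mask; the function name `mean_pool_normalize` and the array shapes are assumptions for illustration, and loading the checkpoint itself is not shown because the constructor in omniem_model.py is not described here.

```python
import numpy as np

def mean_pool_normalize(token_states, attention_mask, eps=1e-12):
    """Masked mean over tokens, then L2-normalize each sentence vector.

    token_states:   (batch, seq_len, d) float array of per-token outputs
    attention_mask: (batch, seq_len) array, 1 for real tokens and 0 for padding
    """
    mask = attention_mask[..., None].astype(token_states.dtype)
    summed = (token_states * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1.0, None)   # avoid division by zero
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, eps, None)

rng = np.random.default_rng(0)
token_states = rng.normal(size=(2, 5, 384))          # stand-in for model output, d=384
attention_mask = np.array([[1, 1, 1, 0, 0],
                           [1, 1, 1, 1, 1]])
embeddings = mean_pool_normalize(token_states, attention_mask)

# With unit-length embeddings, cosine similarity is a plain dot product.
similarity = float(embeddings[0] @ embeddings[1])
```

Masking before averaging keeps padding tokens from diluting the sentence vector, which matters when batching sentences of different lengths.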

Trained on Kaggle 2xT4. Full benchmark JSON: omniem_benchmarks.json.
