Ziv Embedder — Code Aware (ONNX)

This is an ONNX export of Alibaba-NLP/gte-modernbert-base, prepared for Ziv — a local semantic code search engine for Python repositories.

Ziv 0.4.0 uses this embedder to improve code-aware search quality. The model runs locally through onnxruntime, which keeps it lightweight, fast, and practical for offline developer workflows.

Why this model?

Ziv needs embeddings that work well for code search and code understanding while staying fully local. This model is designed to support that goal with:

Code-aware semantic search
Fast local inference
No cloud dependency
No API keys
ONNX runtime compatibility

Compared to a typical PyTorch-based embedding stack, this setup has fewer and smaller dependencies, which makes it easier to ship inside a local developer tool and is intended to make it more efficient to run there.

Usage with Ziv

ziv init --model code
ziv start

Model details

Property	Value
Base model	Alibaba-NLP/gte-modernbert-base
Model type	Text embedding
Embedding dimension	768
Max sequence length	8192
Runtime	onnxruntime
Primary use	Semantic code search / code understanding
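
As an illustration of what "semantic code search" means with these embeddings, here is a minimal sketch that ranks code snippets against a query by cosine similarity. It is not Ziv's actual ranking code, which this card does not describe. The function names, the toy index, and the 3-dimensional vectors are hypothetical stand-ins for real 768-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_snippets(query_vector, snippet_vectors, top_k=3):
    """Return the top_k (name, score) pairs, highest similarity first."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in snippet_vectors.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors (hypothetical); real embeddings have 768 values.
index = {
    "parse_config": [0.9, 0.1, 0.0],
    "send_email": [0.0, 0.2, 0.9],
    "load_settings": [0.8, 0.3, 0.1],
}
results = rank_snippets([1.0, 0.0, 0.0], index, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

If the embeddings are L2-normalized ahead of time, the cosine reduces to a plain dot product, which is a common choice for larger indexes.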

Files

File	Description
`model.onnx`	ONNX model weights and graph
`tokenizer.json`	Tokenizer vocabulary and rules
`tokenizer_config.json`	Tokenizer settings
`config.json`	Model architecture config
`1_Pooling/config.json`	Pooling configuration
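
For readers who want to use these files outside Ziv, the following sketch shows one way they could fit together with onnxruntime and the tokenizers library. It rests on several assumptions that should be checked against the actual files: that the ONNX graph takes `input_ids` and `attention_mask` as int64 inputs, that its first output holds per-token hidden states, and that `1_Pooling/config.json` uses the Sentence Transformers flag names (`pooling_mode_cls_token`, `pooling_mode_mean_tokens`). The helper names are hypothetical, and the final L2 normalization is a common convention rather than something this card specifies.

```python
import json
import math
from pathlib import Path


def pooling_mode(pooling_config):
    """Pick 'cls' or 'mean' from Sentence Transformers style flags (assumed keys)."""
    if pooling_config.get("pooling_mode_cls_token"):
        return "cls"
    if pooling_config.get("pooling_mode_mean_tokens"):
        return "mean"
    raise ValueError("unsupported pooling configuration")


def pool(token_vectors, attention_mask, mode):
    """Collapse per-token vectors (seq_len x dim) into a single vector."""
    if mode == "cls":
        return list(token_vectors[0])  # first token represents the sequence
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    return [sum(column) / len(kept) for column in zip(*kept)]


def l2_normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def embed(text, model_dir):
    """Embed one string using the files listed above (sketch, see assumptions)."""
    # Imported lazily so the pure-Python helpers work without these packages.
    import numpy as np
    import onnxruntime as ort
    from tokenizers import Tokenizer

    model_dir = Path(model_dir)
    tokenizer = Tokenizer.from_file(str(model_dir / "tokenizer.json"))
    tokenizer.enable_truncation(max_length=8192)  # max sequence length from the table
    session = ort.InferenceSession(
        str(model_dir / "model.onnx"), providers=["CPUExecutionProvider"]
    )
    config = json.loads((model_dir / "1_Pooling" / "config.json").read_text())

    encoding = tokenizer.encode(text)
    available = {
        "input_ids": encoding.ids,
        "attention_mask": encoding.attention_mask,
    }
    # Feed only the inputs the graph declares; assumes a batch of one.
    feeds = {
        graph_input.name: np.array([available[graph_input.name]], dtype=np.int64)
        for graph_input in session.get_inputs()
    }
    token_vectors = session.run(None, feeds)[0][0]  # assumed shape: (seq_len, 768)
    pooled = pool(token_vectors.tolist(), encoding.attention_mask, pooling_mode(config))
    return l2_normalize(pooled)
```

Calling `embed("def load_settings(path): ...", "path/to/model_dir")` should then return a list of 768 floats, provided the assumptions above hold for this export.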

Relation to the original model

This model is based on Alibaba-NLP/gte-modernbert-base, developed by Tongyi Lab, Alibaba Group.

This repository does not claim ownership of the original model weights or training recipe. It provides an ONNX-exported runtime version tailored for Ziv and local inference.

The original model and its concepts should be credited to:

Tongyi Lab, Alibaba Group
The gte-modernbert model authors
The broader Sentence Transformers ecosystem

License

This model is released under the Apache 2.0 License, consistent with the upstream model license.

Original model: Alibaba-NLP/gte-modernbert-base

Citation

If you use this model or the upstream base model in your work, please cite the original papers:

@inproceedings{zhang2024mgte,
  title={mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval},
  author={Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Wen and Dai, Ziqi and Tang, Jialong and Lin, Huan and Yang, Baosong and Xie, Pengjun and Huang, Fei and others},
  booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track},
  pages={1393--1412},
  year={2024}
}

@article{li2023towards,
  title={Towards general text embeddings with multi-stage contrastive learning},
  author={Li, Zehan and Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Pengjun and Zhang, Meishan},
  journal={arXiv preprint arXiv:2308.03281},
  year={2023}
}
