---
library_name: mlx
tags:
- mlx
- embeddings
- jina
- jina-code-embeddings
- feature-extraction
- code
pipeline_tag: feature-extraction
license: cc-by-nc-4.0
base_model: jinaai/jina-code-embeddings-0.5b
---
# Jina Code Embeddings 0.5B - MLX
MLX port of Jina AI's code embedding model for Apple Silicon.
## Installation

```bash
pip install mlx tokenizers huggingface_hub
```
## Usage

```python
import json

import mlx.core as mx
from tokenizers import Tokenizer

from model import JinaCodeEmbeddingModel

# Load config
with open("config.json") as f:
    config = json.load(f)

# Load model
model = JinaCodeEmbeddingModel(config)
weights = mx.load("model.safetensors")
model.load_weights(list(weights.items()))
mx.eval(model.parameters())

# Load tokenizer
tokenizer = Tokenizer.from_file("tokenizer.json")

# Encode a natural language query for code search
query_embeddings = model.encode(
    ["print hello world in python"],
    tokenizer,
    task="nl2code",
    prompt_type="query",
)

# Encode code passages
code_embeddings = model.encode(
    ["print('Hello World!')"],
    tokenizer,
    task="nl2code",
    prompt_type="passage",
)
mx.eval(query_embeddings, code_embeddings)
```
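Once you have query and passage embeddings, retrieval is a similarity ranking. A minimal sketch of that step, assuming the embeddings are (or are normalized to) unit length so cosine similarity reduces to a dot product. NumPy stand-in arrays are used here purely for illustration; with MLX you would apply the same operations to the arrays returned by `encode`:

```python
import numpy as np


def cosine_similarity(queries: np.ndarray, passages: np.ndarray) -> np.ndarray:
    """Cosine similarity matrix: one row per query, one column per passage.
    Normalization is applied defensively in case inputs are not unit-length."""
    q = queries / np.linalg.norm(queries, axis=-1, keepdims=True)
    p = passages / np.linalg.norm(passages, axis=-1, keepdims=True)
    return q @ p.T


# Toy stand-ins for real model output (real embeddings are 896-dimensional).
queries = np.array([[1.0, 0.0], [0.0, 1.0]])
passages = np.array([[1.0, 0.0], [1.0, 1.0]])

scores = cosine_similarity(queries, passages)
best = scores.argmax(axis=-1)  # index of the best-matching passage per query
```

Ranking passages by each row of `scores` gives the retrieval order for the corresponding query.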
## Task Types
Each task uses specific prefixes for queries and passages:
| Task | Query Prefix | Passage Prefix |
|---|---|---|
| nl2code | Find the most relevant code snippet given the following query: | Candidate code snippet: |
| qa | Find the most relevant answer given the following question: | Candidate answer: |
| code2code | Find an equivalent code snippet given the following code snippet: | Candidate code snippet: |
| code2nl | Find the most relevant comment given the following code snippet: | Candidate comment: |
| code2completion | Find the most relevant completion given the following start of code snippet: | Candidate completion: |
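The model's `encode` applies these instruction prefixes internally based on `task` and `prompt_type`. As a sketch of how the table maps to the final prompt strings, here is a hypothetical helper (the dictionary below covers only two of the five tasks; the names `TASK_PREFIXES` and `apply_prefix` are illustrative, not part of the model's API):

```python
# Hypothetical lookup mirroring the table above (two tasks shown for brevity).
TASK_PREFIXES = {
    "nl2code": {
        "query": "Find the most relevant code snippet given the following query: ",
        "passage": "Candidate code snippet: ",
    },
    "qa": {
        "query": "Find the most relevant answer given the following question: ",
        "passage": "Candidate answer: ",
    },
}


def apply_prefix(text: str, task: str, prompt_type: str) -> str:
    """Prepend the task/role-specific instruction to the raw input text."""
    return TASK_PREFIXES[task][prompt_type] + text


prompt = apply_prefix("print hello world in python", "nl2code", "query")
# -> "Find the most relevant code snippet given the following query: print hello world in python"
```

The key point is that queries and passages receive different prefixes, so the two sides of a retrieval pair are embedded with role-specific instructions.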
## Matryoshka Dimensions

Supports Matryoshka truncation of embeddings to dimensions 64, 128, 256, 512, or 896 (full):

```python
embeddings = model.encode(texts, tokenizer, task="nl2code", prompt_type="query", truncate_dim=256)
```
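`truncate_dim` handles this inside `encode`, but the underlying idea is simple: keep the first `dim` components and re-normalize. A NumPy sketch of that operation, assuming embeddings are compared with cosine similarity so unit length is restored after truncation:

```python
import numpy as np


def truncate_embeddings(emb: np.ndarray, dim: int) -> np.ndarray:
    """Keep the leading `dim` components, then re-normalize to unit length.
    Illustrative sketch of Matryoshka truncation; encode() does this itself."""
    truncated = emb[..., :dim]
    norms = np.linalg.norm(truncated, axis=-1, keepdims=True)
    return truncated / norms


# Toy full-dimension embeddings (the model's full dimension is 896).
full = np.random.default_rng(0).normal(size=(2, 896)).astype(np.float32)
small = truncate_embeddings(full, 256)
```

Smaller dimensions trade a little retrieval quality for faster similarity search and smaller vector indexes.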
## Model Details
- Architecture: Qwen2.5-Coder-0.5B
- Parameters: 0.49B
- Embedding dimension: 896
- Max sequence length: 32768 tokens
- Languages: 15+ programming languages
- Optimized for: Apple Silicon (M1/M2/M3/M4) with Metal acceleration
## Files

```
jina-code-embeddings-0.5b-mlx/
├── model.safetensors       # Model weights (float16)
├── model.py                # Model implementation
├── config.json             # Model configuration
├── tokenizer.json          # Tokenizer
├── tokenizer_config.json
├── vocab.json
├── merges.txt
└── README.md
```
## Citation

```bibtex
@misc{kryvosheieva2025efficientcodeembeddingscode,
      title={Efficient Code Embeddings from Code Generation Models},
      author={Daria Kryvosheieva and Saba Sturua and Michael G\"unther and Scott Martens and Han Xiao},
      year={2025},
      eprint={2508.21290},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.21290},
}
```
## License

CC BY-NC 4.0 (non-commercial use).
## Links
- Original PyTorch model: jinaai/jina-code-embeddings-0.5b
- 1.5B MLX variant: jinaai/jina-code-embeddings-1.5b-mlx
- Jina AI: https://jina.ai