---
license: cc-by-nc-4.0
base_model: jinaai/jina-code-embeddings-1.5b
tags:
- embeddings
- code
- gguf
- llama.cpp
- ollama
- vector-search
- retrieval
---

# 🧠 jina-code-embeddings-1.5b — GGUF

This repository provides **GGUF-format builds** of **Jina AI's `jina-code-embeddings-1.5b`** for efficient local inference with:

- llama.cpp
- LM Studio
- Ollama
- KoboldCpp
- any other GGUF-compatible runtime

These files let you run a **state-of-the-art code embedding model locally** on CPU or GPU without PyTorch.

## 🔹 Model files

| File | Description |
|------|-------------|
| `jina-code-embeddings-1.5b.gguf` | Full-precision conversion |

---

## 🔗 Original model

This is a **format conversion only** of the original Jina AI model.

**Upstream model:** https://huggingface.co/jinaai/jina-code-embeddings-1.5b

**Paper:** *Efficient Code Embeddings from Code Generation Models* (Kryvosheieva et al., 2025)

All model weights, training, and research belong to **Jina AI**. This repository only provides **GGUF format conversions** by **herMaster**.

---

## 🧩 What this model does

This is a **code embedding model**, not a chat LLM. It generates **vector embeddings** for:

- Text → Code search
- Code → Code similarity
- Code → Text explanation
- Code completion retrieval
- Technical Q&A

It supports **15+ programming languages** and produces **1536-dimensional embeddings** (which can be truncated to smaller sizes).

---

## ⚠️ Important: GGUF usage notes

Unlike the original Transformers release, GGUF runtimes **do not add the instruction prefixes for you and may not default to the required pooling**. To get correct embeddings you must:

1. Prepend the correct **instruction prefix** for your task
2. Run inference
3. Use the **last-token embedding** as the vector (last-token pooling)

### Example (NL → Code)

Query:

```markdown
Find the most relevant code snippet given the following query:
print hello world in python
```

Candidate code:

```python
Candidate code snippet:
print("Hello world")
```

If you do **not** include the instruction text, embedding quality will be significantly worse.

---

## 🛠 llama.cpp example (https://github.com/ggml-org/llama.cpp)

```bash
./llama-embedding \
  -m jina-code-embeddings-1.5b.gguf \
  --pooling last \
  -p "Find the most relevant code snippet given the following query: print hello world in python"
```

`--pooling last` selects the last-token pooling described above. This returns a 1536-dimensional vector you can store in FAISS, Qdrant, Milvus, etc.
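
## 🐍 Python example (llama-cpp-python)

The same prefix-then-pool recipe can be scripted. The sketch below is illustrative rather than official usage: it assumes a reasonably recent `llama-cpp-python` build that exposes the `pooling_type` argument and the `LLAMA_POOLING_TYPE_LAST` constant, and it reuses the instruction prefixes from the NL → Code example above.

```python
# Minimal sketch, assuming llama-cpp-python (pip install llama-cpp-python)
# with support for the pooling_type argument.
import numpy as np
import llama_cpp

llm = llama_cpp.Llama(
    model_path="jina-code-embeddings-1.5b.gguf",     # GGUF file from this repo
    embedding=True,                                   # run in embedding mode
    pooling_type=llama_cpp.LLAMA_POOLING_TYPE_LAST,   # last-token pooling, as required
    n_ctx=2048,
    verbose=False,
)

# Instruction prefixes taken from the NL -> Code example above.
QUERY_PREFIX = "Find the most relevant code snippet given the following query:\n"
DOC_PREFIX = "Candidate code snippet:\n"

def embed(text: str) -> np.ndarray:
    # create_embedding returns {"data": [{"embedding": [...]}, ...], ...}
    vec = llm.create_embedding(text)["data"][0]["embedding"]
    return np.asarray(vec, dtype=np.float32)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query_vec = embed(QUERY_PREFIX + "print hello world in python")
code_vec = embed(DOC_PREFIX + 'print("Hello world")')

print(query_vec.shape)               # (1536,)
print(cosine(query_vec, code_vec))   # higher = more relevant
```

Vectors produced this way can be compared with cosine similarity, as above, or stored directly in FAISS, Qdrant, or Milvus.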
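
### Truncating embeddings

As noted above, the 1536-dimensional vectors can be truncated when you need smaller indexes. A minimal sketch, assuming the usual slice-and-renormalize approach (validate retrieval quality on your own data before settling on a size):

```python
import numpy as np

def truncate(vec: np.ndarray, dims: int = 512) -> np.ndarray:
    # Keep the leading dimensions, then re-normalize so cosine similarity still works.
    short = np.asarray(vec[:dims], dtype=np.float32)
    return short / np.linalg.norm(short)

# e.g. truncate(query_vec, 512) with the vector from the sketch above -> shape (512,)
```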

## 📜 License

This model is licensed under:

> Creative Commons Attribution-NonCommercial 4.0 (CC-BY-NC-4.0)

You may:

- Use it for research
- Use it for personal projects
- Share it freely

You may not:

- Use it in commercial products
- Run it in paid APIs or SaaS
- Sell access to it

This license is inherited from the original Jina AI release.

## 🙏 Credits

- Model & training: Jina AI
- GGUF conversion: herMaster

All model weights, architecture, and training data belong to Jina AI. This repository only provides format-converted GGUF files for easier local inference.

If you use this model in academic or technical work, please cite the original Jina AI paper:

> Efficient Code Embeddings from Code Generation Models
> Daria Kryvosheieva, Saba Sturua, Michael Günther, Scott Martens, Han Xiao (2025)

This ensures proper credit is given to the original authors and helps support continued research in high-quality code embeddings.