---
license: cc-by-nc-4.0
base_model: jinaai/jina-code-embeddings-1.5b
tags:
- embeddings
- code
- gguf
- llama.cpp
- ollama
- vector-search
- retrieval
---
# 🧠 jina-code-embeddings-1.5b – GGUF
This repository provides **GGUF-format builds** of
**Jina AI's `jina-code-embeddings-1.5b`** for efficient local inference using:
- llama.cpp
- LM Studio
- Ollama
- KoboldCpp
- any GGUF-compatible runtime
These files allow you to run a **state-of-the-art code embedding model locally** on CPU or GPU without PyTorch.
## 🔹 Model files
| File | Description |
|------|------------|
| `jina-code-embeddings-1.5b.gguf` | Full precision conversion |
---
## 🔗 Original model
This is a **format conversion only** of the original Jina AI model:
**Upstream model:**
https://huggingface.co/jinaai/jina-code-embeddings-1.5b
**Paper:**
*Efficient Code Embeddings from Code Generation Models* (Kryvosheieva et al., 2025)
All model weights, training, and research belong to **Jina AI**.
This repository only provides **GGUF format conversions** by **herMaster**.
---
## 🧩 What this model does
This is a **code embedding model**, not a chat LLM.
It generates **vector embeddings** for:
- Text → Code search
- Code → Code similarity
- Code → Text explanation
- Code completion retrieval
- Technical Q&A
It supports **15+ programming languages** and produces **1536-dimensional embeddings** (which can be truncated for smaller vectors).
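Truncating Matryoshka-style embeddings typically means slicing off the leading components and re-normalizing before computing cosine similarity. A minimal sketch under that assumption (the 4-dimensional vector is a stand-in for a real 1536-dimensional embedding):

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and re-normalize to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

full = [0.5, 0.5, 0.5, 0.5]          # stand-in for a 1536-d embedding
small = truncate_embedding(full, 2)  # 2-d unit vector
```

Re-normalization matters because cosine similarity assumes comparable vector norms; a truncated-but-unnormalized vector would skew scores.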
---
## ⚠️ Important: GGUF usage notes
Unlike the original Transformers version, GGUF engines **do not apply instruction prefixes or pooling automatically**.
To get correct embeddings you must:
1. Add the correct **instruction prefix**
2. Run inference
3. Use the **last token embedding** as the vector
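Step 1 is easy to get wrong in client code. As a minimal sketch, prefixing for the NL → Code task can be a pair of helpers (the prefix strings below are taken from the usage example that follows; other tasks use different prefixes, documented on the upstream Jina AI model card):

```python
# Instruction prefixes for the NL -> Code retrieval task. These strings come
# from the usage example in this README; other tasks (code -> code,
# code -> text, ...) use different prefixes per the upstream model card.
QUERY_PREFIX = "Find the most relevant code snippet given the following query:\n"
PASSAGE_PREFIX = "Candidate code snippet:\n"

def format_query(text: str) -> str:
    """Prepend the query-side instruction prefix before embedding."""
    return QUERY_PREFIX + text

def format_passage(code: str) -> str:
    """Prepend the passage-side instruction prefix before embedding."""
    return PASSAGE_PREFIX + code
```

The prefixed strings, not the raw text, are what you pass to your GGUF runtime.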
### Example (NL → Code)
Query input (instruction prefix + query):
```text
Find the most relevant code snippet given the following query:
print hello world in python
```
Candidate input (instruction prefix + code):
```text
Candidate code snippet:
print("Hello world")
```
If you do **not** include the instruction text, embedding quality will be significantly worse.
---
## 🛠 llama.cpp example (https://github.com/ggml-org/llama.cpp)
```bash
./llama-embedding \
  -m jina-code-embeddings-1.5b.gguf \
  --pooling last \
  -p "Find the most relevant code snippet given the following query:
print hello world in python"
```
Note the `--pooling last` flag, which selects the last-token embedding as described above.
This returns a 1536-dimensional vector you can store in FAISS, Qdrant, Milvus, etc.
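Once vectors are stored, retrieval is cosine-similarity ranking over the candidates. A minimal dependency-free sketch (the 3-dimensional vectors are hypothetical stand-ins for real 1536-dimensional llama-embedding output):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical stand-ins for real embeddings of prefixed inputs.
query = [0.1, 0.9, 0.2]
candidates = {
    "snippet_a": [0.1, 0.8, 0.3],
    "snippet_b": [0.9, 0.1, 0.0],
}
best = max(candidates, key=lambda name: cosine(query, candidates[name]))
```

A vector database such as FAISS or Qdrant performs the same ranking at scale with approximate nearest-neighbor indexes.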
## 📜 License
This model is licensed under:
> Creative Commons Attribution-NonCommercial 4.0 (CC-BY-NC-4.0)
You may:
- Use it for research
- Use it for personal projects
- Share it freely
You may not:
- Use it in commercial products
- Run it in paid APIs or SaaS
- Sell access to it
This license is inherited from the original Jina AI release.
## πŸ™ Credits
- Model & training: Jina AI
- GGUF conversion: herMaster
All model weights, architecture, and training data belong to Jina AI.
This repository only provides format-converted GGUF files for easier local inference.
If you use this model in academic or technical work, please cite the original Jina AI paper:
> Efficient Code Embeddings from Code Generation Models
> Daria Kryvosheieva, Saba Sturua, Michael Günther, Scott Martens, Han Xiao (2025)
This ensures proper credit is given to the original authors and helps support continued research in high-quality code embeddings.