---
license: cc-by-nc-4.0
base_model: jinaai/jina-code-embeddings-1.5b
tags:
- embeddings
- code
- gguf
- llama.cpp
- ollama
- vector-search
- retrieval
---

# 🧠 jina-code-embeddings-1.5b — GGUF

This repository provides **GGUF-format builds** of **Jina AI's `jina-code-embeddings-1.5b`** for efficient local inference with:

- llama.cpp
- LM Studio
- Ollama
- KoboldCpp
- any other GGUF-compatible runtime

These files let you run a **state-of-the-art code embedding model locally** on CPU or GPU without PyTorch.

## 🔹 Model files

| File | Description |
|------|-------------|
| `jina-code-embeddings-1.5b.gguf` | Full-precision conversion |

---

## 🔗 Original model

This is a **format conversion only** of the original Jina AI model.

**Upstream model:** https://huggingface.co/jinaai/jina-code-embeddings-1.5b

**Paper:** *Efficient Code Embeddings from Code Generation Models* (Kryvosheieva et al., 2025)

All model weights, training, and research belong to **Jina AI**. This repository only provides **GGUF format conversions** by **herMaster**.

---

## 🧩 What this model does

This is a **code embedding model**, not a chat LLM. It generates **vector embeddings** for:

- Text → Code search
- Code → Code similarity
- Code → Text explanation
- Code completion retrieval
- Technical Q&A

It supports **15+ programming languages** and produces **1536-dimensional embeddings** (which can be truncated to smaller sizes).

---

## ⚠️ Important: GGUF usage notes

Unlike the original Transformers release, GGUF runtimes **do not add the instruction prefixes for you and may not default to the required pooling**. To get correct embeddings you must:

1. Prepend the correct **instruction prefix** for your task
2. Run inference
3. Use the **last-token embedding** as the vector (last-token pooling)

### Example (NL → Code)

Query:

```markdown
Find the most relevant code snippet given the following query:
print hello world in python
```

Candidate code:

```python
Candidate code snippet:
print("Hello world")
```

If you do **not** include the instruction text, embedding quality will be significantly worse.

---

## 🛠 llama.cpp example (https://github.com/ggml-org/llama.cpp)

```bash
./llama-embedding \
  -m jina-code-embeddings-1.5b.gguf \
  --pooling last \
  -p "Find the most relevant code snippet given the following query: print hello world in python"
```

`--pooling last` selects the last-token pooling described above. This returns a 1536-dimensional vector you can store in FAISS, Qdrant, Milvus, etc.
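
## 🐍 Python example (llama-cpp-python)

The same prefix-then-pool recipe can be scripted. The sketch below is illustrative rather than official usage: it assumes a reasonably recent `llama-cpp-python` build that exposes the `pooling_type` argument and the `LLAMA_POOLING_TYPE_LAST` constant, and it reuses the instruction prefixes from the NL → Code example above.

```python
# Minimal sketch, assuming llama-cpp-python (pip install llama-cpp-python)
# with support for the pooling_type argument.
import numpy as np
import llama_cpp

llm = llama_cpp.Llama(
    model_path="jina-code-embeddings-1.5b.gguf",     # GGUF file from this repo
    embedding=True,                                   # run in embedding mode
    pooling_type=llama_cpp.LLAMA_POOLING_TYPE_LAST,   # last-token pooling, as required
    n_ctx=2048,
    verbose=False,
)

# Instruction prefixes taken from the NL -> Code example above.
QUERY_PREFIX = "Find the most relevant code snippet given the following query:\n"
DOC_PREFIX = "Candidate code snippet:\n"

def embed(text: str) -> np.ndarray:
    # create_embedding returns {"data": [{"embedding": [...]}, ...], ...}
    vec = llm.create_embedding(text)["data"][0]["embedding"]
    return np.asarray(vec, dtype=np.float32)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query_vec = embed(QUERY_PREFIX + "print hello world in python")
code_vec = embed(DOC_PREFIX + 'print("Hello world")')

print(query_vec.shape)               # (1536,)
print(cosine(query_vec, code_vec))   # higher = more relevant
```

Vectors produced this way can be compared with cosine similarity, as above, or stored directly in FAISS, Qdrant, or Milvus.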
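
### Truncating embeddings

As noted above, the 1536-dimensional vectors can be truncated when you need smaller indexes. A minimal sketch, assuming the usual slice-and-renormalize approach (validate retrieval quality on your own data before settling on a size):

```python
import numpy as np

def truncate(vec: np.ndarray, dims: int = 512) -> np.ndarray:
    # Keep the leading dimensions, then re-normalize so cosine similarity still works.
    short = np.asarray(vec[:dims], dtype=np.float32)
    return short / np.linalg.norm(short)

# e.g. truncate(query_vec, 512) with the vector from the sketch above -> shape (512,)
```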

## 📜 License

This model is licensed under:

> Creative Commons Attribution-NonCommercial 4.0 (CC-BY-NC-4.0)

You may:

- Use it for research
- Use it for personal projects
- Share it freely

You may not:

- Use it in commercial products
- Run it in paid APIs or SaaS
- Sell access to it

This license is inherited from the original Jina AI release.

## 🙏 Credits

- Model & training: Jina AI
- GGUF conversion: herMaster

All model weights, architecture, and training data belong to Jina AI. This repository only provides format-converted GGUF files for easier local inference.

If you use this model in academic or technical work, please cite the original Jina AI paper:

> Efficient Code Embeddings from Code Generation Models
> Daria Kryvosheieva, Saba Sturua, Michael Günther, Scott Martens, Han Xiao (2025)

This ensures proper credit is given to the original authors and helps support continued research in high-quality code embeddings.