---
license: cc-by-nc-sa-4.0
tags:
- sentence-transformers
- transformers
- splade
- sparse-encoder
- code
pipeline_tag: feature-extraction
---
SPLADE-Code-06B is a sparse retrieval model designed for code retrieval tasks. It is the top-performing model on MTEB among models below 1B parameters (at the time of writing, Feb 2026).
## Usage
### Using Sentence Transformers
Install Sentence Transformers:
```bash
pip install sentence_transformers
```
```python
from sentence_transformers import SparseEncoder
model = SparseEncoder("naver/splade-code-06B", trust_remote_code=True)
queries = [
"SELECT *\nFROM Student\nWHERE Age = (\nSELECT MAX(Age)\nFROM Student\nWHERE Group = 'specific_group'\n)\nAND Group = 'specific_group';"
]
query_embeddings = model.encode(queries)
print(query_embeddings.shape)
# torch.Size([1, 151936])
sparsity = model.sparsity(query_embeddings)
print(sparsity)
# {'active_dims': 1231.0, 'sparsity_ratio': 0.991897904380792}
decoded = model.decode(query_embeddings, top_k=10)
print(decoded)
# [[
#     ("Ġgroup", 2.34375),
#     ("Ġage", 2.34375),
#     ("ĠAge", 2.34375),
#     ("ĠStudent", 2.296875),
#     ("Ġspecific", 2.296875),
#     ("_group", 2.296875),
#     ("ĠMax", 2.21875),
#     ("Ġmax", 2.21875),
#     ("Ġstudent", 2.203125),
#     ("ĠGroup", 2.1875),
# ]]
```
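The sparsity figures and the way such sparse vectors are typically scored can be illustrated without loading the model. The following is a minimal sketch, assuming sparse vectors are represented as plain `{token: weight}` dictionaries and that relevance is a dot product over shared tokens, as is usual for SPLADE-style models. The helper names `sparsity_ratio` and `sparse_dot` and all token weights are hypothetical, chosen for illustration only.

```python
def sparsity_ratio(active_dims: int, vocab_size: int) -> float:
    """Fraction of vocabulary dimensions that are zero."""
    return 1.0 - active_dims / vocab_size


def sparse_dot(query: dict[str, float], doc: dict[str, float]) -> float:
    """Dot product restricted to the tokens both sparse vectors activate."""
    # Iterate over the smaller map so the cost follows the sparser side.
    small, large = (query, doc) if len(query) <= len(doc) else (doc, query)
    return sum((w * large[t] for t, w in small.items() if t in large), 0.0)


# Figures taken from the example output above: 1231 active dimensions
# out of a 151936-dimensional vocabulary space.
ratio = sparsity_ratio(active_dims=1231, vocab_size=151936)
print(f"{ratio:.6f}")

# Toy weights (illustrative, not produced by the model).
query_vec = {"age": 2.0, "student": 1.5, "max": 1.0}
doc_a = {"student": 1.0, "age": 0.5, "table": 2.0}
doc_b = {"sort": 2.0, "list": 1.0}

score_a = sparse_dot(query_vec, doc_a)  # shares "age" and "student"
score_b = sparse_dot(query_vec, doc_b)  # shares no tokens
print(score_a, score_b)
```

Because only overlapping tokens contribute, a document that shares no activated tokens with the query scores zero, which is what makes inverted-index retrieval over these vectors practical.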
### Using Transformers
```bash
pip install transformers
```
```python
from transformers import AutoModelForCausalLM
import torch
splade = AutoModelForCausalLM.from_pretrained("naver/splade-code-06B", trust_remote_code=True)
device = (torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu"))
splade.to(device)
splade.eval()
queries = ["SELECT *\nFROM Student\nWHERE Age = (\nSELECT MAX(Age)\nFROM Student\nWHERE Group = 'specific_group'\n)\nAND Group = 'specific_group';"]
bow_dict = splade.encode(queries, prompt_type="query", top_k_q=10, return_dict=True, print_dict=True)
```
```
+--------------------------------------------------------------------+
| TOP ACTIVATED WORDS |
+--------------------------------------------------------------------+
* INPUT: SELECT *
FROM Student
WHERE Age = (
SELECT MAX(Age)
FROM Student
WHERE Group = 'specific_group'
)
AND Group = 'specific_group';
Ġgroup | ████████████████████ 2.34
Ġage | ███████████████████ 2.33
ĠAge | ███████████████████ 2.33
_group | ███████████████████ 2.30
ĠStudent | ███████████████████ 2.30
Ġspecific | ███████████████████ 2.28
Ġmax | ██████████████████ 2.22
ĠMax | ██████████████████ 2.22
Ġstudent | ██████████████████ 2.20
ĠGroup | ██████████████████ 2.19
```
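The `top_k_q=10` argument limits the printed output to the ten most strongly activated tokens. As a rough sketch of what such a top-k selection amounts to, assuming the activations are available as a `{token: weight}` dictionary (the actual structure of `bow_dict` is not documented here), one could write the following. The helper `top_k_terms` and the toy weights are hypothetical.

```python
import heapq


def top_k_terms(weights: dict[str, float], k: int) -> list[tuple[str, float]]:
    """Return the k highest-weighted (token, weight) pairs, heaviest first."""
    return heapq.nlargest(k, weights.items(), key=lambda item: item[1])


# Toy activations (illustrative values, not the model's real output).
toy_weights = {
    "group": 2.34,
    "age": 2.33,
    "student": 2.30,
    "max": 2.22,
    "where": 0.40,
    "select": 0.15,
}

top3 = top_k_terms(toy_weights, k=3)
for token, weight in top3:
    print(f"{token:<10} {weight:.2f}")
```

Keeping only the heaviest terms trades a small amount of recall for shorter query vectors, which is a common choice when queries are scored against an inverted index.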