---
license: cc-by-nc-sa-4.0
tags:
  - sentence-transformers
  - transformers
  - splade
  - sparse-encoder
  - code
pipeline_tag: feature-extraction
---

SPLADE-Code-06B is a sparse retrieval model designed for code retrieval tasks. At the time of writing (February 2026), it is the top-performing model on MTEB among models below 1B parameters.

## Usage

### Using Sentence Transformers

Install Sentence Transformers:
```bash
pip install sentence-transformers
```

```python
from sentence_transformers import SparseEncoder

model = SparseEncoder("naver/splade-code-06B", trust_remote_code=True)

queries = [
    "SELECT *\nFROM Student\nWHERE Age = (\nSELECT MAX(Age)\nFROM Student\nWHERE Group = 'specific_group'\n)\nAND Group = 'specific_group';"
]

query_embeddings = model.encode(queries)
print(query_embeddings.shape)
# torch.Size([1, 151936])

sparsity = model.sparsity(query_embeddings)
print(sparsity)
# {'active_dims': 1231.0, 'sparsity_ratio': 0.991897904380792}

decoded = model.decode(query_embeddings, top_k=10)
print(decoded)
# [[
#     ("Ġgroup", 2.34375),
#     ("Ġage", 2.34375),
#     ("ĠAge", 2.34375),
#     ("ĠStudent", 2.296875),
#     ("Ġspecific", 2.296875),
#     ("_group", 2.296875),
#     ("ĠMax", 2.21875),
#     ("Ġmax", 2.21875),
#     ("Ġstudent", 2.203125),
#     ("ĠGroup", 2.1875),
# ]]
```
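The `active_dims` and `sparsity_ratio` values reported above are related by a simple identity: the ratio is the fraction of the 151,936 vocabulary dimensions that are exactly zero. A quick sanity check of that arithmetic (pure Python, no model download required):

```python
vocab_size = 151936   # embedding dimensionality reported by the model above
active_dims = 1231    # non-zero entries in the example query embedding

# sparsity_ratio = fraction of dimensions that are exactly zero
sparsity_ratio = 1 - active_dims / vocab_size
print(round(sparsity_ratio, 6))  # 0.991898
```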

### Using Transformers

```bash
pip install transformers
```

```python
import torch
from transformers import AutoModelForCausalLM

splade = AutoModelForCausalLM.from_pretrained("naver/splade-code-06B", trust_remote_code=True)
device = (torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu"))
splade.to(device)
splade.eval()
queries = ["SELECT *\nFROM Student\nWHERE Age = (\nSELECT MAX(Age)\nFROM Student\nWHERE Group = 'specific_group'\n)\nAND Group = 'specific_group';"]
bow_dict = splade.encode(queries, prompt_type="query", top_k_q=10, return_dict=True, print_dict=True)
```

```
+--------------------------------------------------------------------+
|                        TOP ACTIVATED WORDS                         |
+--------------------------------------------------------------------+


* INPUT: SELECT *
FROM Student
WHERE Age = (
SELECT MAX(Age)
FROM Student
WHERE Group = 'specific_group'
)
AND Group = 'specific_group';

Ġgroup                     | ████████████████████ 2.34
Ġage                       | ███████████████████ 2.33
ĠAge                       | ███████████████████ 2.33
_group                     | ███████████████████ 2.30
ĠStudent                   | ███████████████████ 2.30
Ġspecific                  | ███████████████████ 2.28
Ġmax                       | ██████████████████ 2.22
ĠMax                       | ██████████████████ 2.22
Ġstudent                   | ██████████████████ 2.20
ĠGroup                     | ██████████████████ 2.19
```
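With embeddings this sparse, query–document scoring reduces to a dot product over the few dimensions active in both vectors, which is what makes inverted-index retrieval practical for SPLADE-style models. A minimal sketch with toy token-weight dictionaries (illustrative weights, not actual model outputs):

```python
# Toy SPLADE-style representations: token -> weight, keeping only active dimensions.
query = {"Ġgroup": 2.34, "Ġage": 2.34, "ĠStudent": 2.30, "Ġmax": 2.22}
docs = {
    "doc_sql": {"Ġgroup": 1.9, "Ġage": 1.7, "ĠStudent": 1.5, "Ġselect": 1.2},
    "doc_py":  {"Ġdef": 2.1, "Ġreturn": 1.8, "Ġmax": 0.9},
}

def sparse_dot(q, d):
    # Only tokens active in both vectors contribute to the score.
    return sum(w * d[t] for t, w in q.items() if t in d)

scores = {name: sparse_dot(query, d) for name, d in docs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # ['doc_sql', 'doc_py']
```

In a production setting the same computation is served by an inverted index keyed on token ids, so each query only touches postings lists for its active dimensions.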