| --- |
| license: cc-by-nc-sa-4.0 |
| tags: |
| - sentence-transformers |
| - transformers |
| - splade |
| - sparse-encoder |
| - code |
| pipeline_tag: feature-extraction |
| --- |
| |
| SPLADE-Code-06B is a sparse retrieval model designed for code retrieval tasks. It is the top-performing model on MTEB among models below 1B parameters (at the time of writing, Feb 2026). |
|
|
| ## Usage |
|
|
| ### Using Sentence Transformers |
|
|
| Install Sentence Transformers: |
| ```bash |
| pip install -U sentence-transformers |
| ``` |
|
|
| ```python |
| from sentence_transformers import SparseEncoder |
| |
| model = SparseEncoder("naver/splade-code-06B", trust_remote_code=True) |
| |
| queries = [ |
| "SELECT *\nFROM Student\nWHERE Age = (\nSELECT MAX(Age)\nFROM Student\nWHERE Group = 'specific_group'\n)\nAND Group = 'specific_group';" |
| ] |
| |
| query_embeddings = model.encode(queries) |
| print(query_embeddings.shape) |
| # torch.Size([1, 151936]) |
| |
| sparsity = model.sparsity(query_embeddings) |
| print(sparsity) |
| # {'active_dims': 1231.0, 'sparsity_ratio': 0.991897904380792} |
| |
| decoded = model.decode(query_embeddings, top_k=10) |
| print(decoded) |
| # [[ |
| # ("Ġgroup", 2.34375), |
| # ("Ġage", 2.34375), |
| # ("ĠAge", 2.34375), |
| # ("ĠStudent", 2.296875), |
| # ("Ġspecific", 2.296875), |
| # ("_group", 2.296875), |
| # ("ĠMax", 2.21875), |
| # ("Ġmax", 2.21875), |
| # ("Ġstudent", 2.203125), |
| # ("ĠGroup", 2.1875), |
| # ]] |
| ``` |
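
The sparsity figures printed above can be checked by hand. The following is a minimal sketch, assuming the ratio is defined as the fraction of zero-valued dimensions (one minus active dimensions over vocabulary size), which is consistent with the reported numbers; the helper name `sparsity_ratio` is hypothetical and not part of the Sentence Transformers API.

```python
# Values copied from the example output above.
VOCAB_SIZE = 151936   # embedding dimension printed by query_embeddings.shape
ACTIVE_DIMS = 1231    # 'active_dims' reported by model.sparsity


def sparsity_ratio(active_dims: int, vocab_size: int) -> float:
    """Fraction of dimensions that are zero (hypothetical helper, assumed definition)."""
    return 1.0 - active_dims / vocab_size


ratio = sparsity_ratio(ACTIVE_DIMS, VOCAB_SIZE)
print(f"{ratio:.6f}")  # 0.991898
```

In other words, fewer than 1% of the vocabulary-sized dimensions are active for this query, which is what makes the representation suitable for inverted-index style retrieval.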
|
|
| ### Using Transformers |
|
|
| ```bash |
| pip install transformers |
| ``` |
|
|
| ```python |
| import torch |
| from transformers import AutoModelForCausalLM |
| |
| # trust_remote_code=True loads the custom model code (including encode) from the repository. |
| splade = AutoModelForCausalLM.from_pretrained("naver/splade-code-06B", trust_remote_code=True) |
| |
| # Run on GPU when available, otherwise fall back to CPU. |
| device = torch.device("cuda" if torch.cuda.is_available() else "cpu") |
| splade.to(device) |
| splade.eval() |
| |
| queries = ["SELECT *\nFROM Student\nWHERE Age = (\nSELECT MAX(Age)\nFROM Student\nWHERE Group = 'specific_group'\n)\nAND Group = 'specific_group';"] |
| |
| # Keep the 10 highest-weighted query tokens and print them as a table. |
| bow_dict = splade.encode(queries, prompt_type="query", top_k_q=10, return_dict=True, print_dict=True) |
| ``` |
|
|
| ``` |
| +--------------------------------------------------------------------+ |
| | TOP ACTIVATED WORDS | |
| +--------------------------------------------------------------------+ |
| |
| |
| * INPUT: SELECT * |
| FROM Student |
| WHERE Age = ( |
| SELECT MAX(Age) |
| FROM Student |
| WHERE Group = 'specific_group' |
| ) |
| AND Group = 'specific_group'; |
| |
| Ġgroup | ████████████████████ 2.34 |
| Ġage | ███████████████████ 2.33 |
| ĠAge | ███████████████████ 2.33 |
| _group | ███████████████████ 2.30 |
| ĠStudent | ███████████████████ 2.30 |
| Ġspecific | ███████████████████ 2.28 |
| Ġmax | ██████████████████ 2.22 |
| ĠMax | ██████████████████ 2.22 |
| Ġstudent | ██████████████████ 2.20 |
| ĠGroup | ██████████████████ 2.19 |
| ``` |
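
Sparse representations like the one above are typically compared with a dot product over the tokens that two texts share. The following is a minimal sketch of that scoring step in plain Python, assuming each representation can be read as a token-to-weight map. The query weights are rounded from the output above with token strings simplified (leading-space marker dropped); the document names and weights are hypothetical and are not model output, and `sparse_dot` is an illustrative helper, not part of the model's API.

```python
from typing import Dict


def sparse_dot(query: Dict[str, float], doc: Dict[str, float]) -> float:
    """Dot product of two sparse vectors stored as token -> weight maps."""
    # Iterate over the smaller map so the cost scales with the sparser side.
    small, large = (query, doc) if len(query) <= len(doc) else (doc, query)
    return sum(w * large[t] for t, w in small.items() if t in large)


# Query weights rounded from the printed table; token strings simplified.
query_bow = {"group": 2.34, "age": 2.33, "Student": 2.30, "max": 2.22}

# Hypothetical document representations (illustrative weights only).
docs = {
    "oldest_student_sql": {"age": 1.5, "Student": 2.0, "max": 1.0, "group": 0.5},
    "http_retry_helper": {"retry": 2.1, "http": 1.8, "max": 0.4},
}

scores = {name: sparse_dot(query_bow, bow) for name, bow in docs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # ['oldest_student_sql', 'http_retry_helper']
```

Only overlapping tokens contribute to the score, so the unrelated document is ranked last because it shares a single low-weight token with the query.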