ide-code-retrieval-qwen3-0.6b-inbatch

A SentenceTransformer model fine-tuned from Qwen/Qwen3-Embedding-0.6B for IDE code retrieval. It maps natural-language queries, such as commit messages, to relevant source code documents via dense vector similarity.

Note: This is an intermediate checkpoint at step 9,000 of 9,150 scheduled training steps (98.4%). Training loss is still decreasing, so a later checkpoint may perform better.

Model Description

This model encodes both short natural-language queries (commit messages, search queries) and longer code documents into a shared embedding space. Retrieval is performed by computing cosine similarity between the query embedding and candidate code embeddings.

  • Base model: Qwen/Qwen3-Embedding-0.6B (0.6B parameters)
  • Max sequence length: 1024 tokens
  • Output dimensionality: 1024 (normalized)
  • Similarity function: Cosine similarity
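
Because the output vectors are normalized to unit length, cosine similarity reduces to a plain dot product. The sketch below illustrates this with toy 3-dimensional vectors standing in for the 1024-dimensional embeddings; the helper names are illustrative and not part of the model's API.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

query = l2_normalize([0.2, 0.9, 0.4])  # toy stand-in for a query embedding
code = l2_normalize([0.1, 0.8, 0.6])   # toy stand-in for a code embedding

# For unit-length vectors the two scores agree up to floating-point error.
print(abs(cosine(query, code) - dot(query, code)) < 1e-12)
```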

Training Details

Dataset

  • Source: aysinghal/code-retrieval-training-dataset
  • Total pairs: 5,072,176
  • Train split: 4,818,567 pairs (95%)
  • Eval split: 253,609 pairs (5%)
  • Text strategy: truncate (max 4096 chars)
  • Negatives: Explicit hard negatives from the dataset
  • Pre-tokenized: Yes (token IDs stored on disk for zero-overhead data loading)
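
The split sizes and the character-level truncation listed above can be reproduced with a few lines. This is a sketch only: the splitting and truncation code actually used for training is not shown in this card, and `truncate_text` is a hypothetical helper.

```python
TOTAL_PAIRS = 5_072_176
MAX_CHARS = 4096  # "truncate" text strategy from the list above

train_pairs = int(TOTAL_PAIRS * 0.95)   # 95% train split
eval_pairs = TOTAL_PAIRS - train_pairs  # remaining 5% eval split

def truncate_text(text, max_chars=MAX_CHARS):
    """Hypothetical helper: keep at most max_chars leading characters."""
    return text[:max_chars]

print(train_pairs, eval_pairs)
print(len(truncate_text("x" * 10_000)))
```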

Loss Function

MultipleNegativesRankingLoss (InfoNCE) with explicit hard negatives. Each training example consists of an anchor (query), a positive (relevant code), and a hard negative (similar but irrelevant code). In-batch negatives provide additional contrast.
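
As a rough illustration of this objective, the sketch below computes the InfoNCE loss for a single anchor against its positive, its hard negative, and the other in-batch documents. It is a pure-Python toy, not the training code. The scale of 20 is the default of sentence-transformers' MultipleNegativesRankingLoss and is assumed here, since this card does not state the value used.

```python
import math

def info_nce(anchor_sims, positive_index=0, scale=20.0):
    """Cross-entropy over scaled similarities, with the positive as the target.

    anchor_sims: cosine similarities between one anchor and every candidate
    (its positive, its hard negative, and the other in-batch documents).
    scale=20.0 is an assumed value, not confirmed by this card.
    """
    logits = [scale * s for s in anchor_sims]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum_exp - logits[positive_index]

# Toy similarities: [positive, hard negative, in-batch negative, in-batch negative]
easy = info_nce([0.80, 0.30, 0.10, 0.05])
hard = info_nce([0.80, 0.75, 0.10, 0.05])
print(easy < hard)  # compares the loss with a distant vs. a close hard negative
```

A hard negative whose similarity approaches that of the positive contributes a larger term to the denominator, which is why explicit hard negatives give a stronger training signal than random in-batch documents alone.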

Hyperparameters

  • Base model: Qwen/Qwen3-Embedding-0.6B
  • Learning rate: 2e-05
  • LR schedule: Linear with warmup
  • Warmup ratio: 0.1
  • Epochs: 3
  • Effective batch size: 256
  • Per-GPU batch size: 32
  • Gradient accumulation: 1
  • Max sequence length: 1024 tokens
  • Precision: BFloat16
  • Gradient checkpointing: True
  • torch.compile: Enabled (max-autotune)
  • Seed: 42
  • Eval strategy: Every 1882 steps
  • Early stopping patience: 3
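
The learning-rate schedule can be sketched from the values above: a linear ramp to 2e-05 over the first 10% of the 9,150 total training steps reported for this run, followed by a linear decay to zero. This is an illustrative reimplementation, not the trainer's code; `lr_at` is a hypothetical helper, and the rounding of the warmup length is an assumption.

```python
PEAK_LR = 2e-5
TOTAL_STEPS = 9_150
WARMUP_STEPS = round(0.1 * TOTAL_STEPS)  # 915; the rounding rule is an assumption

def lr_at(updates_done):
    """Learning rate after `updates_done` optimizer updates."""
    if updates_done < WARMUP_STEPS:
        return PEAK_LR * updates_done / WARMUP_STEPS
    remaining = TOTAL_STEPS - updates_done
    return PEAK_LR * max(0.0, remaining / (TOTAL_STEPS - WARMUP_STEPS))

for logged_step in (50, 900, 9_000):
    # The learning rates in the training loss history match the schedule
    # evaluated one update earlier than the logged step.
    print(logged_step, f"{lr_at(logged_step - 1):.2e}")
```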

Hardware

  • GPUs: 8x NVIDIA L40S
  • Total training steps: 9,150 (the run is configured for 3 epochs; the logged epoch counter reads 0.4782 at step 9,000)
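
The batch-size and step figures can be cross-checked with simple arithmetic. The sketch assumes that one optimizer step consumes one effective batch of training pairs; the variable names are illustrative.

```python
NUM_GPUS = 8
PER_GPU_BATCH = 32
GRAD_ACCUM = 1
TRAIN_PAIRS = 4_818_567  # train split size from the Dataset section

effective_batch = NUM_GPUS * PER_GPU_BATCH * GRAD_ACCUM
pairs_seen_at_checkpoint = 9_000 * effective_batch
fraction_of_train_split = pairs_seen_at_checkpoint / TRAIN_PAIRS

print(effective_batch)                    # 256
print(round(fraction_of_train_split, 4))  # 0.4782, the epoch value logged at step 9,000
```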

Training Progress (at checkpoint step 9,000)

  • Training loss: 1.1880 (step 50) → 0.1628 (step 9,000)
  • Best eval loss: 0.0443 (step 7,528)
  • Progress: 9,000 / 9,150 steps (98.4%)

Evaluation Results

Step Epoch Eval Loss
0 0.00 0.5361
1,882 0.10 0.0781
3,764 0.20 0.0587
5,646 0.30 0.0501
7,528 0.40 0.0443
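
The evaluation losses above can be used to illustrate how patience-based early stopping and best-checkpoint selection behave. This is a minimal sketch under the assumption that patience counts consecutive evaluations without improvement; it is not the trainer's implementation.

```python
eval_history = [  # (step, eval loss) copied from the table above
    (0, 0.5361), (1_882, 0.0781), (3_764, 0.0587), (5_646, 0.0501), (7_528, 0.0443),
]
PATIENCE = 3

best_step, best_loss = None, float("inf")
evals_without_improvement = 0
stopped_early = False
for step, loss in eval_history:
    if loss < best_loss:
        # New best: remember it and reset the patience counter.
        best_step, best_loss = step, loss
        evals_without_improvement = 0
    else:
        evals_without_improvement += 1
        if evals_without_improvement >= PATIENCE:
            stopped_early = True
            break

print(best_step, best_loss, stopped_early)
```

Every evaluation so far improved on the previous one, so under this assumption the patience counter never leaves zero.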

Full training loss history

Step Epoch Loss Learning Rate
50 0.0027 1.1880 1.07e-06
100 0.0053 0.7126 2.16e-06
150 0.0080 0.5812 3.26e-06
200 0.0106 0.5347 4.35e-06
250 0.0133 0.5104 5.44e-06
300 0.0159 0.4721 6.54e-06
350 0.0186 0.4572 7.63e-06
400 0.0213 0.4318 8.72e-06
450 0.0239 0.4221 9.81e-06
500 0.0266 0.4056 1.09e-05
550 0.0292 0.4013 1.20e-05
600 0.0319 0.3933 1.31e-05
650 0.0345 0.3729 1.42e-05
700 0.0372 0.4061 1.53e-05
750 0.0398 0.3672 1.64e-05
800 0.0425 0.3818 1.75e-05
850 0.0452 0.3498 1.86e-05
900 0.0478 0.3665 1.97e-05
950 0.0505 0.3628 1.99e-05
1,000 0.0531 0.3575 1.98e-05
1,050 0.0558 0.3550 1.97e-05
1,100 0.0584 0.3419 1.96e-05
1,150 0.0611 0.3471 1.94e-05
1,200 0.0638 0.3403 1.93e-05
1,250 0.0664 0.3309 1.92e-05
1,300 0.0691 0.3252 1.91e-05
1,350 0.0717 0.3187 1.89e-05
1,400 0.0744 0.3237 1.88e-05
1,450 0.0770 0.3239 1.87e-05
1,500 0.0797 0.3205 1.86e-05
1,550 0.0824 0.3141 1.85e-05
1,600 0.0850 0.3140 1.83e-05
1,650 0.0877 0.2961 1.82e-05
1,700 0.0903 0.3050 1.81e-05
1,750 0.0930 0.2959 1.80e-05
1,800 0.0956 0.3067 1.79e-05
1,850 0.0983 0.3060 1.77e-05
1,900 0.1009 0.2951 1.76e-05
1,950 0.1036 0.2862 1.75e-05
2,000 0.1063 0.2986 1.74e-05
2,050 0.1089 0.2807 1.72e-05
2,100 0.1116 0.2778 1.71e-05
2,150 0.1142 0.2857 1.70e-05
2,200 0.1169 0.2862 1.69e-05
2,250 0.1195 0.2651 1.68e-05
2,300 0.1222 0.2802 1.66e-05
2,350 0.1249 0.2834 1.65e-05
2,400 0.1275 0.2733 1.64e-05
2,450 0.1302 0.2607 1.63e-05
2,500 0.1328 0.2642 1.62e-05
2,550 0.1355 0.2713 1.60e-05
2,600 0.1381 0.2653 1.59e-05
2,650 0.1408 0.2583 1.58e-05
2,700 0.1434 0.2447 1.57e-05
2,750 0.1461 0.2550 1.55e-05
2,800 0.1488 0.2506 1.54e-05
2,850 0.1514 0.2539 1.53e-05
2,900 0.1541 0.2589 1.52e-05
2,950 0.1567 0.2504 1.51e-05
3,000 0.1594 0.2384 1.49e-05
3,050 0.1620 0.2525 1.48e-05
3,100 0.1647 0.2427 1.47e-05
3,150 0.1674 0.2365 1.46e-05
3,200 0.1700 0.2447 1.45e-05
3,250 0.1727 0.2423 1.43e-05
3,300 0.1753 0.2410 1.42e-05
3,350 0.1780 0.2443 1.41e-05
3,400 0.1806 0.2329 1.40e-05
3,450 0.1833 0.2380 1.38e-05
3,500 0.1860 0.2331 1.37e-05
3,550 0.1886 0.2446 1.36e-05
3,600 0.1913 0.2290 1.35e-05
3,650 0.1939 0.2303 1.34e-05
3,700 0.1966 0.2216 1.32e-05
3,750 0.1992 0.2321 1.31e-05
3,800 0.2019 0.2365 1.30e-05
3,850 0.2045 0.2365 1.29e-05
3,900 0.2072 0.2277 1.28e-05
3,950 0.2099 0.2262 1.26e-05
4,000 0.2125 0.2304 1.25e-05
4,050 0.2152 0.2228 1.24e-05
4,100 0.2178 0.2232 1.23e-05
4,150 0.2205 0.2239 1.21e-05
4,200 0.2231 0.2246 1.20e-05
4,250 0.2258 0.2103 1.19e-05
4,300 0.2285 0.2240 1.18e-05
4,350 0.2311 0.2183 1.17e-05
4,400 0.2338 0.2195 1.15e-05
4,450 0.2364 0.2205 1.14e-05
4,500 0.2391 0.2196 1.13e-05
4,550 0.2417 0.2165 1.12e-05
4,600 0.2444 0.2107 1.11e-05
4,650 0.2471 0.2142 1.09e-05
4,700 0.2497 0.2187 1.08e-05
4,750 0.2524 0.2121 1.07e-05
4,800 0.2550 0.2088 1.06e-05
4,850 0.2577 0.2139 1.04e-05
4,900 0.2603 0.2142 1.03e-05
4,950 0.2630 0.2149 1.02e-05
5,000 0.2656 0.2026 1.01e-05
5,050 0.2683 0.2097 9.96e-06
5,100 0.2710 0.2111 9.84e-06
5,150 0.2736 0.2116 9.72e-06
5,200 0.2763 0.2063 9.60e-06
5,250 0.2789 0.2071 9.47e-06
5,300 0.2816 0.2001 9.35e-06
5,350 0.2842 0.2002 9.23e-06
5,400 0.2869 0.2020 9.11e-06
5,450 0.2896 0.2043 8.99e-06
5,500 0.2922 0.1896 8.87e-06
5,550 0.2949 0.2020 8.75e-06
5,600 0.2975 0.1940 8.62e-06
5,650 0.3002 0.2000 8.50e-06
5,700 0.3028 0.1937 8.38e-06
5,750 0.3055 0.2105 8.26e-06
5,800 0.3082 0.1944 8.14e-06
5,850 0.3108 0.1885 8.02e-06
5,900 0.3135 0.2005 7.90e-06
5,950 0.3161 0.1883 7.77e-06
6,000 0.3188 0.1970 7.65e-06
6,050 0.3214 0.1857 7.53e-06
6,100 0.3241 0.1858 7.41e-06
6,150 0.3267 0.1809 7.29e-06
6,200 0.3294 0.1891 7.17e-06
6,250 0.3321 0.1834 7.05e-06
6,300 0.3347 0.1865 6.92e-06
6,350 0.3374 0.1938 6.80e-06
6,400 0.3400 0.1995 6.68e-06
6,450 0.3427 0.1938 6.56e-06
6,500 0.3453 0.1980 6.44e-06
6,550 0.3480 0.1833 6.32e-06
6,600 0.3507 0.1889 6.20e-06
6,650 0.3533 0.1881 6.07e-06
6,700 0.3560 0.1923 5.95e-06
6,750 0.3586 0.1935 5.83e-06
6,800 0.3613 0.1874 5.71e-06
6,850 0.3639 0.1807 5.59e-06
6,900 0.3666 0.1848 5.47e-06
6,950 0.3692 0.1841 5.35e-06
7,000 0.3719 0.1656 5.22e-06
7,050 0.3746 0.1861 5.10e-06
7,100 0.3772 0.1915 4.98e-06
7,150 0.3799 0.1858 4.86e-06
7,200 0.3825 0.1856 4.74e-06
7,250 0.3852 0.1753 4.62e-06
7,300 0.3878 0.1768 4.50e-06
7,350 0.3905 0.1765 4.37e-06
7,400 0.3932 0.1839 4.25e-06
7,450 0.3958 0.1813 4.13e-06
7,500 0.3985 0.1881 4.01e-06
7,550 0.4011 0.1776 3.89e-06
7,600 0.4038 0.1764 3.77e-06
7,650 0.4064 0.1765 3.65e-06
7,700 0.4091 0.1856 3.52e-06
7,750 0.4118 0.1761 3.40e-06
7,800 0.4144 0.1742 3.28e-06
7,850 0.4171 0.1735 3.16e-06
7,900 0.4197 0.1796 3.04e-06
7,950 0.4224 0.1808 2.92e-06
8,000 0.4250 0.1743 2.80e-06
8,050 0.4277 0.1709 2.67e-06
8,100 0.4303 0.1642 2.55e-06
8,150 0.4330 0.1690 2.43e-06
8,200 0.4357 0.1793 2.31e-06
8,250 0.4383 0.1732 2.19e-06
8,300 0.4410 0.1689 2.07e-06
8,350 0.4436 0.1675 1.95e-06
8,400 0.4463 0.1619 1.82e-06
8,450 0.4489 0.1713 1.70e-06
8,500 0.4516 0.1735 1.58e-06
8,550 0.4543 0.1682 1.46e-06
8,600 0.4569 0.1717 1.34e-06
8,650 0.4596 0.1601 1.22e-06
8,700 0.4622 0.1722 1.10e-06
8,750 0.4649 0.1674 9.74e-07
8,800 0.4675 0.1704 8.52e-07
8,850 0.4702 0.1684 7.31e-07
8,900 0.4729 0.1645 6.10e-07
8,950 0.4755 0.1717 4.88e-07
9,000 0.4782 0.1628 3.67e-07

Usage

Loading the Model

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("aysinghal/ide-code-retrieval-qwen3-0.6b-inbatch")

Computing Embeddings

queries = [
    "fix null pointer exception in user authentication",
    "add retry logic to API client",
]
code_docs = [
    "def authenticate(user):\n    if user is None:\n        raise ValueError...",
    "class APIClient:\n    def request(self, url, retries=3):\n        ...",
]

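# Encode queries and code documents into the shared embedding space
query_embeddings = model.encode(queries)
code_embeddings = model.encode(code_docs)

# Compute cosine similarities: rows correspond to queries, columns to code documents
from sentence_transformers.util import cos_sim
similarities = cos_sim(query_embeddings, code_embeddings)
print(similarities)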

Intended Use

  • Primary use case: Retrieving relevant code files/functions given a natural-language query (commit message, bug description, feature request)
  • Search pipeline: Encode a corpus of code documents offline, then at query time encode the query and find nearest neighbors via cosine similarity
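
The search pipeline can be sketched as follows. To keep the example runnable without downloading the model, toy unit-length vectors stand in for real embeddings; in practice they would come from `model.encode(...)`. The `top_k` helper is hypothetical, and a brute-force scan is assumed rather than an approximate nearest-neighbor index.

```python
import heapq

def top_k(query_vec, corpus_vecs, k=2):
    """Return (score, doc_index) pairs for the k highest dot-product scores.

    Assumes all vectors are unit length, so the dot product equals cosine similarity.
    """
    scores = (
        (sum(q * d for q, d in zip(query_vec, doc_vec)), idx)
        for idx, doc_vec in enumerate(corpus_vecs)
    )
    return heapq.nlargest(k, scores)

# Offline: embed the corpus once and store the vectors (toy 2-D examples here).
corpus_vecs = [(1.0, 0.0), (0.6, 0.8), (0.0, 1.0)]

# Query time: embed the query, then rank the stored vectors.
query_vec = (0.8, 0.6)
results = top_k(query_vec, corpus_vecs, k=2)
print([idx for _, idx in results])
```

A linear scan like this is adequate for small corpora; for larger ones, a dedicated vector index would likely be a better fit.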

Limitations

  • This is an intermediate checkpoint taken at step 9,000 of 9,150 (98.4% of scheduled training steps). The loss curve is still decreasing, so a later checkpoint may perform better.
  • Trained on a specific code retrieval dataset; may not generalize to all programming languages or query styles without further fine-tuning.
  • Max context is 1024 tokens, so longer files are truncated.

Citation

If you use this model, please cite the base model:

@article{qwen3embedding,
  title={Qwen3-Embedding},
  author={Qwen Team},
  year={2025}
}