ide-code-retrieval-qwen3-0.6b-ebs128
A SentenceTransformer model fine-tuned from Qwen/Qwen3-Embedding-0.6B for IDE code retrieval: mapping natural-language commit queries to relevant source-code documents via dense vector similarity.
Note: This is the final checkpoint of the run, at step 8,000 of 8,000 (100.0% of the training schedule, which corresponds to roughly 0.44 epochs of the training split per the logs below). The training loss was still decreasing when the schedule ended, so further training could plausibly improve results.
Model Description
This model encodes both short natural-language queries (commit messages, search queries) and longer code documents into a shared embedding space. Retrieval is performed by computing cosine similarity between the query embedding and candidate code embeddings.
- Base model: Qwen/Qwen3-Embedding-0.6B (0.6B parameters)
- Max sequence length: 1024 tokens
- Output dimensionality: 1024 (normalized)
- Similarity function: Cosine similarity
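A quick sanity check of the specs above (a minimal sketch; it assumes the checkpoint's output normalization is active, as the "normalized" note implies):

```python
# Verify the embedding spec: 1024-dimensional, L2-normalized output vectors.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("aysinghal/ide-code-retrieval-qwen3-0.6b-ebs128")
emb = model.encode(["add retry logic to API client"])

print(emb.shape)                # (1, 1024)
print(np.linalg.norm(emb[0]))   # ~1.0, since outputs are normalized
```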
Training Details
Dataset
- Source: aysinghal/code-retrieval-training-dataset
- Total pairs: 2,465,694
- Train split: 2,342,409 pairs (95%)
- Eval split: 123,285 pairs (5%)
- Text strategy: truncate (max 4096 chars)
- Negatives: Explicit hard negatives from the dataset
- Pre-tokenized: Yes (token IDs stored on disk for zero-overhead data loading)
Loss Function
MultipleNegativesRankingLoss (InfoNCE) with explicit hard negatives. Each training example consists of an anchor (query), a positive (relevant code), and a hard negative (similar but irrelevant code). In-batch negatives provide additional contrast.
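For illustration, this is roughly how such a triplet setup looks in sentence-transformers. This is a sketch only: the column layout is assumed to be (anchor, positive, negative), and the preprocessing described above (pre-tokenization, 4096-char truncation) is not shown.

```python
# Illustrative sketch, not the exact training script: MNRL/InfoNCE over
# (anchor, positive, negative) triplets.
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, losses

model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B")

# With a third column present, MultipleNegativesRankingLoss treats it as an
# explicit hard negative; every other example in the batch additionally
# contributes in-batch negatives for each anchor.
train_dataset = load_dataset("aysinghal/code-retrieval-training-dataset", split="train")
loss = losses.MultipleNegativesRankingLoss(model)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```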
Hyperparameters
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen3-Embedding-0.6B |
| Learning rate | 2e-05 |
| LR schedule | Linear with warmup |
| Warmup ratio | 0.1 |
| Epochs | 3 |
| Effective batch size | 128 |
| Per-GPU batch size | 64 |
| Gradient accumulation | 1 |
| Max sequence length | 1024 tokens |
| Precision | BFloat16 |
| Gradient checkpointing | True |
| torch.compile | Enabled (max-autotune) |
| Seed | 42 |
| Eval strategy | Every 1600 steps |
| Early stopping patience | 3 |
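As a rough guide, the table maps onto sentence-transformers training arguments as below. This is a sketch, not the verbatim configuration; early stopping (patience 3) would additionally require `load_best_model_at_end` plus an `EarlyStoppingCallback`.

```python
# Approximate mapping of the hyperparameter table onto training arguments.
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="ide-code-retrieval-qwen3-0.6b-ebs128",
    num_train_epochs=3,
    per_device_train_batch_size=64,   # x 2 GPUs x 1 accumulation step = 128 effective
    gradient_accumulation_steps=1,
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    bf16=True,
    gradient_checkpointing=True,
    torch_compile=True,
    torch_compile_mode="max-autotune",
    seed=42,
    eval_strategy="steps",
    eval_steps=1600,
)
```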
Hardware
- GPUs: 2x NVIDIA L40S
- Total training steps: 8,000 (≈0.44 epochs of the training split per the logs below; 3 epochs were configured, but the LR schedule runs to zero at step 8,000)
Training Progress (at checkpoint step 8,000)
- Training loss: 2.8207 (step 50) → 0.5410 (step 8,000)
- Best eval loss: 0.1174 (step 8,000)
- Progress: 8,000 / 8,000 steps (100.0%)
Evaluation Results
| Step | Epoch | Eval Loss |
|---|---|---|
| 0 | 0.00 | 1.4170 |
| 1,600 | 0.09 | 0.2031 |
| 3,200 | 0.17 | 0.1401 |
| 4,800 | 0.26 | 0.1249 |
| 6,400 | 0.35 | 0.1177 |
| 8,000 | 0.44 | 0.1174 |
Full training loss history
| Step | Epoch | Loss | Learning Rate |
|---|---|---|---|
| 50 | 0.0027 | 2.8207 | 1.23e-06 |
| 100 | 0.0055 | 2.6560 | 2.48e-06 |
| 150 | 0.0082 | 2.1568 | 3.73e-06 |
| 200 | 0.0109 | 1.8428 | 4.98e-06 |
| 250 | 0.0137 | 1.6993 | 6.23e-06 |
| 300 | 0.0164 | 1.5767 | 7.48e-06 |
| 350 | 0.0191 | 1.5175 | 8.73e-06 |
| 400 | 0.0219 | 1.4618 | 9.98e-06 |
| 450 | 0.0246 | 1.3916 | 1.12e-05 |
| 500 | 0.0273 | 1.3362 | 1.25e-05 |
| 550 | 0.0301 | 1.2622 | 1.37e-05 |
| 600 | 0.0328 | 1.2656 | 1.50e-05 |
| 650 | 0.0355 | 1.1825 | 1.62e-05 |
| 700 | 0.0383 | 1.1610 | 1.75e-05 |
| 750 | 0.0410 | 1.1654 | 1.87e-05 |
| 800 | 0.0437 | 1.1153 | 2.00e-05 |
| 850 | 0.0464 | 1.0911 | 1.99e-05 |
| 900 | 0.0492 | 1.0500 | 1.97e-05 |
| 950 | 0.0519 | 1.0255 | 1.96e-05 |
| 1,000 | 0.0546 | 0.9900 | 1.94e-05 |
| 1,050 | 0.0574 | 0.9526 | 1.93e-05 |
| 1,100 | 0.0601 | 0.9093 | 1.92e-05 |
| 1,150 | 0.0628 | 0.9022 | 1.90e-05 |
| 1,200 | 0.0656 | 0.8817 | 1.89e-05 |
| 1,250 | 0.0683 | 0.8606 | 1.88e-05 |
| 1,300 | 0.0710 | 0.8436 | 1.86e-05 |
| 1,350 | 0.0738 | 0.8205 | 1.85e-05 |
| 1,400 | 0.0765 | 0.8097 | 1.83e-05 |
| 1,450 | 0.0792 | 0.8396 | 1.82e-05 |
| 1,500 | 0.0820 | 0.8060 | 1.81e-05 |
| 1,550 | 0.0847 | 0.7803 | 1.79e-05 |
| 1,600 | 0.0874 | 0.8032 | 1.78e-05 |
| 1,650 | 0.0902 | 0.7803 | 1.76e-05 |
| 1,700 | 0.0929 | 0.7606 | 1.75e-05 |
| 1,750 | 0.0956 | 0.7257 | 1.74e-05 |
| 1,800 | 0.0984 | 0.7296 | 1.72e-05 |
| 1,850 | 0.1011 | 0.7458 | 1.71e-05 |
| 1,900 | 0.1038 | 0.7264 | 1.69e-05 |
| 1,950 | 0.1066 | 0.7204 | 1.68e-05 |
| 2,000 | 0.1093 | 0.7349 | 1.67e-05 |
| 2,050 | 0.1120 | 0.7569 | 1.65e-05 |
| 2,100 | 0.1148 | 0.7186 | 1.64e-05 |
| 2,150 | 0.1175 | 0.6933 | 1.63e-05 |
| 2,200 | 0.1202 | 0.7022 | 1.61e-05 |
| 2,250 | 0.1230 | 0.6836 | 1.60e-05 |
| 2,300 | 0.1257 | 0.7262 | 1.58e-05 |
| 2,350 | 0.1284 | 0.6783 | 1.57e-05 |
| 2,400 | 0.1311 | 0.6687 | 1.56e-05 |
| 2,450 | 0.1339 | 0.6666 | 1.54e-05 |
| 2,500 | 0.1366 | 0.6612 | 1.53e-05 |
| 2,550 | 0.1393 | 0.6177 | 1.51e-05 |
| 2,600 | 0.1421 | 0.6545 | 1.50e-05 |
| 2,650 | 0.1448 | 0.6496 | 1.49e-05 |
| 2,700 | 0.1475 | 0.6269 | 1.47e-05 |
| 2,750 | 0.1503 | 0.6503 | 1.46e-05 |
| 2,800 | 0.1530 | 0.6356 | 1.44e-05 |
| 2,850 | 0.1557 | 0.6310 | 1.43e-05 |
| 2,900 | 0.1585 | 0.6256 | 1.42e-05 |
| 2,950 | 0.1612 | 0.6189 | 1.40e-05 |
| 3,000 | 0.1639 | 0.6223 | 1.39e-05 |
| 3,050 | 0.1667 | 0.6119 | 1.38e-05 |
| 3,100 | 0.1694 | 0.6270 | 1.36e-05 |
| 3,150 | 0.1721 | 0.6206 | 1.35e-05 |
| 3,200 | 0.1749 | 0.6451 | 1.33e-05 |
| 3,250 | 0.1776 | 0.6031 | 1.32e-05 |
| 3,300 | 0.1803 | 0.5786 | 1.31e-05 |
| 3,350 | 0.1831 | 0.6212 | 1.29e-05 |
| 3,400 | 0.1858 | 0.6024 | 1.28e-05 |
| 3,450 | 0.1885 | 0.5820 | 1.26e-05 |
| 3,500 | 0.1913 | 0.5876 | 1.25e-05 |
| 3,550 | 0.1940 | 0.6084 | 1.24e-05 |
| 3,600 | 0.1967 | 0.5995 | 1.22e-05 |
| 3,650 | 0.1995 | 0.6103 | 1.21e-05 |
| 3,700 | 0.2022 | 0.5923 | 1.19e-05 |
| 3,750 | 0.2049 | 0.6009 | 1.18e-05 |
| 3,800 | 0.2077 | 0.5871 | 1.17e-05 |
| 3,850 | 0.2104 | 0.5807 | 1.15e-05 |
| 3,900 | 0.2131 | 0.5861 | 1.14e-05 |
| 3,950 | 0.2158 | 0.5724 | 1.13e-05 |
| 4,000 | 0.2186 | 0.5986 | 1.11e-05 |
| 4,050 | 0.2213 | 0.5930 | 1.10e-05 |
| 4,100 | 0.2240 | 0.5759 | 1.08e-05 |
| 4,150 | 0.2268 | 0.5803 | 1.07e-05 |
| 4,200 | 0.2295 | 0.5754 | 1.06e-05 |
| 4,250 | 0.2322 | 0.5641 | 1.04e-05 |
| 4,300 | 0.2350 | 0.5678 | 1.03e-05 |
| 4,350 | 0.2377 | 0.5765 | 1.01e-05 |
| 4,400 | 0.2404 | 0.5586 | 1.00e-05 |
| 4,450 | 0.2432 | 0.5506 | 9.86e-06 |
| 4,500 | 0.2459 | 0.5652 | 9.73e-06 |
| 4,550 | 0.2486 | 0.5462 | 9.59e-06 |
| 4,600 | 0.2514 | 0.5547 | 9.45e-06 |
| 4,650 | 0.2541 | 0.5691 | 9.31e-06 |
| 4,700 | 0.2568 | 0.5688 | 9.17e-06 |
| 4,750 | 0.2596 | 0.5460 | 9.03e-06 |
| 4,800 | 0.2623 | 0.5721 | 8.89e-06 |
| 4,850 | 0.2650 | 0.5646 | 8.75e-06 |
| 4,900 | 0.2678 | 0.5710 | 8.61e-06 |
| 4,950 | 0.2705 | 0.5316 | 8.48e-06 |
| 5,000 | 0.2732 | 0.5359 | 8.34e-06 |
| 5,050 | 0.2760 | 0.5669 | 8.20e-06 |
| 5,100 | 0.2787 | 0.5689 | 8.06e-06 |
| 5,150 | 0.2814 | 0.5392 | 7.92e-06 |
| 5,200 | 0.2842 | 0.5573 | 7.78e-06 |
| 5,250 | 0.2869 | 0.5387 | 7.64e-06 |
| 5,300 | 0.2896 | 0.5497 | 7.50e-06 |
| 5,350 | 0.2923 | 0.5370 | 7.36e-06 |
| 5,400 | 0.2951 | 0.5524 | 7.23e-06 |
| 5,450 | 0.2978 | 0.5466 | 7.09e-06 |
| 5,500 | 0.3005 | 0.5311 | 6.95e-06 |
| 5,550 | 0.3033 | 0.5651 | 6.81e-06 |
| 5,600 | 0.3060 | 0.5147 | 6.67e-06 |
| 5,650 | 0.3087 | 0.5392 | 6.53e-06 |
| 5,700 | 0.3115 | 0.5557 | 6.39e-06 |
| 5,750 | 0.3142 | 0.5419 | 6.25e-06 |
| 5,800 | 0.3169 | 0.5251 | 6.11e-06 |
| 5,850 | 0.3197 | 0.5351 | 5.98e-06 |
| 5,900 | 0.3224 | 0.5373 | 5.84e-06 |
| 5,950 | 0.3251 | 0.5342 | 5.70e-06 |
| 6,000 | 0.3279 | 0.5343 | 5.56e-06 |
| 6,050 | 0.3306 | 0.5382 | 5.42e-06 |
| 6,100 | 0.3333 | 0.5459 | 5.28e-06 |
| 6,150 | 0.3361 | 0.5431 | 5.14e-06 |
| 6,200 | 0.3388 | 0.5261 | 5.00e-06 |
| 6,250 | 0.3415 | 0.5377 | 4.86e-06 |
| 6,300 | 0.3443 | 0.5197 | 4.73e-06 |
| 6,350 | 0.3470 | 0.5165 | 4.59e-06 |
| 6,400 | 0.3497 | 0.5261 | 4.45e-06 |
| 6,450 | 0.3525 | 0.5462 | 4.31e-06 |
| 6,500 | 0.3552 | 0.5273 | 4.17e-06 |
| 6,550 | 0.3579 | 0.5287 | 4.03e-06 |
| 6,600 | 0.3607 | 0.5253 | 3.89e-06 |
| 6,650 | 0.3634 | 0.5343 | 3.75e-06 |
| 6,700 | 0.3661 | 0.5365 | 3.61e-06 |
| 6,750 | 0.3689 | 0.5192 | 3.48e-06 |
| 6,800 | 0.3716 | 0.5311 | 3.34e-06 |
| 6,850 | 0.3743 | 0.5350 | 3.20e-06 |
| 6,900 | 0.3770 | 0.5319 | 3.06e-06 |
| 6,950 | 0.3798 | 0.5321 | 2.92e-06 |
| 7,000 | 0.3825 | 0.5234 | 2.78e-06 |
| 7,050 | 0.3852 | 0.5244 | 2.64e-06 |
| 7,100 | 0.3880 | 0.5315 | 2.50e-06 |
| 7,150 | 0.3907 | 0.5547 | 2.36e-06 |
| 7,200 | 0.3934 | 0.5182 | 2.23e-06 |
| 7,250 | 0.3962 | 0.5301 | 2.09e-06 |
| 7,300 | 0.3989 | 0.5425 | 1.95e-06 |
| 7,350 | 0.4016 | 0.5328 | 1.81e-06 |
| 7,400 | 0.4044 | 0.5103 | 1.67e-06 |
| 7,450 | 0.4071 | 0.5285 | 1.53e-06 |
| 7,500 | 0.4098 | 0.5213 | 1.39e-06 |
| 7,550 | 0.4126 | 0.5168 | 1.25e-06 |
| 7,600 | 0.4153 | 0.5270 | 1.11e-06 |
| 7,650 | 0.4180 | 0.5396 | 9.75e-07 |
| 7,700 | 0.4208 | 0.5267 | 8.36e-07 |
| 7,750 | 0.4235 | 0.5367 | 6.97e-07 |
| 7,800 | 0.4262 | 0.5315 | 5.58e-07 |
| 7,850 | 0.4290 | 0.5210 | 4.19e-07 |
| 7,900 | 0.4317 | 0.5263 | 2.81e-07 |
| 7,950 | 0.4344 | 0.5501 | 1.42e-07 |
| 8,000 | 0.4372 | 0.5410 | 2.78e-09 |
Usage
Loading the Model
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("aysinghal/ide-code-retrieval-qwen3-0.6b-ebs128")
```
Computing Embeddings
```python
from sentence_transformers.util import cos_sim

queries = [
    "fix null pointer exception in user authentication",
    "add retry logic to API client",
]
code_docs = [
    "def authenticate(user):\n    if user is None:\n        raise ValueError...",
    "class APIClient:\n    def request(self, url, retries=3):\n        ...",
]

query_embeddings = model.encode(queries)
code_embeddings = model.encode(code_docs)

# Compute cosine similarities between each query and each code document
similarities = cos_sim(query_embeddings, code_embeddings)
print(similarities)
```
Intended Use
- Primary use case: Retrieving relevant code files/functions given a natural-language query (commit message, bug description, feature request)
- Search pipeline: Encode a corpus of code documents offline, then at query time encode the query and find nearest neighbors via cosine similarity (see the sketch below)
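A minimal end-to-end sketch of that pipeline, using utilities that ship with sentence-transformers; the corpus contents here are illustrative, and a real deployment would keep the corpus matrix in a vector store.

```python
# Offline indexing + online querying, sketched with in-memory tensors.
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import semantic_search

model = SentenceTransformer("aysinghal/ide-code-retrieval-qwen3-0.6b-ebs128")

# Offline: embed the code corpus once.
corpus = [
    "def authenticate(user):\n    if user is None:\n        raise ValueError(...)",
    "class APIClient:\n    def request(self, url, retries=3):\n        ...",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

# Online: embed the query and rank corpus entries by cosine similarity.
query_embedding = model.encode(["add retry logic to API client"], convert_to_tensor=True)
hits = semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(hit["corpus_id"], round(hit["score"], 3))
```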
Limitations
- This is the final checkpoint of an 8,000-step schedule that covers well under one epoch of the data; the loss curve was still decreasing, so additional training would likely improve quality.
- Trained on a specific code retrieval dataset; may not generalize to all programming languages or query styles without further fine-tuning.
- Max context is 1024 tokens; very long files are silently truncated (see the check below).
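A quick way to spot inputs that will be truncated, using the tokenizer and length attributes a SentenceTransformer exposes (a sketch; the sample document is illustrative):

```python
# Flag documents longer than the model's 1024-token window; encode() would
# silently drop everything past model.max_seq_length.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("aysinghal/ide-code-retrieval-qwen3-0.6b-ebs128")

doc = "def handler(event):\n    ...\n" * 500   # illustrative oversized file
n_tokens = len(model.tokenizer(doc)["input_ids"])
if n_tokens > model.max_seq_length:
    print(f"{n_tokens} tokens; the tail beyond {model.max_seq_length} will be dropped")
```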
Citation
If you use this model, please cite the base model:
```bibtex
@article{qwen3embedding,
  title={Qwen3-Embedding},
  author={Qwen Team},
  year={2025}
}
```