ide-code-retrieval-qwen3-0.6b-inbatch
A SentenceTransformer model fine-tuned from Qwen/Qwen3-Embedding-0.6B for IDE code retrieval: it maps natural-language queries (such as commit messages) to relevant source code documents via dense vector similarity.
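Note: This is an intermediate checkpoint at step 9,000 of 9,150 scheduled steps (98.4%). The run is configured for 3 epochs, but the logged epoch counter at step 9,000 reads 0.4782, so the checkpoint has seen a little under half of one pass over the training split. Training loss is still decreasing, so a later checkpoint may perform better.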
Model Description
This model encodes both short natural-language queries (commit messages, search queries) and longer code documents into a shared embedding space. Retrieval is performed by computing cosine similarity between the query embedding and candidate code embeddings.
- Base model: Qwen/Qwen3-Embedding-0.6B (0.6B parameters)
- Max sequence length: 1024 tokens
- Output dimensionality: 1024 (normalized)
- Similarity function: Cosine similarity
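As a reference for the similarity function, cosine similarity can be written out in a few lines. This is a minimal sketch using plain Python lists and made-up 3-dimensional vectors (the model's real outputs have 1024 dimensions); the helper names `normalize` and `cosine` are illustrative and not part of any library. For unit-length vectors the score reduces to a plain dot product.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional stand-ins for a query embedding and a code embedding.
query_vec = [3.0, 4.0, 0.0]
code_vec = [4.0, 3.0, 0.0]

score = cosine(query_vec, code_vec)

# After normalization, the dot product alone gives the same score.
dot_of_unit = sum(x * y for x, y in zip(normalize(query_vec), normalize(code_vec)))
```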
Training Details
Dataset
- Source: aysinghal/code-retrieval-training-dataset
- Total pairs: 5,072,176
- Train split: 4,818,567 pairs (95%)
- Eval split: 253,609 pairs (5%)
- Text strategy: truncate (max 4096 chars)
- Negatives: Explicit hard negatives from the dataset
- Pre-tokenized: Yes (token IDs stored on disk for zero-overhead data loading)
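The character-level truncation step might look like the following. This is only a sketch: the card does not say whether the head or the tail of a long text is kept, so keeping the first 4096 characters is an assumption, and `truncate_text` and `prepare_triplet` are hypothetical helper names.

```python
MAX_CHARS = 4096  # character cap listed under "Text strategy"

def truncate_text(text, max_chars=MAX_CHARS):
    """Keep at most max_chars characters (assumption: the head is kept)."""
    return text[:max_chars]

def prepare_triplet(query, positive, hard_negative):
    """Apply the character cap to each field of one training example."""
    return tuple(truncate_text(t) for t in (query, positive, hard_negative))

# A 5000-character document is cut to 4096; shorter fields pass through unchanged.
example = prepare_triplet("fix retry bug", "x" * 5000, "y" * 100)
```

The 1024-token limit then applies on top of this cap when the texts are tokenized.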
Loss Function
MultipleNegativesRankingLoss (InfoNCE) with explicit hard negatives. Each training example consists of an anchor (query), a positive (relevant code), and a hard negative (similar but irrelevant code). In-batch negatives provide additional contrast.
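The objective described above can be sketched in plain Python. For each anchor, the candidates are every positive in the batch (the other anchors' positives act as in-batch negatives) followed by every hard negative, and the loss is cross-entropy with the anchor's own positive as the target. The function name `info_nce_loss` and the toy 2-dimensional vectors are illustrative; the `scale` of 20 mirrors the sentence-transformers default for this loss and is an assumption about this run.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def info_nce_loss(anchors, positives, hard_negatives, scale=20.0):
    """Mean cross-entropy over anchors; anchor i's target is positives[i]."""
    # Candidate pool shared by all anchors: B positives, then B hard negatives.
    candidates = positives + hard_negatives
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]
    return total / len(anchors)

# Toy batch of two (anchor, positive, hard negative) triplets.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
hard_negatives = [[0.5, 0.5], [0.4, 0.6]]

good_loss = info_nce_loss(anchors, positives, hard_negatives)
# Swapping the positives mislabels every pair, so the loss should rise.
bad_loss = info_nce_loss(anchors, list(reversed(positives)), hard_negatives)
```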
Hyperparameters
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen3-Embedding-0.6B |
| Learning rate | 2e-05 |
| LR schedule | Linear with warmup |
| Warmup ratio | 0.1 |
| Epochs | 3 |
| Effective batch size | 256 |
| Per-GPU batch size | 32 |
| Gradient accumulation | 1 |
| Max sequence length | 1024 tokens |
| Precision | BFloat16 |
| Gradient checkpointing | True |
| torch.compile | Enabled (max-autotune) |
| Seed | 42 |
| Eval strategy | Every 1882 steps |
| Early stopping patience | 3 |
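The learning-rate schedule implied by these values can be sketched as below, assuming 9,150 total scheduled steps and therefore 915 warmup steps. The function name `linear_warmup_lr` is illustrative. The logged learning rates in the training history appear to match the schedule evaluated one step earlier (step N-1 for the value logged at step N); that offset is inferred from the numbers and is not stated in the card.

```python
PEAK_LR = 2e-5                    # "Learning rate" in the table
TOTAL_STEPS = 9150                # total scheduled steps reported for this run
WARMUP_STEPS = TOTAL_STEPS // 10  # warmup ratio 0.1 -> 915 steps

def linear_warmup_lr(step, peak=PEAK_LR, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Linear ramp from 0 to peak, then linear decay from peak to 0."""
    if step < warmup:
        return peak * step / warmup
    return peak * max(0.0, (total - step) / (total - warmup))

# Values to compare against the rows logged at steps 50 and 9,000.
lr_at_log_50 = linear_warmup_lr(49)
lr_at_log_9000 = linear_warmup_lr(8999)
```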
Hardware
- GPUs: 8x NVIDIA L40S
- Total training steps: 9,150 (configured for 3 epochs; the logged epoch counter reads 0.4782 at step 9,000)
Training Progress (at checkpoint step 9,000)
- Training loss: 1.1880 (step 50) → 0.1628 (step 9000)
- Best eval loss: 0.0443 (step 7,528)
- Progress: 9,000 / 9,150 steps (98.4%)
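The step and epoch figures can be cross-checked with a little arithmetic. This sketch assumes every optimizer step consumes one full effective batch from the train split; the function name `epoch_at` is illustrative. Under that assumption it reproduces the Epoch column of the logs.

```python
TRAIN_PAIRS = 4_818_567  # train split size
PER_GPU_BATCH = 32
NUM_GPUS = 8
GRAD_ACCUM = 1

effective_batch = PER_GPU_BATCH * NUM_GPUS * GRAD_ACCUM
steps_per_epoch = TRAIN_PAIRS / effective_batch

def epoch_at(step):
    """Fraction of one pass over the train split completed at a given step."""
    return step / steps_per_epoch

epoch_at_checkpoint = epoch_at(9000)  # compare with the row logged at step 9,000
epoch_at_first_eval = epoch_at(1882)  # compare with the first eval after step 0
progress = 9000 / 9150                # share of scheduled steps completed
```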
Evaluation Results
| Step | Epoch | Eval Loss |
|---|---|---|
| 0 | 0.00 | 0.5361 |
| 1,882 | 0.10 | 0.0781 |
| 3,764 | 0.20 | 0.0587 |
| 5,646 | 0.30 | 0.0501 |
| 7,528 | 0.40 | 0.0443 |
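To show how an early stopping patience of 3 interacts with these results, here is a small sketch. It assumes patience counts consecutive evaluations without a new best eval loss; the function name `early_stop_step` is hypothetical, and the trainer's own callback may differ in detail (for example, by applying an improvement threshold).

```python
def early_stop_step(eval_history, patience=3):
    """Return the step at which training would stop, or None if it never does.

    eval_history is a list of (step, eval_loss) pairs in chronological order.
    """
    best = float("inf")
    bad_evals = 0
    for step, loss in eval_history:
        if loss < best:
            best, bad_evals = loss, 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return step
    return None

# Eval losses from the table above.
eval_history = [(0, 0.5361), (1882, 0.0781), (3764, 0.0587), (5646, 0.0501), (7528, 0.0443)]

stop_step = early_stop_step(eval_history)
best_step, best_loss = min(eval_history, key=lambda pair: pair[1])
```

Because every evaluation in the table improves on the previous one, the stopping condition is never met under this rule.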
Full training loss history
| Step | Epoch | Loss | Learning Rate |
|---|---|---|---|
| 50 | 0.0027 | 1.1880 | 1.07e-06 |
| 100 | 0.0053 | 0.7126 | 2.16e-06 |
| 150 | 0.0080 | 0.5812 | 3.26e-06 |
| 200 | 0.0106 | 0.5347 | 4.35e-06 |
| 250 | 0.0133 | 0.5104 | 5.44e-06 |
| 300 | 0.0159 | 0.4721 | 6.54e-06 |
| 350 | 0.0186 | 0.4572 | 7.63e-06 |
| 400 | 0.0213 | 0.4318 | 8.72e-06 |
| 450 | 0.0239 | 0.4221 | 9.81e-06 |
| 500 | 0.0266 | 0.4056 | 1.09e-05 |
| 550 | 0.0292 | 0.4013 | 1.20e-05 |
| 600 | 0.0319 | 0.3933 | 1.31e-05 |
| 650 | 0.0345 | 0.3729 | 1.42e-05 |
| 700 | 0.0372 | 0.4061 | 1.53e-05 |
| 750 | 0.0398 | 0.3672 | 1.64e-05 |
| 800 | 0.0425 | 0.3818 | 1.75e-05 |
| 850 | 0.0452 | 0.3498 | 1.86e-05 |
| 900 | 0.0478 | 0.3665 | 1.97e-05 |
| 950 | 0.0505 | 0.3628 | 1.99e-05 |
| 1,000 | 0.0531 | 0.3575 | 1.98e-05 |
| 1,050 | 0.0558 | 0.3550 | 1.97e-05 |
| 1,100 | 0.0584 | 0.3419 | 1.96e-05 |
| 1,150 | 0.0611 | 0.3471 | 1.94e-05 |
| 1,200 | 0.0638 | 0.3403 | 1.93e-05 |
| 1,250 | 0.0664 | 0.3309 | 1.92e-05 |
| 1,300 | 0.0691 | 0.3252 | 1.91e-05 |
| 1,350 | 0.0717 | 0.3187 | 1.89e-05 |
| 1,400 | 0.0744 | 0.3237 | 1.88e-05 |
| 1,450 | 0.0770 | 0.3239 | 1.87e-05 |
| 1,500 | 0.0797 | 0.3205 | 1.86e-05 |
| 1,550 | 0.0824 | 0.3141 | 1.85e-05 |
| 1,600 | 0.0850 | 0.3140 | 1.83e-05 |
| 1,650 | 0.0877 | 0.2961 | 1.82e-05 |
| 1,700 | 0.0903 | 0.3050 | 1.81e-05 |
| 1,750 | 0.0930 | 0.2959 | 1.80e-05 |
| 1,800 | 0.0956 | 0.3067 | 1.79e-05 |
| 1,850 | 0.0983 | 0.3060 | 1.77e-05 |
| 1,900 | 0.1009 | 0.2951 | 1.76e-05 |
| 1,950 | 0.1036 | 0.2862 | 1.75e-05 |
| 2,000 | 0.1063 | 0.2986 | 1.74e-05 |
| 2,050 | 0.1089 | 0.2807 | 1.72e-05 |
| 2,100 | 0.1116 | 0.2778 | 1.71e-05 |
| 2,150 | 0.1142 | 0.2857 | 1.70e-05 |
| 2,200 | 0.1169 | 0.2862 | 1.69e-05 |
| 2,250 | 0.1195 | 0.2651 | 1.68e-05 |
| 2,300 | 0.1222 | 0.2802 | 1.66e-05 |
| 2,350 | 0.1249 | 0.2834 | 1.65e-05 |
| 2,400 | 0.1275 | 0.2733 | 1.64e-05 |
| 2,450 | 0.1302 | 0.2607 | 1.63e-05 |
| 2,500 | 0.1328 | 0.2642 | 1.62e-05 |
| 2,550 | 0.1355 | 0.2713 | 1.60e-05 |
| 2,600 | 0.1381 | 0.2653 | 1.59e-05 |
| 2,650 | 0.1408 | 0.2583 | 1.58e-05 |
| 2,700 | 0.1434 | 0.2447 | 1.57e-05 |
| 2,750 | 0.1461 | 0.2550 | 1.55e-05 |
| 2,800 | 0.1488 | 0.2506 | 1.54e-05 |
| 2,850 | 0.1514 | 0.2539 | 1.53e-05 |
| 2,900 | 0.1541 | 0.2589 | 1.52e-05 |
| 2,950 | 0.1567 | 0.2504 | 1.51e-05 |
| 3,000 | 0.1594 | 0.2384 | 1.49e-05 |
| 3,050 | 0.1620 | 0.2525 | 1.48e-05 |
| 3,100 | 0.1647 | 0.2427 | 1.47e-05 |
| 3,150 | 0.1674 | 0.2365 | 1.46e-05 |
| 3,200 | 0.1700 | 0.2447 | 1.45e-05 |
| 3,250 | 0.1727 | 0.2423 | 1.43e-05 |
| 3,300 | 0.1753 | 0.2410 | 1.42e-05 |
| 3,350 | 0.1780 | 0.2443 | 1.41e-05 |
| 3,400 | 0.1806 | 0.2329 | 1.40e-05 |
| 3,450 | 0.1833 | 0.2380 | 1.38e-05 |
| 3,500 | 0.1860 | 0.2331 | 1.37e-05 |
| 3,550 | 0.1886 | 0.2446 | 1.36e-05 |
| 3,600 | 0.1913 | 0.2290 | 1.35e-05 |
| 3,650 | 0.1939 | 0.2303 | 1.34e-05 |
| 3,700 | 0.1966 | 0.2216 | 1.32e-05 |
| 3,750 | 0.1992 | 0.2321 | 1.31e-05 |
| 3,800 | 0.2019 | 0.2365 | 1.30e-05 |
| 3,850 | 0.2045 | 0.2365 | 1.29e-05 |
| 3,900 | 0.2072 | 0.2277 | 1.28e-05 |
| 3,950 | 0.2099 | 0.2262 | 1.26e-05 |
| 4,000 | 0.2125 | 0.2304 | 1.25e-05 |
| 4,050 | 0.2152 | 0.2228 | 1.24e-05 |
| 4,100 | 0.2178 | 0.2232 | 1.23e-05 |
| 4,150 | 0.2205 | 0.2239 | 1.21e-05 |
| 4,200 | 0.2231 | 0.2246 | 1.20e-05 |
| 4,250 | 0.2258 | 0.2103 | 1.19e-05 |
| 4,300 | 0.2285 | 0.2240 | 1.18e-05 |
| 4,350 | 0.2311 | 0.2183 | 1.17e-05 |
| 4,400 | 0.2338 | 0.2195 | 1.15e-05 |
| 4,450 | 0.2364 | 0.2205 | 1.14e-05 |
| 4,500 | 0.2391 | 0.2196 | 1.13e-05 |
| 4,550 | 0.2417 | 0.2165 | 1.12e-05 |
| 4,600 | 0.2444 | 0.2107 | 1.11e-05 |
| 4,650 | 0.2471 | 0.2142 | 1.09e-05 |
| 4,700 | 0.2497 | 0.2187 | 1.08e-05 |
| 4,750 | 0.2524 | 0.2121 | 1.07e-05 |
| 4,800 | 0.2550 | 0.2088 | 1.06e-05 |
| 4,850 | 0.2577 | 0.2139 | 1.04e-05 |
| 4,900 | 0.2603 | 0.2142 | 1.03e-05 |
| 4,950 | 0.2630 | 0.2149 | 1.02e-05 |
| 5,000 | 0.2656 | 0.2026 | 1.01e-05 |
| 5,050 | 0.2683 | 0.2097 | 9.96e-06 |
| 5,100 | 0.2710 | 0.2111 | 9.84e-06 |
| 5,150 | 0.2736 | 0.2116 | 9.72e-06 |
| 5,200 | 0.2763 | 0.2063 | 9.60e-06 |
| 5,250 | 0.2789 | 0.2071 | 9.47e-06 |
| 5,300 | 0.2816 | 0.2001 | 9.35e-06 |
| 5,350 | 0.2842 | 0.2002 | 9.23e-06 |
| 5,400 | 0.2869 | 0.2020 | 9.11e-06 |
| 5,450 | 0.2896 | 0.2043 | 8.99e-06 |
| 5,500 | 0.2922 | 0.1896 | 8.87e-06 |
| 5,550 | 0.2949 | 0.2020 | 8.75e-06 |
| 5,600 | 0.2975 | 0.1940 | 8.62e-06 |
| 5,650 | 0.3002 | 0.2000 | 8.50e-06 |
| 5,700 | 0.3028 | 0.1937 | 8.38e-06 |
| 5,750 | 0.3055 | 0.2105 | 8.26e-06 |
| 5,800 | 0.3082 | 0.1944 | 8.14e-06 |
| 5,850 | 0.3108 | 0.1885 | 8.02e-06 |
| 5,900 | 0.3135 | 0.2005 | 7.90e-06 |
| 5,950 | 0.3161 | 0.1883 | 7.77e-06 |
| 6,000 | 0.3188 | 0.1970 | 7.65e-06 |
| 6,050 | 0.3214 | 0.1857 | 7.53e-06 |
| 6,100 | 0.3241 | 0.1858 | 7.41e-06 |
| 6,150 | 0.3267 | 0.1809 | 7.29e-06 |
| 6,200 | 0.3294 | 0.1891 | 7.17e-06 |
| 6,250 | 0.3321 | 0.1834 | 7.05e-06 |
| 6,300 | 0.3347 | 0.1865 | 6.92e-06 |
| 6,350 | 0.3374 | 0.1938 | 6.80e-06 |
| 6,400 | 0.3400 | 0.1995 | 6.68e-06 |
| 6,450 | 0.3427 | 0.1938 | 6.56e-06 |
| 6,500 | 0.3453 | 0.1980 | 6.44e-06 |
| 6,550 | 0.3480 | 0.1833 | 6.32e-06 |
| 6,600 | 0.3507 | 0.1889 | 6.20e-06 |
| 6,650 | 0.3533 | 0.1881 | 6.07e-06 |
| 6,700 | 0.3560 | 0.1923 | 5.95e-06 |
| 6,750 | 0.3586 | 0.1935 | 5.83e-06 |
| 6,800 | 0.3613 | 0.1874 | 5.71e-06 |
| 6,850 | 0.3639 | 0.1807 | 5.59e-06 |
| 6,900 | 0.3666 | 0.1848 | 5.47e-06 |
| 6,950 | 0.3692 | 0.1841 | 5.35e-06 |
| 7,000 | 0.3719 | 0.1656 | 5.22e-06 |
| 7,050 | 0.3746 | 0.1861 | 5.10e-06 |
| 7,100 | 0.3772 | 0.1915 | 4.98e-06 |
| 7,150 | 0.3799 | 0.1858 | 4.86e-06 |
| 7,200 | 0.3825 | 0.1856 | 4.74e-06 |
| 7,250 | 0.3852 | 0.1753 | 4.62e-06 |
| 7,300 | 0.3878 | 0.1768 | 4.50e-06 |
| 7,350 | 0.3905 | 0.1765 | 4.37e-06 |
| 7,400 | 0.3932 | 0.1839 | 4.25e-06 |
| 7,450 | 0.3958 | 0.1813 | 4.13e-06 |
| 7,500 | 0.3985 | 0.1881 | 4.01e-06 |
| 7,550 | 0.4011 | 0.1776 | 3.89e-06 |
| 7,600 | 0.4038 | 0.1764 | 3.77e-06 |
| 7,650 | 0.4064 | 0.1765 | 3.65e-06 |
| 7,700 | 0.4091 | 0.1856 | 3.52e-06 |
| 7,750 | 0.4118 | 0.1761 | 3.40e-06 |
| 7,800 | 0.4144 | 0.1742 | 3.28e-06 |
| 7,850 | 0.4171 | 0.1735 | 3.16e-06 |
| 7,900 | 0.4197 | 0.1796 | 3.04e-06 |
| 7,950 | 0.4224 | 0.1808 | 2.92e-06 |
| 8,000 | 0.4250 | 0.1743 | 2.80e-06 |
| 8,050 | 0.4277 | 0.1709 | 2.67e-06 |
| 8,100 | 0.4303 | 0.1642 | 2.55e-06 |
| 8,150 | 0.4330 | 0.1690 | 2.43e-06 |
| 8,200 | 0.4357 | 0.1793 | 2.31e-06 |
| 8,250 | 0.4383 | 0.1732 | 2.19e-06 |
| 8,300 | 0.4410 | 0.1689 | 2.07e-06 |
| 8,350 | 0.4436 | 0.1675 | 1.95e-06 |
| 8,400 | 0.4463 | 0.1619 | 1.82e-06 |
| 8,450 | 0.4489 | 0.1713 | 1.70e-06 |
| 8,500 | 0.4516 | 0.1735 | 1.58e-06 |
| 8,550 | 0.4543 | 0.1682 | 1.46e-06 |
| 8,600 | 0.4569 | 0.1717 | 1.34e-06 |
| 8,650 | 0.4596 | 0.1601 | 1.22e-06 |
| 8,700 | 0.4622 | 0.1722 | 1.10e-06 |
| 8,750 | 0.4649 | 0.1674 | 9.74e-07 |
| 8,800 | 0.4675 | 0.1704 | 8.52e-07 |
| 8,850 | 0.4702 | 0.1684 | 7.31e-07 |
| 8,900 | 0.4729 | 0.1645 | 6.10e-07 |
| 8,950 | 0.4755 | 0.1717 | 4.88e-07 |
| 9,000 | 0.4782 | 0.1628 | 3.67e-07 |
Usage
Loading the Model
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("aysinghal/ide-code-retrieval-qwen3-0.6b-inbatch")
Computing Embeddings
queries = [
    "fix null pointer exception in user authentication",
    "add retry logic to API client",
]
code_docs = [
    "def authenticate(user):\n    if user is None:\n        raise ValueError...",
    "class APIClient:\n    def request(self, url, retries=3):\n        ...",
]
# Encode queries and code documents into the shared embedding space
query_embeddings = model.encode(queries)
code_embeddings = model.encode(code_docs)
# Compute cosine similarities (rows: queries, columns: code documents)
from sentence_transformers.util import cos_sim
similarities = cos_sim(query_embeddings, code_embeddings)
print(similarities)
Intended Use
- Primary use case: Retrieving relevant code files/functions given a natural-language query (commit message, bug description, feature request)
- Search pipeline: Encode a corpus of code documents offline, then at query time encode the query and find nearest neighbors via cosine similarity
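The search pipeline above can be sketched as a brute-force nearest-neighbor lookup. The toy 2-dimensional vectors stand in for `model.encode(...)` outputs, and `build_index` and `search` are hypothetical helper names rather than library functions.

```python
import math

def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def build_index(doc_embeddings):
    """Offline step: store unit-length copies of the document embeddings."""
    return [normalize(v) for v in doc_embeddings]

def search(index, query_embedding, top_k=2):
    """Query-time step: rank documents by cosine similarity, highest first."""
    q = normalize(query_embedding)
    scores = [sum(a * b for a, b in zip(q, doc)) for doc in index]
    ranked = sorted(range(len(index)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:top_k]]

# Toy corpus of three "documents" and one query.
index = build_index([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
hits = search(index, [0.9, 0.1], top_k=2)  # list of (document index, score)
```

An exhaustive scan like this is fine for small corpora; for larger ones, an approximate nearest-neighbor index is a common substitute for the `search` step.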
Limitations
- This is an intermediate checkpoint (step 9,000 of 9,150 scheduled steps, 98.4%). Training loss was still decreasing at this point, so the final checkpoint may perform better.
- Trained on a specific code retrieval dataset; may not generalize to all programming languages or query styles without further fine-tuning.
- Max context is 1024 tokens; longer inputs, such as very long files, are truncated.
Citation
If you use this model, please cite the base model:
@article{qwen3embedding,
  title={Qwen3-Embedding},
  author={Qwen Team},
  year={2025}
}