Upload 15 files

Files changed:

- README.md +79 -93
- config.json +1 -1
- config_sentence_transformers.json +1 -1
- gitattributes +36 -0
README.md CHANGED

````diff
@@ -8,53 +8,39 @@ tags:
 - loss:MultipleNegativesRankingLoss
 base_model: Qwen/Qwen3-0.6B-Base
 widget:
-- source_sentence:
-
-
-    When everybody was sleeping, two of them woke up and decided to divide the diamonds
-    equally among themselves. But when they divided the diamonds equally, one diamond
-    is left.
-
-    So they woke up the 3rd thief and tried to divide the diamonds equally again but
-    still one diamond was left. Then they woke up the 4th thief to divide the diamonds
-    equally again, and again one diamond was left. This happened with the 5th and
-    6th thief – one diamond was still left.
-
-    Finally, they woke up the 7th thief and this time the diamonds were divided equally.
-
-    How many diamonds did they steal in total?'
+- source_sentence: how many seconds will a 450 m long train take to cross a man walking
+    with a speed of 3 km / hr in the direction of the moving train if the speed of
+    the train is 63 km / hr ?
   sentences:
   - ''''
-  - ''''
-  - e
-- source_sentence: 'praveen starts business with rs . 3220 and after 5 months , hari
-    joins with praveen as his partner . after a year , the profit is divided in the
-    ratio 2 : 3 . what is hari ’ s contribution in the capital ?'
-  sentences:
-  - s
-  - '5'
   - '['
--
-
-
+  - '2'
+- source_sentence: 'A patient of CSOM has choleastatoma and presents with veigo .
+    Treatment of choice would be:'
   sentences:
-  -
-  -
-  -
-- source_sentence:
-
-
-    is then poured into the cylinder such that it reaches the rim. What is the volume
-    of the liquid?
+  - A
+  - ''''
+  - ''''
+- source_sentence: Dhoni spent 25 percent of his earning last month on rent and 10
+    percent less than what he spent on rent to purchase a new dishwasher. What percent
+    of last month's earning did Dhoni have left over?
   sentences:
-  -
-  - '
-  - '
-- source_sentence:
+  - C
+  - ''''
+  - '%'
+- source_sentence: 'On the xy co-ordinate plane, point C is (5,-2) and point D is
+    (-1,2). The point on line segment CD that is twice as far from C as from D is:'
   sentences:
-  -
-  -
-  -
+  - '1'
+  - n
+  - y
+- source_sentence: car a runs at the speed of 35 km / hr & reaches its destination
+    in 9 hr . car b runs at the speed of 43 km / h & reaches its destination in 10
+    h . what is the respective ratio of distances covered by car a & car b ?
+  sentences:
+  - ' '
+  - R
+  - ''''
 pipeline_tag: sentence-similarity
 library_name: sentence-transformers
 ---
@@ -108,9 +94,9 @@ from sentence_transformers import SentenceTransformer
 model = SentenceTransformer("sentence_transformers_model_id")
 # Run inference
 sentences = [
-    '
-    '
-    '
+    'car a runs at the speed of 35 km / hr & reaches its destination in 9 hr . car b runs at the speed of 43 km / h & reaches its destination in 10 h . what is the respective ratio of distances covered by car a & car b ?',
+    ' ',
+    "'",
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
@@ -167,16 +153,16 @@ You can finetune this model on your own dataset.
 * Size: 268,861 training samples
 * Columns: <code>sentence_0</code> and <code>sentence_1</code>
 * Approximate statistics based on the first 1000 samples:
-  |         | sentence_0
-
-  | type    | string
-  | details | <ul><li>min:
+  |         | sentence_0                                                                         | sentence_1                                                                      |
+  |:--------|:-----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------|
+  | type    | string                                                                             | string                                                                          |
+  | details | <ul><li>min: 4 tokens</li><li>mean: 48.06 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 0 tokens</li><li>mean: 0.98 tokens</li><li>max: 1 tokens</li></ul> |
 * Samples:
-  | sentence_0
-
-  | <code>
-  | <code>
-  | <code>
+  | sentence_0 | sentence_1 |
+  |:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
+  | <code>What is known to cause pedal Botryomycosis</code> | <code>A</code> |
+  | <code>Two friends plan to walk along a 33-km trail, starting at opposite ends of the trail at the same time. If Friend P's rate is 20% faster than Friend Q's, how many kilometers will Friend P have walked when they pass each other?</code> | <code>5</code> |
+  | <code>The average age of a husband and a wife is 23 years when they were married five years ago but now the average age of the husband, wife and child is 20 years(the child was born during the interval). What is the present age of the child?</code> | <code>)</code> |
 * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
   ```json
   {
@@ -188,9 +174,9 @@ You can finetune this model on your own dataset.
 ### Training Hyperparameters
 #### Non-Default Hyperparameters
 
-- `per_device_train_batch_size`:
-- `per_device_eval_batch_size`:
-- `num_train_epochs`:
+- `per_device_train_batch_size`: 16
+- `per_device_eval_batch_size`: 16
+- `num_train_epochs`: 1
 - `fp16`: True
 - `multi_dataset_batch_sampler`: round_robin
 
@@ -201,8 +187,8 @@ You can finetune this model on your own dataset.
 - `do_predict`: False
 - `eval_strategy`: no
 - `prediction_loss_only`: True
-- `per_device_train_batch_size`:
-- `per_device_eval_batch_size`:
+- `per_device_train_batch_size`: 16
+- `per_device_eval_batch_size`: 16
 - `per_gpu_train_batch_size`: None
 - `per_gpu_eval_batch_size`: None
 - `gradient_accumulation_steps`: 1
@@ -214,7 +200,7 @@ You can finetune this model on your own dataset.
 - `adam_beta2`: 0.999
 - `adam_epsilon`: 1e-08
 - `max_grad_norm`: 1
-- `num_train_epochs`:
+- `num_train_epochs`: 1
 - `max_steps`: -1
 - `lr_scheduler_type`: linear
 - `lr_scheduler_kwargs`: {}
@@ -316,45 +302,45 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch | Step | Training Loss |
 |:------:|:-----:|:-------------:|
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+| 0.0298 | 500   | 2.7788        |
+| 0.0595 | 1000  | 2.5217        |
+| 0.0893 | 1500  | 2.5004        |
+| 0.1190 | 2000  | 2.5451        |
+| 0.1488 | 2500  | 2.5165        |
+| 0.1785 | 3000  | 2.5384        |
+| 0.2083 | 3500  | 2.4994        |
+| 0.2380 | 4000  | 0.0           |
+| 0.2678 | 4500  | 0.0           |
+| 0.2975 | 5000  | 0.0           |
+| 0.3273 | 5500  | 0.0           |
+| 0.3571 | 6000  | 0.0           |
+| 0.3868 | 6500  | 0.0           |
+| 0.4166 | 7000  | 0.0           |
+| 0.4463 | 7500  | 0.0           |
+| 0.4761 | 8000  | 0.0           |
+| 0.5058 | 8500  | 0.0           |
+| 0.5356 | 9000  | 0.0           |
+| 0.5653 | 9500  | 0.0           |
+| 0.5951 | 10000 | 0.0           |
+| 0.6249 | 10500 | 0.0           |
+| 0.6546 | 11000 | 0.0           |
+| 0.6844 | 11500 | 0.0           |
+| 0.7141 | 12000 | 0.0           |
+| 0.7439 | 12500 | 0.0           |
+| 0.7736 | 13000 | 0.0           |
+| 0.8034 | 13500 | 0.0           |
+| 0.8331 | 14000 | 0.0           |
+| 0.8629 | 14500 | 0.0           |
+| 0.8926 | 15000 | 0.0           |
+| 0.9224 | 15500 | 0.0           |
+| 0.9522 | 16000 | 0.0           |
+| 0.9819 | 16500 | 0.0           |
 
 
 ### Framework Versions
 - Python: 3.11.13
 - Sentence Transformers: 4.1.0
-- Transformers: 4.52.
+- Transformers: 4.52.3
 - PyTorch: 2.6.0+cu124
 - Accelerate: 1.7.0
 - Datasets: 3.6.0
````
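The card trains with MultipleNegativesRankingLoss, i.e. in-batch negatives: each `sentence_0` is scored against every `sentence_1` in the batch, and cross-entropy pushes the similarity of the matching pair above the rest. A minimal NumPy sketch of that loss, assuming the library defaults of cosine similarity and a scale of 20 (the parameter block in the diff above is truncated, so these values are assumptions):

```python
import numpy as np

def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """In-batch-negatives loss: row i of `anchors` should match row i of
    `positives`; all other rows in the batch act as negatives.
    `scale=20.0` mirrors the sentence-transformers default (assumed here)."""
    # Cosine similarity matrix between every anchor and every positive.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    sims = scale * (a @ p.T)                            # (batch, batch) logits
    # Cross-entropy with the diagonal (the true pair) as the target class.
    logits = sims - sims.max(axis=1, keepdims=True)     # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
perfect = mnr_loss(emb, emb)          # identical pairs -> near-zero loss
shuffled = mnr_loss(emb, emb[::-1])   # mismatched pairs -> large loss
print(perfect, shuffled)
```

With degenerate `sentence_1` targets like `'A'` or `')'`, many in-batch "negatives" are actually identical to the positive, which is consistent with the training loss collapsing to 0.0 in the log above.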
config.json CHANGED

````diff
@@ -23,7 +23,7 @@
   "sliding_window": null,
   "tie_word_embeddings": true,
   "torch_dtype": "float32",
-  "transformers_version": "4.52.
+  "transformers_version": "4.52.3",
   "use_cache": true,
   "use_sliding_window": false,
   "vocab_size": 151936
````
config_sentence_transformers.json CHANGED

````diff
@@ -1,7 +1,7 @@
 {
   "__version__": {
     "sentence_transformers": "4.1.0",
-    "transformers": "4.52.
+    "transformers": "4.52.3",
     "pytorch": "2.6.0+cu124"
   },
   "prompts": {},
````
gitattributes ADDED

````diff
@@ -0,0 +1,36 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
````
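The added `gitattributes` routes weight and data files through Git LFS by glob pattern. A rough sketch of which repo files fall under a few of these rules, using `fnmatch` (which approximates but does not fully replicate git's wildmatch semantics):

```python
from fnmatch import fnmatch

# A few of the patterns added above, verbatim from the file.
LFS_PATTERNS = ["*.safetensors", "*.bin", "*.parquet", "tokenizer.json", "*tfevents*"]

def tracked_by_lfs(path: str) -> bool:
    # Patterns without a slash match against the basename, as in gitattributes.
    name = path.rsplit("/", 1)[-1]
    return any(fnmatch(name, pat) for pat in LFS_PATTERNS)

for f in ["model.safetensors", "tokenizer.json", "README.md", "config.json"]:
    print(f, tracked_by_lfs(f))
```

So in this commit the model weights and `tokenizer.json` go through LFS, while the small JSON configs and the README stay as plain text.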