SentenceTransformer

This is a sentence-transformers model trained on the generator dataset. It maps sentences & paragraphs to a 4096-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 32768 tokens
  • Output Dimensionality: 4096 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • generator

Model Sources

  • Documentation: https://sbert.net
  • Repository: https://github.com/UKPLab/sentence-transformers
  • Hugging Face: https://huggingface.co/models?library=sentence-transformers

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 32768, 'do_lower_case': False, 'architecture': 'Qwen3Model'})
  (1): Pooling({'word_embedding_dimension': 2048, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
)

Note that both mean-token and last-token pooling are enabled, so the Pooling module concatenates the two 2048-dimensional pooled vectors; this is why the output dimensionality is 4096 rather than the transformer's hidden size of 2048.
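
A quick way to confirm this, sketched under the assumption that the checkpoint loads as in the Usage section below:

from sentence_transformers import SentenceTransformer

# The Pooling module concatenates every enabled pooling mode, so
# mean-token (2048) + last-token (2048) yields a 4096-dim sentence embedding.
model = SentenceTransformer("reasonwang/embedding-qwen3-1.7b-embedding_ctxt_ac1_unicode_shuf")
print(model.get_sentence_embedding_dimension())  # 4096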

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("reasonwang/embedding-qwen3-1.7b-embedding_ctxt_ac1_unicode_shuf")
# Run inference
sentences = [
    'The weather is lovely today.',
    "It's so sunny outside!",
    'He drove to the stadium.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 4096)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.8605, 0.8626],
#         [0.8605, 1.0000, 0.7646],
#         [0.8626, 0.7646, 1.0000]])
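
The same model can also serve the semantic search use case mentioned at the top. A minimal sketch using util.semantic_search; the corpus and query strings below are illustrative only:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("reasonwang/embedding-qwen3-1.7b-embedding_ctxt_ac1_unicode_shuf")

# Illustrative corpus and query (any lists of strings work)
corpus = [
    "The weather is lovely today.",
    "He drove to the stadium.",
    "Dense vector embeddings enable semantic search over documents.",
]
query = "How do embeddings support document retrieval?"

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# For each query, returns the top_k corpus entries ranked by cosine similarity
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(f"{hit['score']:.4f}  {corpus[hit['corpus_id']]}")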

Evaluation

Metrics

Information Retrieval

Metric                 Value
cosine_accuracy@1      0.62
cosine_accuracy@3      0.8
cosine_accuracy@5      0.85
cosine_accuracy@10     0.92
cosine_precision@1     0.62
cosine_precision@3     0.5
cosine_precision@5     0.436
cosine_precision@10    0.336
cosine_recall@1        0.1222
cosine_recall@3        0.2575
cosine_recall@5        0.3172
cosine_recall@10       0.4205
cosine_ndcg@10         0.5236
cosine_ndcg@100        0.579
cosine_mrr@10          0.7175
cosine_mrr@100         0.7209
cosine_map@100         0.3916
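
These figures are information-retrieval metrics of the kind produced by the library's InformationRetrievalEvaluator. A minimal sketch of running such an evaluation; the queries, corpus, and relevance judgments below are hypothetical stand-ins, since the actual evaluation data is not included in this card:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("reasonwang/embedding-qwen3-1.7b-embedding_ctxt_ac1_unicode_shuf")

# Hypothetical evaluation data: id -> text, plus query id -> relevant doc ids
queries = {"q1": "What dimensionality do the embeddings have?"}
corpus = {
    "d1": "The model maps text to 4096-dimensional vectors.",
    "d2": "He drove to the stadium.",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries,
    corpus,
    relevant_docs,
    name="validation_retrieval",  # matches the metric prefix in the training logs
)
results = evaluator(model)  # dict of cosine_accuracy@k, cosine_ndcg@k, cosine_mrr@k, ...
print(results)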

Training Details

Training Dataset

generator

  • Dataset: generator
  • Columns: sentence1 and sentence2
  • Loss: CachedMultipleNegativesRankingLoss (see the instantiation sketch below this list) with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "mini_batch_size": 4,
        "gather_across_devices": false
    }
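
A hedged sketch of instantiating this loss with the parameters above; the released checkpoint is used here as a stand-in for the model being trained:

from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import CachedMultipleNegativesRankingLoss
from sentence_transformers.util import cos_sim

model = SentenceTransformer("reasonwang/embedding-qwen3-1.7b-embedding_ctxt_ac1_unicode_shuf")

# Gradient caching keeps the in-batch-negatives objective of a large batch
# while only mini_batch_size examples pass through the transformer at a time.
loss = CachedMultipleNegativesRankingLoss(
    model,
    scale=20.0,             # similarity scores are multiplied by this temperature
    similarity_fct=cos_sim,
    mini_batch_size=4,
)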
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 256
  • learning_rate: 2e-05
  • max_steps: 100000
  • log_level: info
  • bf16: True
  • dataloader_num_workers: 1
  • accelerator_config: {'split_batches': False, 'dispatch_batches': False, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
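
Taken together with the dataset and loss above, these values map onto roughly the following training setup. This is a sketch only: output_dir and the toy train_dataset are placeholders, and the real "generator" data is not part of this card:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CachedMultipleNegativesRankingLoss

# Placeholder: the original run started from a Qwen3-based checkpoint
model = SentenceTransformer("reasonwang/embedding-qwen3-1.7b-embedding_ctxt_ac1_unicode_shuf")

# Toy stand-in for the "generator" dataset with its sentence1/sentence2 columns
train_dataset = Dataset.from_dict({
    "sentence1": ["The weather is lovely today."],
    "sentence2": ["It's so sunny outside!"],
})
loss = CachedMultipleNegativesRankingLoss(model, scale=20.0, mini_batch_size=4)

args = SentenceTransformerTrainingArguments(
    output_dir="output",               # placeholder path
    eval_strategy="steps",
    per_device_train_batch_size=256,
    learning_rate=2e-5,
    max_steps=100_000,
    log_level="info",
    bf16=True,                         # requires bf16-capable hardware
    dataloader_num_workers=1,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()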

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3.0
  • max_steps: 100000
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: info
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 1
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': False, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss validation_retrieval_cosine_ndcg@100
1e-05 1 5.3619 -
0.0001 10 3.792 -
0.0002 20 2.7885 -
0.0003 30 2.4762 -
0.0004 40 2.3427 -
0.0005 50 2.2636 -
0.0006 60 2.2051 -
0.0007 70 2.1559 -
0.0008 80 2.1319 -
0.0009 90 2.1058 -
0.001 100 2.0827 0.5286
0.0011 110 2.0716 -
0.0012 120 2.0793 -
0.0013 130 2.0351 -
0.0014 140 2.0021 -
0.0015 150 2.0188 -
0.0016 160 2.004 -
0.0017 170 2.0122 -
0.0018 180 2.001 -
0.0019 190 1.9768 -
0.002 200 1.9975 0.5425
0.0021 210 1.9717 -
0.0022 220 1.9549 -
0.0023 230 1.9626 -
0.0024 240 1.927 -
0.0025 250 1.9438 -
0.0026 260 1.9322 -
0.0027 270 1.9176 -
0.0028 280 1.9225 -
0.0029 290 1.9225 -
0.003 300 1.8604 0.5460
0.0031 310 1.9088 -
0.0032 320 1.8923 -
0.0033 330 1.8864 -
0.0034 340 1.8888 -
0.0035 350 1.9083 -
0.0036 360 1.8757 -
0.0037 370 1.8743 -
0.0038 380 1.8348 -
0.0039 390 1.8616 -
0.004 400 1.865 0.5477
0.0041 410 1.8715 -
0.0042 420 1.8706 -
0.0043 430 1.853 -
0.0044 440 1.8556 -
0.0045 450 1.825 -
0.0046 460 1.8544 -
0.0047 470 1.8383 -
0.0048 480 1.839 -
0.0049 490 1.8289 -
0.005 500 1.8222 0.5500
0.0051 510 1.8373 -
0.0052 520 1.8183 -
0.0053 530 1.8254 -
0.0054 540 1.7999 -
0.0055 550 1.8108 -
0.0056 560 1.8101 -
0.0057 570 1.8243 -
0.0058 580 1.8348 -
0.0059 590 1.7991 -
0.006 600 1.7865 0.5617
0.0061 610 1.8015 -
0.0062 620 1.7859 -
0.0063 630 1.7863 -
0.0064 640 1.7965 -
0.0065 650 1.8217 -
0.0066 660 1.7747 -
0.0067 670 1.802 -
0.0068 680 1.7932 -
0.0069 690 1.7734 -
0.007 700 1.776 0.5635
0.0071 710 1.7798 -
0.0072 720 1.7702 -
0.0073 730 1.7753 -
0.0074 740 1.7488 -
0.0075 750 1.7676 -
0.0076 760 1.7855 -
0.0077 770 1.7672 -
0.0078 780 1.7847 -
0.0079 790 1.7669 -
0.008 800 1.7564 0.5608
0.0081 810 1.7619 -
0.0082 820 1.7366 -
0.0083 830 1.7396 -
0.0084 840 1.751 -
0.0085 850 1.7604 -
0.0086 860 1.768 -
0.0087 870 1.7424 -
0.0088 880 1.7533 -
0.0089 890 1.7156 -
0.009 900 1.7449 0.5620
0.0091 910 1.7198 -
0.0092 920 1.734 -
0.0093 930 1.729 -
0.0094 940 1.7162 -
0.0095 950 1.7283 -
0.0096 960 1.7101 -
0.0097 970 1.7252 -
0.0098 980 1.7391 -
0.0099 990 1.7371 -
0.01 1000 1.7161 0.5685
0.0101 1010 1.7148 -
0.0102 1020 1.7314 -
0.0103 1030 1.7352 -
0.0104 1040 1.7156 -
0.0105 1050 1.7308 -
0.0106 1060 1.7348 -
0.0107 1070 1.7334 -
0.0108 1080 1.7177 -
0.0109 1090 1.7029 -
0.011 1100 1.7357 0.5819
0.0111 1110 1.7081 -
0.0112 1120 1.6928 -
0.0113 1130 1.7355 -
0.0114 1140 1.7071 -
0.0115 1150 1.706 -
0.0116 1160 1.7074 -
0.0117 1170 1.6842 -
0.0118 1180 1.6944 -
0.0119 1190 1.7054 -
0.012 1200 1.7458 0.5704
0.0121 1210 1.7028 -
0.0122 1220 1.7026 -
0.0123 1230 1.7061 -
0.0124 1240 1.6791 -
0.0125 1250 1.6996 -
0.0126 1260 1.7036 -
0.0127 1270 1.7152 -
0.0128 1280 1.7158 -
0.0129 1290 1.6871 -
0.013 1300 1.6842 0.5736
0.0131 1310 1.6887 -
0.0132 1320 1.6747 -
0.0133 1330 1.6982 -
0.0134 1340 1.6845 -
0.0135 1350 1.6801 -
0.0136 1360 1.6926 -
0.0137 1370 1.6747 -
0.0138 1380 1.6719 -
0.0139 1390 1.6839 -
0.014 1400 1.6722 0.5754
0.0141 1410 1.6779 -
0.0142 1420 1.6827 -
0.0143 1430 1.6728 -
0.0144 1440 1.6871 -
0.0145 1450 1.6958 -
0.0146 1460 1.6735 -
0.0147 1470 1.6654 -
0.0148 1480 1.6831 -
0.0149 1490 1.6811 -
0.015 1500 1.673 0.5725
0.0151 1510 1.6589 -
0.0152 1520 1.6465 -
0.0153 1530 1.6808 -
0.0154 1540 1.6528 -
0.0155 1550 1.6884 -
0.0156 1560 1.6496 -
0.0157 1570 1.6469 -
0.0158 1580 1.6774 -
0.0159 1590 1.6869 -
0.016 1600 1.6426 0.5710
0.0161 1610 1.6656 -
0.0162 1620 1.6735 -
0.0163 1630 1.6886 -
0.0164 1640 1.6664 -
0.0165 1650 1.6578 -
0.0166 1660 1.6656 -
0.0167 1670 1.6729 -
0.0168 1680 1.665 -
0.0169 1690 1.6612 -
0.017 1700 1.6599 0.5691
0.0171 1710 1.637 -
0.0172 1720 1.6735 -
0.0173 1730 1.6442 -
0.0174 1740 1.6627 -
0.0175 1750 1.6752 -
0.0176 1760 1.6622 -
0.0177 1770 1.6472 -
0.0178 1780 1.6693 -
0.0179 1790 1.6684 -
0.018 1800 1.6578 0.5684
0.0181 1810 1.6298 -
0.0182 1820 1.6254 -
0.0183 1830 1.6306 -
0.0184 1840 1.6541 -
0.0185 1850 1.6412 -
0.0186 1860 1.6555 -
0.0187 1870 1.645 -
0.0188 1880 1.6459 -
0.0189 1890 1.6397 -
0.019 1900 1.6616 0.5806
0.0191 1910 1.6332 -
0.0192 1920 1.6484 -
0.0193 1930 1.6369 -
0.0194 1940 1.6319 -
0.0195 1950 1.637 -
0.0196 1960 1.6462 -
0.0197 1970 1.6294 -
0.0198 1980 1.6569 -
0.0199 1990 1.6217 -
0.02 2000 1.6364 0.5716
0.0201 2010 1.6423 -
0.0202 2020 1.6516 -
0.0203 2030 1.6221 -
0.0204 2040 1.6516 -
0.0205 2050 1.6496 -
0.0206 2060 1.6287 -
0.0207 2070 1.6076 -
0.0208 2080 1.6422 -
0.0209 2090 1.5888 -
0.021 2100 1.6276 0.5798
0.0211 2110 1.6335 -
0.0212 2120 1.641 -
0.0213 2130 1.6342 -
0.0214 2140 1.6394 -
0.0215 2150 1.628 -
0.0216 2160 1.6099 -
0.0217 2170 1.6256 -
0.0218 2180 1.6165 -
0.0219 2190 1.6209 -
0.022 2200 1.624 0.5790
0.0221 2210 1.6192 -
0.0222 2220 1.6196 -
0.0223 2230 1.6199 -
0.0224 2240 1.6233 -
0.0225 2250 1.6083 -
0.0226 2260 1.6282 -
0.0227 2270 1.6336 -
0.0228 2280 1.6221 -
0.0229 2290 1.6216 -
0.023 2300 1.5997 0.5803
0.0231 2310 1.6048 -
0.0232 2320 1.6129 -
0.0233 2330 1.616 -
0.0234 2340 1.6106 -
0.0235 2350 1.6201 -
0.0236 2360 1.6092 -
0.0237 2370 1.6109 -
0.0238 2380 1.6062 -
0.0239 2390 1.622 -
0.024 2400 1.616 0.5737
0.0241 2410 1.6261 -
0.0242 2420 1.5979 -
0.0243 2430 1.5988 -
0.0244 2440 1.5813 -
0.0245 2450 1.6147 -
0.0246 2460 1.6204 -
0.0247 2470 1.5962 -
0.0248 2480 1.6243 -
0.0249 2490 1.6151 -
0.025 2500 1.6171 0.5817
0.0251 2510 1.5873 -
0.0252 2520 1.5772 -
0.0253 2530 1.6069 -
0.0254 2540 1.6086 -
0.0255 2550 1.6028 -
0.0256 2560 1.6031 -
0.0257 2570 1.5907 -
0.0258 2580 1.6073 -
0.0259 2590 1.6105 -
0.026 2600 1.608 0.5771
0.0261 2610 1.6048 -
0.0262 2620 1.6119 -
0.0263 2630 1.5895 -
0.0264 2640 1.6335 -
0.0265 2650 1.5815 -
0.0266 2660 1.6071 -
0.0267 2670 1.6094 -
0.0268 2680 1.6241 -
0.0269 2690 1.6222 -
0.027 2700 1.597 0.5758
0.0271 2710 1.6102 -
0.0272 2720 1.582 -
0.0273 2730 1.5816 -
0.0274 2740 1.5896 -
0.0275 2750 1.5892 -
0.0276 2760 1.5979 -
0.0277 2770 1.6292 -
0.0278 2780 1.5712 -
0.0279 2790 1.5879 -
0.028 2800 1.6054 0.5767
0.0281 2810 1.5794 -
0.0282 2820 1.5901 -
0.0283 2830 1.5991 -
0.0284 2840 1.5806 -
0.0285 2850 1.6019 -
0.0286 2860 1.6132 -
0.0287 2870 1.5989 -
0.0288 2880 1.5878 -
0.0289 2890 1.5863 -
0.029 2900 1.6141 0.5810
0.0291 2910 1.5756 -
0.0292 2920 1.5885 -
0.0293 2930 1.5756 -
0.0294 2940 1.5793 -
0.0295 2950 1.5685 -
0.0296 2960 1.5996 -
0.0297 2970 1.5893 -
0.0298 2980 1.5791 -
0.0299 2990 1.5842 -
0.03 3000 1.5892 0.5810
0.0301 3010 1.5705 -
0.0302 3020 1.5865 -
0.0303 3030 1.5849 -
0.0304 3040 1.5873 -
0.0305 3050 1.5921 -
0.0306 3060 1.5991 -
0.0307 3070 1.5886 -
0.0308 3080 1.5752 -
0.0309 3090 1.5846 -
0.031 3100 1.5873 0.5766
0.0311 3110 1.5704 -
0.0312 3120 1.5963 -
0.0313 3130 1.5829 -
0.0314 3140 1.6 -
0.0315 3150 1.5741 -
0.0316 3160 1.5749 -
0.0317 3170 1.5736 -
0.0318 3180 1.5531 -
0.0319 3190 1.5881 -
0.032 3200 1.5873 0.5802
0.0321 3210 1.5842 -
0.0322 3220 1.5745 -
0.0323 3230 1.5668 -
0.0324 3240 1.5633 -
0.0325 3250 1.5696 -
0.0326 3260 1.5627 -
0.0327 3270 1.5761 -
0.0328 3280 1.5736 -
0.0329 3290 1.5668 -
0.033 3300 1.5575 0.5774
0.0331 3310 1.5599 -
0.0332 3320 1.5706 -
0.0333 3330 1.5845 -
0.0334 3340 1.5715 -
0.0335 3350 1.5809 -
0.0336 3360 1.5526 -
0.0337 3370 1.5568 -
0.0338 3380 1.5902 -
0.0339 3390 1.5789 -
0.034 3400 1.5848 0.5745
0.0341 3410 1.5772 -
0.0342 3420 1.5785 -
0.0343 3430 1.562 -
0.0344 3440 1.571 -
0.0345 3450 1.5546 -
0.0346 3460 1.5571 -
0.0347 3470 1.5519 -
0.0348 3480 1.5436 -
0.0349 3490 1.5607 -
0.035 3500 1.5546 0.5875
0.0351 3510 1.5716 -
0.0352 3520 1.5693 -
0.0353 3530 1.5597 -
0.0354 3540 1.599 -
0.0355 3550 1.5628 -
0.0356 3560 1.5615 -
0.0357 3570 1.5648 -
0.0358 3580 1.5563 -
0.0359 3590 1.5568 -
0.036 3600 1.5624 0.5863
0.0361 3610 1.5521 -
0.0362 3620 1.5528 -
0.0363 3630 1.5726 -
0.0364 3640 1.5666 -
0.0365 3650 1.5435 -
0.0366 3660 1.5654 -
0.0367 3670 1.5702 -
0.0368 3680 1.5627 -
0.0369 3690 1.5513 -
0.037 3700 1.5489 0.5826
0.0371 3710 1.5779 -
0.0372 3720 1.5654 -
0.0373 3730 1.5667 -
0.0374 3740 1.5577 -
0.0375 3750 1.5569 -
0.0376 3760 1.5466 -
0.0377 3770 1.5607 -
0.0378 3780 1.5619 -
0.0379 3790 1.5558 -
0.038 3800 1.5461 0.5805
0.0381 3810 1.5294 -
0.0382 3820 1.5604 -
0.0383 3830 1.559 -
0.0384 3840 1.5574 -
0.0385 3850 1.5621 -
0.0386 3860 1.5592 -
0.0387 3870 1.5425 -
0.0388 3880 1.5746 -
0.0389 3890 1.5508 -
0.039 3900 1.5563 0.5808
0.0391 3910 1.5515 -
0.0392 3920 1.5714 -
0.0393 3930 1.5479 -
0.0394 3940 1.556 -
0.0395 3950 1.5357 -
0.0396 3960 1.5375 -
0.0397 3970 1.5596 -
0.0398 3980 1.5426 -
0.0399 3990 1.553 -
0.04 4000 1.5766 0.5803
0.0401 4010 1.571 -
0.0402 4020 1.5331 -
0.0403 4030 1.5539 -
0.0404 4040 1.5608 -
0.0405 4050 1.5583 -
0.0406 4060 1.5277 -
0.0407 4070 1.5358 -
0.0408 4080 1.5524 -
0.0409 4090 1.559 -
0.041 4100 1.5683 0.5860
0.0411 4110 1.5497 -
0.0412 4120 1.5762 -
0.0413 4130 1.5585 -
0.0414 4140 1.5505 -
0.0415 4150 1.5411 -
0.0416 4160 1.5459 -
0.0417 4170 1.5625 -
0.0418 4180 1.5425 -
0.0419 4190 1.5554 -
0.042 4200 1.5193 0.5806
0.0421 4210 1.5098 -
0.0422 4220 1.5474 -
0.0423 4230 1.5352 -
0.0424 4240 1.5536 -
0.0425 4250 1.5549 -
0.0426 4260 1.5207 -
0.0427 4270 1.5641 -
0.0428 4280 1.5398 -
0.0429 4290 1.5419 -
0.043 4300 1.5311 0.5869
0.0431 4310 1.5437 -
0.0432 4320 1.5334 -
0.0433 4330 1.5271 -
0.0434 4340 1.5199 -
0.0435 4350 1.5526 -
0.0436 4360 1.5478 -
0.0437 4370 1.518 -
0.0438 4380 1.5441 -
0.0439 4390 1.5341 -
0.044 4400 1.542 0.5851
0.0441 4410 1.5381 -
0.0442 4420 1.5142 -
0.0443 4430 1.5514 -
0.0444 4440 1.5396 -
0.0445 4450 1.526 -
0.0446 4460 1.5374 -
0.0447 4470 1.5231 -
0.0448 4480 1.53 -
0.0449 4490 1.5281 -
0.045 4500 1.526 0.5844
0.0451 4510 1.5333 -
0.0452 4520 1.5664 -
0.0453 4530 1.5342 -
0.0454 4540 1.5438 -
0.0455 4550 1.5302 -
0.0456 4560 1.5299 -
0.0457 4570 1.5402 -
0.0458 4580 1.5455 -
0.0459 4590 1.5323 -
0.046 4600 1.5086 0.5748
0.0461 4610 1.5361 -
0.0462 4620 1.5181 -
0.0463 4630 1.5567 -
0.0464 4640 1.5329 -
0.0465 4650 1.5218 -
0.0466 4660 1.5386 -
0.0467 4670 1.5299 -
0.0468 4680 1.5175 -
0.0469 4690 1.5482 -
0.047 4700 1.5186 0.5768
0.0471 4710 1.5537 -
0.0472 4720 1.5023 -
0.0473 4730 1.546 -
0.0474 4740 1.5342 -
0.0475 4750 1.5237 -
0.0476 4760 1.5347 -
0.0477 4770 1.5358 -
0.0478 4780 1.5421 -
0.0479 4790 1.5263 -
0.048 4800 1.5227 0.5840
0.0481 4810 1.5238 -
0.0482 4820 1.5106 -
0.0483 4830 1.5257 -
0.0484 4840 1.5272 -
0.0485 4850 1.5218 -
0.0486 4860 1.5156 -
0.0487 4870 1.5325 -
0.0488 4880 1.5442 -
0.0489 4890 1.5442 -
0.049 4900 1.533 0.5726
0.0491 4910 1.5252 -
0.0492 4920 1.5163 -
0.0493 4930 1.5229 -
0.0494 4940 1.5241 -
0.0495 4950 1.5177 -
0.0496 4960 1.5359 -
0.0497 4970 1.5362 -
0.0498 4980 1.5192 -
0.0499 4990 1.518 -
0.05 5000 1.5283 0.5862
0.0501 5010 1.5571 -
0.0502 5020 1.5389 -
0.0503 5030 1.5133 -
0.0504 5040 1.5158 -
0.0505 5050 1.5297 -
0.0506 5060 1.5509 -
0.0507 5070 1.501 -
0.0508 5080 1.5143 -
0.0509 5090 1.5308 -
0.051 5100 1.5395 0.5756
0.0511 5110 1.5036 -
0.0512 5120 1.5277 -
0.0513 5130 1.5061 -
0.0514 5140 1.5062 -
0.0515 5150 1.504 -
0.0516 5160 1.5281 -
0.0517 5170 1.5255 -
0.0518 5180 1.5 -
0.0519 5190 1.5259 -
0.052 5200 1.527 0.5816
0.0521 5210 1.5186 -
0.0522 5220 1.5154 -
0.0523 5230 1.5195 -
0.0524 5240 1.5075 -
0.0525 5250 1.5263 -
0.0526 5260 1.5234 -
0.0527 5270 1.5135 -
0.0528 5280 1.5358 -
0.0529 5290 1.5168 -
0.053 5300 1.5179 0.5807
0.0531 5310 1.5233 -
0.0532 5320 1.5481 -
0.0533 5330 1.5268 -
0.0534 5340 1.5271 -
0.0535 5350 1.4983 -
0.0536 5360 1.5225 -
0.0537 5370 1.53 -
0.0538 5380 1.5065 -
0.0539 5390 1.5263 -
0.054 5400 1.5085 0.5796
0.0541 5410 1.5305 -
0.0542 5420 1.5397 -
0.0543 5430 1.5234 -
0.0544 5440 1.552 -
0.0545 5450 1.5103 -
0.0546 5460 1.5328 -
0.0547 5470 1.5057 -
0.0548 5480 1.5222 -
0.0549 5490 1.5196 -
0.055 5500 1.5166 0.5809
0.0551 5510 1.5084 -
0.0552 5520 1.511 -
0.0553 5530 1.5288 -
0.0554 5540 1.514 -
0.0555 5550 1.5085 -
0.0556 5560 1.5317 -
0.0557 5570 1.5045 -
0.0558 5580 1.5029 -
0.0559 5590 1.5074 -
0.056 5600 1.5273 0.5897
0.0561 5610 1.5231 -
0.0562 5620 1.5076 -
0.0563 5630 1.5214 -
0.0564 5640 1.494 -
0.0565 5650 1.4888 -
0.0566 5660 1.5198 -
0.0567 5670 1.5032 -
0.0568 5680 1.5044 -
0.0569 5690 1.5353 -
0.057 5700 1.5278 0.5819
0.0571 5710 1.5271 -
0.0572 5720 1.5428 -
0.0573 5730 1.5173 -
0.0574 5740 1.5014 -
0.0575 5750 1.5198 -
0.0576 5760 1.5086 -
0.0577 5770 1.543 -
0.0578 5780 1.503 -
0.0579 5790 1.5198 -
0.058 5800 1.4969 0.5786
0.0581 5810 1.5054 -
0.0582 5820 1.5446 -
0.0583 5830 1.5164 -
0.0584 5840 1.5278 -
0.0585 5850 1.4902 -
0.0586 5860 1.5271 -
0.0587 5870 1.5026 -
0.0588 5880 1.5284 -
0.0589 5890 1.5023 -
0.059 5900 1.5138 0.5839
0.0591 5910 1.5136 -
0.0592 5920 1.5308 -
0.0593 5930 1.4936 -
0.0594 5940 1.5274 -
0.0595 5950 1.4917 -
0.0596 5960 1.5362 -
0.0597 5970 1.5135 -
0.0598 5980 1.5217 -
0.0599 5990 1.5234 -
0.06 6000 1.5168 0.5768
0.0601 6010 1.5203 -
0.0602 6020 1.5263 -
0.0603 6030 1.5236 -
0.0604 6040 1.5128 -
0.0605 6050 1.521 -
0.0606 6060 1.515 -
0.0607 6070 1.5111 -
0.0608 6080 1.523 -
0.0609 6090 1.4804 -
0.061 6100 1.5065 0.5790

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.2.0
  • Transformers: 4.57.3
  • PyTorch: 2.9.0+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.4.2
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}