SentenceTransformer

This is a sentence-transformers model trained on the generator dataset. It maps sentences & paragraphs to a 4096-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 32768 tokens
  • Output Dimensionality: 4096 dimensions
  • Similarity Function: Cosine Similarity (see the definition after this list)
  • Model Size: ~2B parameters (F32, safetensors)
  • Training Dataset:
    • generator
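
The cosine similarity named above is the standard definition: for two embeddings u and v,

    cos_sim(u, v) = (u · v) / (‖u‖ ‖v‖)

a score in [-1, 1], where 1 means the vectors point in the same direction. The model.similarity call in the usage example below computes this score matrix.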

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 32768, 'do_lower_case': False, 'architecture': 'Qwen3Model'})
  (1): Pooling({'word_embedding_dimension': 2048, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
)
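
Note that the word embedding dimension is 2048 while the model outputs 4096 dimensions: both mean pooling and last-token pooling are enabled, and the Pooling module concatenates the resulting 2048-dimensional vectors. A minimal sketch of that arithmetic, with random tensors standing in for the transformer output (illustrative only; the library performs this internally):

import torch

# Hypothetical transformer output: (batch, seq_len, hidden) with hidden = 2048.
token_embeddings = torch.randn(1, 10, 2048)
attention_mask = torch.ones(1, 10, dtype=torch.long)

# Mean pooling over non-padding tokens.
mask = attention_mask.unsqueeze(-1).float()
mean_pooled = (token_embeddings * mask).sum(1) / mask.sum(1)   # (1, 2048)

# Last-token pooling: embedding of the final non-padding token.
last_idx = attention_mask.sum(1) - 1
last_token = token_embeddings[torch.arange(1), last_idx]       # (1, 2048)

# Enabled pooling modes are concatenated: 2048 + 2048 = 4096.
embedding = torch.cat([mean_pooled, last_token], dim=-1)
print(embedding.shape)  # torch.Size([1, 4096])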

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("reasonwang/embedding-qwen3-1.7b-embedding_ctxt_unicode_shuf")
# Run inference
sentences = [
    'The weather is lovely today.',
    "It's so sunny outside!",
    'He drove to the stadium.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 4096)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.8640, 0.8773],
#         [0.8640, 1.0000, 0.7820],
#         [0.8773, 0.7820, 1.0000]])
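
Since the card lists semantic search among the intended uses, here is a small sketch continuing from the snippet above (model is already loaded); the corpus and query texts are made up:

# Rank a tiny corpus against a query; all texts here are hypothetical.
corpus = [
    "The stadium hosts football matches every weekend.",
    "A sunny day is perfect for a picnic.",
    "Transformers map text to dense vectors.",
]
query = "Where can I watch a football game?"

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode([query])

# model.similarity returns a (num_queries, num_docs) tensor of cosine scores.
scores = model.similarity(query_embedding, corpus_embeddings)[0]
best_idx = int(scores.argmax())
print(corpus[best_idx], float(scores[best_idx]))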

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.57
cosine_accuracy@3 0.82
cosine_accuracy@5 0.89
cosine_accuracy@10 0.94
cosine_precision@1 0.57
cosine_precision@3 0.4767
cosine_precision@5 0.446
cosine_precision@10 0.327
cosine_recall@1 0.1145
cosine_recall@3 0.2377
cosine_recall@5 0.3327
cosine_recall@10 0.4088
cosine_ndcg@10 0.508
cosine_ndcg@100 0.5643
cosine_mrr@10 0.7045
cosine_mrr@100 0.7066
cosine_map@100 0.3787
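
The metric names above follow the library's InformationRetrievalEvaluator (accuracy@k, precision@k, recall@k, NDCG, MRR, and MAP under cosine similarity). A minimal sketch of producing such numbers on your own data; the queries, corpus, and relevance judgments below are hypothetical:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("reasonwang/embedding-qwen3-1.7b-embedding_ctxt_unicode_shuf")

# Hypothetical evaluation data; ids are arbitrary strings.
queries = {"q1": "Is the weather nice today?"}
corpus = {
    "d1": "It's so sunny outside!",
    "d2": "He drove to the stadium.",
}
relevant_docs = {"q1": {"d1"}}  # query id -> set of relevant corpus ids

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="validation_retrieval")
results = evaluator(model)  # dict of metrics, e.g. results["validation_retrieval_cosine_ndcg@10"]
print(results)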

Training Details

Training Dataset

generator

  • Dataset: generator
  • Columns: sentence1 and sentence2
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "mini_batch_size": 4,
        "gather_across_devices": false
    }
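
A minimal sketch of instantiating this loss with the parameters above (cos_sim is the library's default similarity function, so it is not passed explicitly):

from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import CachedMultipleNegativesRankingLoss

model = SentenceTransformer("reasonwang/embedding-qwen3-1.7b-embedding_ctxt_unicode_shuf")

# Treats each (sentence1, sentence2) pair as a positive and all other in-batch
# sentence2 entries as negatives; gradient caching lets a large effective batch
# be processed in small chunks.
loss = CachedMultipleNegativesRankingLoss(
    model,
    scale=20.0,         # multiplier applied to cosine similarities before softmax
    mini_batch_size=4,  # chunk size for the cached forward/backward passes
)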
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 256
  • learning_rate: 2e-05
  • max_steps: 100000
  • log_level: info
  • bf16: True
  • dataloader_num_workers: 1
  • accelerator_config: {'split_batches': False, 'dispatch_batches': False, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
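
A sketch of expressing these non-default values with the library's training-arguments class (the output directory is a hypothetical path):

from sentence_transformers.training_args import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # hypothetical path
    eval_strategy="steps",
    per_device_train_batch_size=256,
    learning_rate=2e-5,
    max_steps=100_000,
    log_level="info",
    bf16=True,
    dataloader_num_workers=1,
)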

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3.0
  • max_steps: 100000
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: info
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 1
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': False, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss validation_retrieval_cosine_ndcg@100
1e-05 1 5.3032 -
0.0001 10 3.7433 -
0.0002 20 2.7632 -
0.0003 30 2.4609 -
0.0004 40 2.303 -
0.0005 50 2.2197 -
0.0006 60 2.1855 -
0.0007 70 2.1517 -
0.0008 80 2.1111 -
0.0009 90 2.0876 -
0.001 100 2.0613 0.4992
0.0011 110 2.0348 -
0.0012 120 2.0209 -
0.0013 130 2.0293 -
0.0014 140 2.0281 -
0.0015 150 2.0019 -
0.0016 160 1.9692 -
0.0017 170 1.9849 -
0.0018 180 1.9494 -
0.0019 190 1.9458 -
0.002 200 1.9264 0.5104
0.0021 210 1.9644 -
0.0022 220 1.9315 -
0.0023 230 1.9324 -
0.0024 240 1.8965 -
0.0025 250 1.9287 -
0.0026 260 1.9246 -
0.0027 270 1.9087 -
0.0028 280 1.9052 -
0.0029 290 1.8969 -
0.003 300 1.8971 0.5164
0.0031 310 1.8896 -
0.0032 320 1.8897 -
0.0033 330 1.8646 -
0.0034 340 1.8825 -
0.0035 350 1.8599 -
0.0036 360 1.8583 -
0.0037 370 1.8649 -
0.0038 380 1.8647 -
0.0039 390 1.8759 -
0.004 400 1.8197 0.5253
0.0041 410 1.846 -
0.0042 420 1.841 -
0.0043 430 1.8319 -
0.0044 440 1.835 -
0.0045 450 1.807 -
0.0046 460 1.8406 -
0.0047 470 1.8344 -
0.0048 480 1.8003 -
0.0049 490 1.8155 -
0.005 500 1.8242 0.5266
0.0051 510 1.8014 -
0.0052 520 1.8026 -
0.0053 530 1.8042 -
0.0054 540 1.8372 -
0.0055 550 1.8054 -
0.0056 560 1.8093 -
0.0057 570 1.7814 -
0.0058 580 1.7875 -
0.0059 590 1.7844 -
0.006 600 1.7789 0.5330
0.0061 610 1.7947 -
0.0062 620 1.8084 -
0.0063 630 1.7806 -
0.0064 640 1.772 -
0.0065 650 1.7948 -
0.0066 660 1.7648 -
0.0067 670 1.7801 -
0.0068 680 1.7801 -
0.0069 690 1.7696 -
0.007 700 1.7848 0.5457
0.0071 710 1.774 -
0.0072 720 1.7619 -
0.0073 730 1.7422 -
0.0074 740 1.7594 -
0.0075 750 1.7225 -
0.0076 760 1.7601 -
0.0077 770 1.7432 -
0.0078 780 1.7627 -
0.0079 790 1.749 -
0.008 800 1.7361 0.5455
0.0081 810 1.7275 -
0.0082 820 1.7391 -
0.0083 830 1.7403 -
0.0084 840 1.736 -
0.0085 850 1.7297 -
0.0086 860 1.7376 -
0.0087 870 1.7242 -
0.0088 880 1.7231 -
0.0089 890 1.729 -
0.009 900 1.7515 0.5473
0.0091 910 1.7269 -
0.0092 920 1.6863 -
0.0093 930 1.7164 -
0.0094 940 1.7347 -
0.0095 950 1.7439 -
0.0096 960 1.7102 -
0.0097 970 1.7129 -
0.0098 980 1.7185 -
0.0099 990 1.7131 -
0.01 1000 1.7309 0.5527
0.0101 1010 1.7055 -
0.0102 1020 1.7106 -
0.0103 1030 1.7089 -
0.0104 1040 1.7058 -
0.0105 1050 1.6984 -
0.0106 1060 1.69 -
0.0107 1070 1.7189 -
0.0108 1080 1.7147 -
0.0109 1090 1.7237 -
0.011 1100 1.6781 0.5567
0.0111 1110 1.6788 -
0.0112 1120 1.6928 -
0.0113 1130 1.7146 -
0.0114 1140 1.6983 -
0.0115 1150 1.7014 -
0.0116 1160 1.6888 -
0.0117 1170 1.6668 -
0.0118 1180 1.6785 -
0.0119 1190 1.6853 -
0.012 1200 1.7077 0.5459
0.0121 1210 1.676 -
0.0122 1220 1.6749 -
0.0123 1230 1.6815 -
0.0124 1240 1.6823 -
0.0125 1250 1.6751 -
0.0126 1260 1.6942 -
0.0127 1270 1.6597 -
0.0128 1280 1.6685 -
0.0129 1290 1.6873 -
0.013 1300 1.6779 0.5526
0.0131 1310 1.6676 -
0.0132 1320 1.6721 -
0.0133 1330 1.6713 -
0.0134 1340 1.6618 -
0.0135 1350 1.6387 -
0.0136 1360 1.6951 -
0.0137 1370 1.6669 -
0.0138 1380 1.6477 -
0.0139 1390 1.6856 -
0.014 1400 1.6687 0.5528
0.0141 1410 1.6578 -
0.0142 1420 1.6588 -
0.0143 1430 1.6552 -
0.0144 1440 1.6643 -
0.0145 1450 1.6543 -
0.0146 1460 1.6851 -
0.0147 1470 1.6547 -
0.0148 1480 1.6744 -
0.0149 1490 1.6694 -
0.015 1500 1.6795 0.5537
0.0151 1510 1.656 -
0.0152 1520 1.6425 -
0.0153 1530 1.6545 -
0.0154 1540 1.614 -
0.0155 1550 1.6554 -
0.0156 1560 1.6542 -
0.0157 1570 1.6676 -
0.0158 1580 1.6615 -
0.0159 1590 1.6374 -
0.016 1600 1.6451 0.5613
0.0161 1610 1.6258 -
0.0162 1620 1.6504 -
0.0163 1630 1.6254 -
0.0164 1640 1.6257 -
0.0165 1650 1.6392 -
0.0166 1660 1.6365 -
0.0167 1670 1.6407 -
0.0168 1680 1.6313 -
0.0169 1690 1.6458 -
0.017 1700 1.6405 0.5526
0.0171 1710 1.6431 -
0.0172 1720 1.6262 -
0.0173 1730 1.6434 -
0.0174 1740 1.6404 -
0.0175 1750 1.6418 -
0.0176 1760 1.6176 -
0.0177 1770 1.6282 -
0.0178 1780 1.6228 -
0.0179 1790 1.656 -
0.018 1800 1.6392 0.5499
0.0181 1810 1.6307 -
0.0182 1820 1.6147 -
0.0183 1830 1.6225 -
0.0184 1840 1.6387 -
0.0185 1850 1.6173 -
0.0186 1860 1.6535 -
0.0187 1870 1.6339 -
0.0188 1880 1.6215 -
0.0189 1890 1.6048 -
0.019 1900 1.6278 0.5527
0.0191 1910 1.6359 -
0.0192 1920 1.6142 -
0.0193 1930 1.6354 -
0.0194 1940 1.6341 -
0.0195 1950 1.6352 -
0.0196 1960 1.6223 -
0.0197 1970 1.6208 -
0.0198 1980 1.6151 -
0.0199 1990 1.5815 -
0.02 2000 1.6159 0.5573
0.0201 2010 1.6229 -
0.0202 2020 1.6156 -
0.0203 2030 1.6051 -
0.0204 2040 1.6411 -
0.0205 2050 1.6339 -
0.0206 2060 1.6241 -
0.0207 2070 1.6014 -
0.0208 2080 1.5942 -
0.0209 2090 1.611 -
0.021 2100 1.6065 0.5563
0.0211 2110 1.6208 -
0.0212 2120 1.6239 -
0.0213 2130 1.6066 -
0.0214 2140 1.5936 -
0.0215 2150 1.6008 -
0.0216 2160 1.6239 -
0.0217 2170 1.6116 -
0.0218 2180 1.6128 -
0.0219 2190 1.5819 -
0.022 2200 1.5915 0.5547
0.0221 2210 1.6164 -
0.0222 2220 1.6141 -
0.0223 2230 1.6296 -
0.0224 2240 1.6026 -
0.0225 2250 1.5958 -
0.0226 2260 1.6009 -
0.0227 2270 1.6336 -
0.0228 2280 1.6231 -
0.0229 2290 1.6163 -
0.023 2300 1.5811 0.5626
0.0231 2310 1.5951 -
0.0232 2320 1.5989 -
0.0233 2330 1.6056 -
0.0234 2340 1.5808 -
0.0235 2350 1.5741 -
0.0236 2360 1.5928 -
0.0237 2370 1.5921 -
0.0238 2380 1.6032 -
0.0239 2390 1.5779 -
0.024 2400 1.609 0.5637
0.0241 2410 1.5771 -
0.0242 2420 1.5902 -
0.0243 2430 1.5971 -
0.0244 2440 1.5969 -
0.0245 2450 1.6058 -
0.0246 2460 1.6161 -
0.0247 2470 1.5709 -
0.0248 2480 1.5814 -
0.0249 2490 1.5866 -
0.025 2500 1.5692 0.5642
0.0251 2510 1.584 -
0.0252 2520 1.5899 -
0.0253 2530 1.614 -
0.0254 2540 1.5966 -
0.0255 2550 1.5838 -
0.0256 2560 1.5969 -
0.0257 2570 1.5789 -
0.0258 2580 1.5938 -
0.0259 2590 1.5836 -
0.026 2600 1.579 0.5640
0.0261 2610 1.5978 -
0.0262 2620 1.5783 -
0.0263 2630 1.5842 -
0.0264 2640 1.6001 -
0.0265 2650 1.5798 -
0.0266 2660 1.6003 -
0.0267 2670 1.5868 -
0.0268 2680 1.603 -
0.0269 2690 1.5789 -
0.027 2700 1.5724 0.5674
0.0271 2710 1.5718 -
0.0272 2720 1.5771 -
0.0273 2730 1.5954 -
0.0274 2740 1.5687 -
0.0275 2750 1.5897 -
0.0276 2760 1.5533 -
0.0277 2770 1.5799 -
0.0278 2780 1.5741 -
0.0279 2790 1.6096 -
0.028 2800 1.5863 0.5568
0.0281 2810 1.6004 -
0.0282 2820 1.569 -
0.0283 2830 1.5757 -
0.0284 2840 1.5597 -
0.0285 2850 1.5935 -
0.0286 2860 1.5673 -
0.0287 2870 1.5725 -
0.0288 2880 1.5899 -
0.0289 2890 1.5683 -
0.029 2900 1.5519 0.5702
0.0291 2910 1.559 -
0.0292 2920 1.5692 -
0.0293 2930 1.5792 -
0.0294 2940 1.5704 -
0.0295 2950 1.5717 -
0.0296 2960 1.5535 -
0.0297 2970 1.553 -
0.0298 2980 1.5629 -
0.0299 2990 1.5636 -
0.03 3000 1.5715 0.5681
0.0301 3010 1.5538 -
0.0302 3020 1.5803 -
0.0303 3030 1.5535 -
0.0304 3040 1.5674 -
0.0305 3050 1.5465 -
0.0306 3060 1.5682 -
0.0307 3070 1.5855 -
0.0308 3080 1.559 -
0.0309 3090 1.559 -
0.031 3100 1.5773 0.5707
0.0311 3110 1.5693 -
0.0312 3120 1.5643 -
0.0313 3130 1.5586 -
0.0314 3140 1.5453 -
0.0315 3150 1.5799 -
0.0316 3160 1.5532 -
0.0317 3170 1.5459 -
0.0318 3180 1.5541 -
0.0319 3190 1.5789 -
0.032 3200 1.5331 0.5595
0.0321 3210 1.5521 -
0.0322 3220 1.5553 -
0.0323 3230 1.5675 -
0.0324 3240 1.551 -
0.0325 3250 1.5753 -
0.0326 3260 1.5625 -
0.0327 3270 1.5782 -
0.0328 3280 1.5588 -
0.0329 3290 1.5795 -
0.033 3300 1.5529 0.5654
0.0331 3310 1.5581 -
0.0332 3320 1.5828 -
0.0333 3330 1.5628 -
0.0334 3340 1.5614 -
0.0335 3350 1.5645 -
0.0336 3360 1.5405 -
0.0337 3370 1.5743 -
0.0338 3380 1.5393 -
0.0339 3390 1.5547 -
0.034 3400 1.5403 0.5616
0.0341 3410 1.5627 -
0.0342 3420 1.5638 -
0.0343 3430 1.5664 -
0.0344 3440 1.5345 -
0.0345 3450 1.5546 -
0.0346 3460 1.5581 -
0.0347 3470 1.5614 -
0.0348 3480 1.558 -
0.0349 3490 1.5451 -
0.035 3500 1.5491 0.5581
0.0351 3510 1.5357 -
0.0352 3520 1.5578 -
0.0353 3530 1.5433 -
0.0354 3540 1.5343 -
0.0355 3550 1.5558 -
0.0356 3560 1.5711 -
0.0357 3570 1.5458 -
0.0358 3580 1.5356 -
0.0359 3590 1.559 -
0.036 3600 1.5338 0.5598
0.0361 3610 1.5532 -
0.0362 3620 1.5346 -
0.0363 3630 1.5558 -
0.0364 3640 1.539 -
0.0365 3650 1.538 -
0.0366 3660 1.5638 -
0.0367 3670 1.5666 -
0.0368 3680 1.5662 -
0.0369 3690 1.5432 -
0.037 3700 1.5345 0.5680
0.0371 3710 1.5524 -
0.0372 3720 1.5617 -
0.0373 3730 1.5261 -
0.0374 3740 1.5502 -
0.0375 3750 1.5452 -
0.0376 3760 1.5566 -
0.0377 3770 1.5457 -
0.0378 3780 1.5307 -
0.0379 3790 1.5331 -
0.038 3800 1.5294 0.5578
0.0381 3810 1.5389 -
0.0382 3820 1.5379 -
0.0383 3830 1.5578 -
0.0384 3840 1.5259 -
0.0385 3850 1.5308 -
0.0386 3860 1.5461 -
0.0387 3870 1.5197 -
0.0388 3880 1.5332 -
0.0389 3890 1.5642 -
0.039 3900 1.5256 0.5625
0.0391 3910 1.5608 -
0.0392 3920 1.5567 -
0.0393 3930 1.5278 -
0.0394 3940 1.5404 -
0.0395 3950 1.5367 -
0.0396 3960 1.5186 -
0.0397 3970 1.5437 -
0.0398 3980 1.5459 -
0.0399 3990 1.5536 -
0.04 4000 1.548 0.5642
0.0401 4010 1.5407 -
0.0402 4020 1.5235 -
0.0403 4030 1.526 -
0.0404 4040 1.5184 -
0.0405 4050 1.5232 -
0.0406 4060 1.5215 -
0.0407 4070 1.5202 -
0.0408 4080 1.5325 -
0.0409 4090 1.5317 -
0.041 4100 1.5326 0.5689
0.0411 4110 1.5083 -
0.0412 4120 1.5158 -
0.0413 4130 1.5321 -
0.0414 4140 1.5383 -
0.0415 4150 1.5432 -
0.0416 4160 1.503 -
0.0417 4170 1.5374 -
0.0418 4180 1.5166 -
0.0419 4190 1.5462 -
0.042 4200 1.5175 0.5650
0.0421 4210 1.5348 -
0.0422 4220 1.5613 -
0.0423 4230 1.521 -
0.0424 4240 1.5377 -
0.0425 4250 1.5163 -
0.0426 4260 1.5354 -
0.0427 4270 1.5181 -
0.0428 4280 1.5381 -
0.0429 4290 1.5311 -
0.043 4300 1.5074 0.5688
0.0431 4310 1.5162 -
0.0432 4320 1.5051 -
0.0433 4330 1.5171 -
0.0434 4340 1.5283 -
0.0435 4350 1.5171 -
0.0436 4360 1.5377 -
0.0437 4370 1.5197 -
0.0438 4380 1.513 -
0.0439 4390 1.5418 -
0.044 4400 1.5135 0.5644
0.0441 4410 1.522 -
0.0442 4420 1.5286 -
0.0443 4430 1.5328 -
0.0444 4440 1.5354 -
0.0445 4450 1.5252 -
0.0446 4460 1.5127 -
0.0447 4470 1.5116 -
0.0448 4480 1.5237 -
0.0449 4490 1.5265 -
0.045 4500 1.5298 0.5649
0.0451 4510 1.5349 -
0.0452 4520 1.4997 -
0.0453 4530 1.4947 -
0.0454 4540 1.5186 -
0.0455 4550 1.487 -
0.0456 4560 1.5088 -
0.0457 4570 1.5422 -
0.0458 4580 1.4962 -
0.0459 4590 1.5193 -
0.046 4600 1.5306 0.5608
0.0461 4610 1.536 -
0.0462 4620 1.5334 -
0.0463 4630 1.5598 -
0.0464 4640 1.5223 -
0.0465 4650 1.5223 -
0.0466 4660 1.5277 -
0.0467 4670 1.5381 -
0.0468 4680 1.5416 -
0.0469 4690 1.5056 -
0.047 4700 1.5077 0.5655
0.0471 4710 1.5045 -
0.0472 4720 1.5135 -
0.0473 4730 1.5284 -
0.0474 4740 1.5331 -
0.0475 4750 1.5194 -
0.0476 4760 1.5286 -
0.0477 4770 1.536 -
0.0478 4780 1.4984 -
0.0479 4790 1.5086 -
0.048 4800 1.5137 0.5703
0.0481 4810 1.5421 -
0.0482 4820 1.5131 -
0.0483 4830 1.5084 -
0.0484 4840 1.5006 -
0.0485 4850 1.5141 -
0.0486 4860 1.503 -
0.0487 4870 1.511 -
0.0488 4880 1.5175 -
0.0489 4890 1.5088 -
0.049 4900 1.5019 0.5711
0.0491 4910 1.5359 -
0.0492 4920 1.5218 -
0.0493 4930 1.5043 -
0.0494 4940 1.5059 -
0.0495 4950 1.4943 -
0.0496 4960 1.5269 -
0.0497 4970 1.517 -
0.0498 4980 1.5135 -
0.0499 4990 1.5204 -
0.05 5000 1.4983 0.5700
0.0501 5010 1.5271 -
0.0502 5020 1.4929 -
0.0503 5030 1.4947 -
0.0504 5040 1.4883 -
0.0505 5050 1.523 -
0.0506 5060 1.5092 -
0.0507 5070 1.5262 -
0.0508 5080 1.4859 -
0.0509 5090 1.5059 -
0.051 5100 1.5293 0.5677
0.0511 5110 1.4962 -
0.0512 5120 1.5192 -
0.0513 5130 1.5115 -
0.0514 5140 1.5152 -
0.0515 5150 1.4948 -
0.0516 5160 1.5376 -
0.0517 5170 1.5015 -
0.0518 5180 1.5119 -
0.0519 5190 1.4926 -
0.052 5200 1.5235 0.5663
0.0521 5210 1.5158 -
0.0522 5220 1.5072 -
0.0523 5230 1.5264 -
0.0524 5240 1.5026 -
0.0525 5250 1.5042 -
0.0526 5260 1.5096 -
0.0527 5270 1.5022 -
0.0528 5280 1.5038 -
0.0529 5290 1.4903 -
0.053 5300 1.5284 0.5684
0.0531 5310 1.5009 -
0.0532 5320 1.505 -
0.0533 5330 1.5288 -
0.0534 5340 1.501 -
0.0535 5350 1.5143 -
0.0536 5360 1.5071 -
0.0537 5370 1.4976 -
0.0538 5380 1.5092 -
0.0539 5390 1.5082 -
0.054 5400 1.5056 0.5716
0.0541 5410 1.4934 -
0.0542 5420 1.5159 -
0.0543 5430 1.5059 -
0.0544 5440 1.4937 -
0.0545 5450 1.5223 -
0.0546 5460 1.4989 -
0.0547 5470 1.5149 -
0.0548 5480 1.5003 -
0.0549 5490 1.521 -
0.055 5500 1.4959 0.5779
0.0551 5510 1.5074 -
0.0552 5520 1.5071 -
0.0553 5530 1.5173 -
0.0554 5540 1.5111 -
0.0555 5550 1.5017 -
0.0556 5560 1.5296 -
0.0557 5570 1.5147 -
0.0558 5580 1.524 -
0.0559 5590 1.4936 -
0.056 5600 1.5111 0.5684
0.0561 5610 1.5147 -
0.0562 5620 1.5002 -
0.0563 5630 1.5048 -
0.0564 5640 1.5093 -
0.0565 5650 1.5093 -
0.0566 5660 1.4795 -
0.0567 5670 1.5149 -
0.0568 5680 1.4881 -
0.0569 5690 1.4986 -
0.057 5700 1.4929 0.5692
0.0571 5710 1.5186 -
0.0572 5720 1.4938 -
0.0573 5730 1.4943 -
0.0574 5740 1.4926 -
0.0575 5750 1.4672 -
0.0576 5760 1.5036 -
0.0577 5770 1.511 -
0.0578 5780 1.4892 -
0.0579 5790 1.4983 -
0.058 5800 1.4914 0.5704
0.0581 5810 1.4883 -
0.0582 5820 1.5052 -
0.0583 5830 1.5066 -
0.0584 5840 1.4904 -
0.0585 5850 1.5114 -
0.0586 5860 1.4984 -
0.0587 5870 1.4827 -
0.0588 5880 1.4676 -
0.0589 5890 1.514 -
0.059 5900 1.509 0.5688
0.0591 5910 1.5094 -
0.0592 5920 1.4902 -
0.0593 5930 1.4849 -
0.0594 5940 1.5159 -
0.0595 5950 1.5012 -
0.0596 5960 1.5068 -
0.0597 5970 1.5054 -
0.0598 5980 1.4722 -
0.0599 5990 1.4975 -
0.06 6000 1.4843 0.5623
0.0601 6010 1.4726 -
0.0602 6020 1.517 -
0.0603 6030 1.4957 -
0.0604 6040 1.508 -
0.0605 6050 1.5113 -
0.0606 6060 1.4903 -
0.0607 6070 1.4761 -
0.0608 6080 1.5226 -
0.0609 6090 1.5228 -
0.061 6100 1.4836 0.5643
0.0611 6110 1.4926 -
0.0612 6120 1.4968 -
0.0613 6130 1.4954 -
0.0614 6140 1.5209 -
0.0615 6150 1.4857 -
0.0616 6160 1.4881 -
0.0617 6170 1.504 -
0.0618 6180 1.464 -
0.0619 6190 1.5003 -
0.062 6200 1.4858 0.5643

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.2.0
  • Transformers: 4.57.3
  • PyTorch: 2.9.0+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.4.2
  • Tokenizers: 0.22.1
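
To reproduce this environment, the versions above can be pinned at install time, mirroring the earlier pip command (PyTorch 2.9.0 should be installed separately with the CUDA 12.8 build to match):

pip install sentence-transformers==5.2.0 transformers==4.57.3 accelerate==1.12.0 datasets==4.4.2 tokenizers==0.22.1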

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}