SentenceTransformer

This is a sentence-transformers model trained on the generator dataset. It maps sentences and paragraphs to a 4096-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and similar tasks.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 32768 tokens
  • Output Dimensionality: 4096 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • generator

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 32768, 'do_lower_case': False, 'architecture': 'Qwen3Model'})
  (1): Pooling({'word_embedding_dimension': 2048, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
)
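
The Pooling module above enables both mean pooling and last-token pooling over 2048-dimensional token vectors, which is consistent with the 4096-dimensional output if the two pooled vectors are concatenated. The following is a minimal sketch of that idea in plain Python, not the library's implementation. The function names are hypothetical, and the concatenation order (mean first, then last token) is an assumption.

```python
# Sketch only: illustrates mean + last-token pooling, not the library's code.

def mean_pool(token_vectors, attention_mask):
    """Average the vectors of non-padding tokens."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def last_token_pool(token_vectors, attention_mask):
    """Return the vector of the last non-padding token."""
    last_index = max(i for i, keep in enumerate(attention_mask) if keep)
    return list(token_vectors[last_index])


def pool(token_vectors, attention_mask):
    """Concatenate the two pooled vectors (order is an assumption)."""
    return mean_pool(token_vectors, attention_mask) + last_token_pool(
        token_vectors, attention_mask
    )


# Toy input: 3 real tokens plus 1 padding token, hidden size 2 (the model uses 2048).
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [0.0, 0.0]]
mask = [1, 1, 1, 0]
sentence_embedding = pool(tokens, mask)
print(sentence_embedding)       # [3.0, 4.0, 5.0, 6.0]
print(len(sentence_embedding))  # 4, i.e. twice the hidden size
```

With a hidden size of 2048, the same concatenation would give 2 × 2048 = 4096 dimensions.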

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("reasonwang/embedding-qwen3-1.7b-embedding_ac1_unicode_shuf")
# Run inference
sentences = [
    'The weather is lovely today.',
    "It's so sunny outside!",
    'He drove to the stadium.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 4096)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.8993, 0.7656],
#         [0.8993, 1.0000, 0.6804],
#         [0.7656, 0.6804, 1.0000]])
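
Since the similarity function listed for this model is cosine similarity, the matrix printed above can be understood as pairwise cosine scores between embeddings. The sketch below recomputes such a matrix in plain Python on toy 2-dimensional vectors. The helper names are hypothetical and the vectors are made up for illustration; they are not model outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """Pairwise cosine similarity between all vectors."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy vectors standing in for sentence embeddings.
toy_embeddings = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
matrix = similarity_matrix(toy_embeddings)
for row in matrix:
    print([round(value, 4) for value in row])
```

The result is symmetric with ones on the diagonal, which matches the shape of the printed tensor above.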

Evaluation

Metrics

Information Retrieval

Metric                Value
cosine_accuracy@1     0.62
cosine_accuracy@3     0.79
cosine_accuracy@5     0.88
cosine_accuracy@10    0.93
cosine_precision@1    0.62
cosine_precision@3    0.4733
cosine_precision@5    0.406
cosine_precision@10   0.292
cosine_recall@1       0.1206
cosine_recall@3       0.2301
cosine_recall@5       0.3083
cosine_recall@10      0.3879
cosine_ndcg@10        0.4788
cosine_ndcg@100       0.551
cosine_mrr@10         0.7219
cosine_mrr@100        0.725
cosine_map@100        0.3511
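
To make the metric names above concrete, here is a minimal sketch of how accuracy@k, precision@k, recall@k, MRR@k and NDCG@k are commonly computed for a single query with binary relevance. It is an illustration, not the evaluator used for this model. The function name and the document IDs are hypothetical, and MAP is omitted for brevity. The reported scores would be averages of such per-query values over all queries.

```python
import math


def retrieval_metrics(ranked_ids, relevant_ids, k):
    """Per-query metrics at cutoff k, assuming binary relevance and at
    least one relevant document."""
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]
    first_hit_rank = next(
        (rank for rank, hit in enumerate(hits, start=1) if hit), None
    )
    # DCG discounts each hit by log2(rank + 1); IDCG is the best achievable DCG.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(k, len(relevant_ids))
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return {
        "accuracy": 1.0 if any(hits) else 0.0,
        "precision": sum(hits) / k,
        "recall": sum(hits) / len(relevant_ids),
        "mrr": 1.0 / first_hit_rank if first_hit_rank else 0.0,
        "ndcg": dcg / idcg,
    }


# Hypothetical ranking for one query with three relevant documents.
metrics = retrieval_metrics(["d3", "d1", "d7", "d2", "d9"], {"d1", "d2", "d4"}, k=3)
for name, value in metrics.items():
    print(name, round(value, 4))
```

Under these definitions accuracy@1 and precision@1 coincide, as they do in the table (both 0.62). A precision@10 of 0.292 corresponds to an average of about 2.9 relevant documents among the top 10 results per query.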

Training Details

Training Dataset

generator

  • Dataset: generator
  • Columns: sentence1 and sentence2
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "mini_batch_size": 4,
        "gather_across_devices": false
    }
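
    The parameters above describe an in-batch-negatives ranking loss: for each sentence1, the paired sentence2 is the positive and every other sentence2 in the batch serves as a negative, with cosine similarities multiplied by scale (20.0) before a softmax cross-entropy. The sketch below shows that computation in plain Python on toy vectors. It is an illustration under these assumptions, not the library's code, and the function names are hypothetical. The cached variant is understood to produce the same loss while embedding the batch in chunks of mini_batch_size to save memory; that chunking is not reproduced here.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, candidate) for candidate in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two pairs: aligned pairs give a loss near zero,
# swapped pairs give a loss near the scale.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = in_batch_ranking_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = in_batch_ranking_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned_loss, swapped_loss)
```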
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 256
  • learning_rate: 2e-05
  • max_steps: 100000
  • log_level: info
  • bf16: True
  • dataloader_num_workers: 1
  • accelerator_config: {'split_batches': False, 'dispatch_batches': False, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3.0
  • max_steps: 100000
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: info
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 1
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': False, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}
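
The list above combines learning_rate: 2e-05, lr_scheduler_type: linear, warmup_steps: 0 and max_steps: 100000. In the Hugging Face Trainer a positive max_steps takes precedence over num_train_epochs, so the schedule presumably runs over 100000 steps. The sketch below shows what a linear decay with those values looks like. It is a plain-Python illustration with a hypothetical function name, not the scheduler's actual code.

```python
def linear_lr(step, base_lr=2e-05, max_steps=100000, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, max_steps - step)
    return base_lr * remaining / max(1, max_steps - warmup_steps)


# Learning rate at the start, at step 10000, halfway, and at the end.
schedule = {step: linear_lr(step) for step in (0, 10000, 50000, 100000)}
for step, lr in schedule.items():
    print(step, lr)
```

Under these assumptions the learning rate starts at the full 2e-05, since there is no warmup, and reaches zero at step 100000.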

Training Logs

Epoch Step Training Loss validation_retrieval_cosine_ndcg@100
1e-05 1 5.2053 -
0.0001 10 3.8468 -
0.0002 20 2.8407 -
0.0003 30 2.5497 -
0.0004 40 2.4161 -
0.0005 50 2.3331 -
0.0006 60 2.3184 -
0.0007 70 2.2386 -
0.0008 80 2.2135 -
0.0009 90 2.1836 -
0.001 100 2.1797 0.4941
0.0011 110 2.1426 -
0.0012 120 2.1384 -
0.0013 130 2.1306 -
0.0014 140 2.1141 -
0.0015 150 2.1241 -
0.0016 160 2.0885 -
0.0017 170 2.0815 -
0.0018 180 2.099 -
0.0019 190 2.0519 -
0.002 200 2.0482 0.5110
0.0021 210 2.0405 -
0.0022 220 2.0122 -
0.0023 230 2.025 -
0.0024 240 2.011 -
0.0025 250 2.0368 -
0.0026 260 1.9879 -
0.0027 270 2.0191 -
0.0028 280 1.9754 -
0.0029 290 2.0114 -
0.003 300 1.9769 0.5158
0.0031 310 1.9594 -
0.0032 320 1.9792 -
0.0033 330 1.9615 -
0.0034 340 1.955 -
0.0035 350 1.9725 -
0.0036 360 1.9524 -
0.0037 370 1.9435 -
0.0038 380 1.9398 -
0.0039 390 1.9251 -
0.004 400 1.8997 0.5228
0.0041 410 1.9296 -
0.0042 420 1.9267 -
0.0043 430 1.9301 -
0.0044 440 1.9417 -
0.0045 450 1.9179 -
0.0046 460 1.9091 -
0.0047 470 1.9147 -
0.0048 480 1.9187 -
0.0049 490 1.9345 -
0.005 500 1.883 0.5258
0.0051 510 1.9023 -
0.0052 520 1.8664 -
0.0053 530 1.9142 -
0.0054 540 1.8902 -
0.0055 550 1.877 -
0.0056 560 1.8753 -
0.0057 570 1.8804 -
0.0058 580 1.8923 -
0.0059 590 1.8501 -
0.006 600 1.8499 0.5265
0.0061 610 1.8719 -
0.0062 620 1.8692 -
0.0063 630 1.8813 -
0.0064 640 1.8758 -
0.0065 650 1.8721 -
0.0066 660 1.8466 -
0.0067 670 1.8873 -
0.0068 680 1.8429 -
0.0069 690 1.8719 -
0.007 700 1.8478 0.5271
0.0071 710 1.8616 -
0.0072 720 1.8486 -
0.0073 730 1.8766 -
0.0074 740 1.8614 -
0.0075 750 1.8722 -
0.0076 760 1.829 -
0.0077 770 1.8341 -
0.0078 780 1.8304 -
0.0079 790 1.8494 -
0.008 800 1.838 0.5328
0.0081 810 1.851 -
0.0082 820 1.8359 -
0.0083 830 1.8528 -
0.0084 840 1.8295 -
0.0085 850 1.8337 -
0.0086 860 1.8072 -
0.0087 870 1.8068 -
0.0088 880 1.8102 -
0.0089 890 1.8199 -
0.009 900 1.8308 0.5324
0.0091 910 1.8155 -
0.0092 920 1.7918 -
0.0093 930 1.7969 -
0.0094 940 1.8054 -
0.0095 950 1.8032 -
0.0096 960 1.7972 -
0.0097 970 1.8063 -
0.0098 980 1.8227 -
0.0099 990 1.8111 -
0.01 1000 1.7909 0.5312
0.0101 1010 1.8105 -
0.0102 1020 1.7976 -
0.0103 1030 1.8117 -
0.0104 1040 1.7823 -
0.0105 1050 1.7967 -
0.0106 1060 1.7941 -
0.0107 1070 1.8054 -
0.0108 1080 1.7938 -
0.0109 1090 1.784 -
0.011 1100 1.8 0.5342
0.0111 1110 1.7895 -
0.0112 1120 1.805 -
0.0113 1130 1.8063 -
0.0114 1140 1.8011 -
0.0115 1150 1.7609 -
0.0116 1160 1.7658 -
0.0117 1170 1.7393 -
0.0118 1180 1.7716 -
0.0119 1190 1.7546 -
0.012 1200 1.776 0.5351
0.0121 1210 1.7735 -
0.0122 1220 1.7815 -
0.0123 1230 1.7805 -
0.0124 1240 1.7677 -
0.0125 1250 1.7697 -
0.0126 1260 1.7599 -
0.0127 1270 1.752 -
0.0128 1280 1.7591 -
0.0129 1290 1.7519 -
0.013 1300 1.7782 0.5344
0.0131 1310 1.741 -
0.0132 1320 1.7475 -
0.0133 1330 1.7903 -
0.0134 1340 1.7488 -
0.0135 1350 1.7426 -
0.0136 1360 1.7589 -
0.0137 1370 1.7327 -
0.0138 1380 1.7453 -
0.0139 1390 1.7453 -
0.014 1400 1.7347 0.5345
0.0141 1410 1.7467 -
0.0142 1420 1.743 -
0.0143 1430 1.7587 -
0.0144 1440 1.73 -
0.0145 1450 1.7453 -
0.0146 1460 1.7387 -
0.0147 1470 1.7431 -
0.0148 1480 1.7542 -
0.0149 1490 1.764 -
0.015 1500 1.7297 0.5376
0.0151 1510 1.7404 -
0.0152 1520 1.7497 -
0.0153 1530 1.756 -
0.0154 1540 1.7426 -
0.0155 1550 1.7246 -
0.0156 1560 1.7217 -
0.0157 1570 1.7319 -
0.0158 1580 1.7642 -
0.0159 1590 1.7345 -
0.016 1600 1.7312 0.5360
0.0161 1610 1.7394 -
0.0162 1620 1.7025 -
0.0163 1630 1.7431 -
0.0164 1640 1.7216 -
0.0165 1650 1.7378 -
0.0166 1660 1.7217 -
0.0167 1670 1.7424 -
0.0168 1680 1.7349 -
0.0169 1690 1.7191 -
0.017 1700 1.7288 0.5380
0.0171 1710 1.7173 -
0.0172 1720 1.7215 -
0.0173 1730 1.7108 -
0.0174 1740 1.6972 -
0.0175 1750 1.7177 -
0.0176 1760 1.7259 -
0.0177 1770 1.7103 -
0.0178 1780 1.7286 -
0.0179 1790 1.7127 -
0.018 1800 1.7102 0.5373
0.0181 1810 1.6951 -
0.0182 1820 1.7221 -
0.0183 1830 1.7135 -
0.0184 1840 1.6976 -
0.0185 1850 1.719 -
0.0186 1860 1.7182 -
0.0187 1870 1.7 -
0.0188 1880 1.7307 -
0.0189 1890 1.7024 -
0.019 1900 1.7092 0.5395
0.0191 1910 1.6905 -
0.0192 1920 1.7068 -
0.0193 1930 1.7053 -
0.0194 1940 1.7104 -
0.0195 1950 1.7068 -
0.0196 1960 1.7136 -
0.0197 1970 1.672 -
0.0198 1980 1.7229 -
0.0199 1990 1.6988 -
0.02 2000 1.7009 0.5376
0.0201 2010 1.6763 -
0.0202 2020 1.7004 -
0.0203 2030 1.6916 -
0.0204 2040 1.7155 -
0.0205 2050 1.7055 -
0.0206 2060 1.6892 -
0.0207 2070 1.7236 -
0.0208 2080 1.6971 -
0.0209 2090 1.7105 -
0.021 2100 1.6989 0.5398
0.0211 2110 1.7052 -
0.0212 2120 1.6936 -
0.0213 2130 1.7055 -
0.0214 2140 1.7038 -
0.0215 2150 1.6902 -
0.0216 2160 1.6808 -
0.0217 2170 1.6859 -
0.0218 2180 1.7133 -
0.0219 2190 1.6992 -
0.022 2200 1.6929 0.5389
0.0221 2210 1.6918 -
0.0222 2220 1.6732 -
0.0223 2230 1.7048 -
0.0224 2240 1.6682 -
0.0225 2250 1.6733 -
0.0226 2260 1.6655 -
0.0227 2270 1.6712 -
0.0228 2280 1.6707 -
0.0229 2290 1.668 -
0.023 2300 1.7152 0.5413
0.0231 2310 1.6816 -
0.0232 2320 1.6642 -
0.0233 2330 1.6809 -
0.0234 2340 1.6925 -
0.0235 2350 1.679 -
0.0236 2360 1.6857 -
0.0237 2370 1.6958 -
0.0238 2380 1.6943 -
0.0239 2390 1.6487 -
0.024 2400 1.6818 0.5352
0.0241 2410 1.6752 -
0.0242 2420 1.6707 -
0.0243 2430 1.6891 -
0.0244 2440 1.6691 -
0.0245 2450 1.6786 -
0.0246 2460 1.6831 -
0.0247 2470 1.6788 -
0.0248 2480 1.69 -
0.0249 2490 1.6811 -
0.025 2500 1.6584 0.5425
0.0251 2510 1.6523 -
0.0252 2520 1.6947 -
0.0253 2530 1.6542 -
0.0254 2540 1.6411 -
0.0255 2550 1.6729 -
0.0256 2560 1.6851 -
0.0257 2570 1.6642 -
0.0258 2580 1.6614 -
0.0259 2590 1.6684 -
0.026 2600 1.6676 0.5402
0.0261 2610 1.6678 -
0.0262 2620 1.6741 -
0.0263 2630 1.6769 -
0.0264 2640 1.6468 -
0.0265 2650 1.6688 -
0.0266 2660 1.6543 -
0.0267 2670 1.7078 -
0.0268 2680 1.6353 -
0.0269 2690 1.665 -
0.027 2700 1.6629 0.5441
0.0271 2710 1.6567 -
0.0272 2720 1.6395 -
0.0273 2730 1.6629 -
0.0274 2740 1.6852 -
0.0275 2750 1.6635 -
0.0276 2760 1.6765 -
0.0277 2770 1.6598 -
0.0278 2780 1.6547 -
0.0279 2790 1.6741 -
0.028 2800 1.6664 0.5423
0.0281 2810 1.6446 -
0.0282 2820 1.6662 -
0.0283 2830 1.6566 -
0.0284 2840 1.6397 -
0.0285 2850 1.6728 -
0.0286 2860 1.6592 -
0.0287 2870 1.657 -
0.0288 2880 1.6498 -
0.0289 2890 1.6375 -
0.029 2900 1.6437 0.5376
0.0291 2910 1.6309 -
0.0292 2920 1.6665 -
0.0293 2930 1.6666 -
0.0294 2940 1.6476 -
0.0295 2950 1.6604 -
0.0296 2960 1.6595 -
0.0297 2970 1.6476 -
0.0298 2980 1.6716 -
0.0299 2990 1.629 -
0.03 3000 1.6383 0.5425
0.0301 3010 1.6545 -
0.0302 3020 1.6477 -
0.0303 3030 1.6137 -
0.0304 3040 1.6581 -
0.0305 3050 1.6696 -
0.0306 3060 1.6316 -
0.0307 3070 1.6439 -
0.0308 3080 1.6537 -
0.0309 3090 1.6466 -
0.031 3100 1.6388 0.5416
0.0311 3110 1.6467 -
0.0312 3120 1.6235 -
0.0313 3130 1.6201 -
0.0314 3140 1.6342 -
0.0315 3150 1.6487 -
0.0316 3160 1.6531 -
0.0317 3170 1.6552 -
0.0318 3180 1.65 -
0.0319 3190 1.6457 -
0.032 3200 1.6448 0.5427
0.0321 3210 1.641 -
0.0322 3220 1.6612 -
0.0323 3230 1.6482 -
0.0324 3240 1.6362 -
0.0325 3250 1.6299 -
0.0326 3260 1.642 -
0.0327 3270 1.6443 -
0.0328 3280 1.6362 -
0.0329 3290 1.6234 -
0.033 3300 1.6294 0.5409
0.0331 3310 1.6278 -
0.0332 3320 1.6201 -
0.0333 3330 1.6059 -
0.0334 3340 1.6609 -
0.0335 3350 1.6263 -
0.0336 3360 1.6113 -
0.0337 3370 1.6403 -
0.0338 3380 1.6425 -
0.0339 3390 1.6274 -
0.034 3400 1.6364 0.5377
0.0341 3410 1.6252 -
0.0342 3420 1.6144 -
0.0343 3430 1.6334 -
0.0344 3440 1.6172 -
0.0345 3450 1.6293 -
0.0346 3460 1.6308 -
0.0347 3470 1.6309 -
0.0348 3480 1.6178 -
0.0349 3490 1.6426 -
0.035 3500 1.6178 0.5466
0.0351 3510 1.6516 -
0.0352 3520 1.6249 -
0.0353 3530 1.6355 -
0.0354 3540 1.6451 -
0.0355 3550 1.611 -
0.0356 3560 1.6265 -
0.0357 3570 1.633 -
0.0358 3580 1.6222 -
0.0359 3590 1.616 -
0.036 3600 1.6226 0.5412
0.0361 3610 1.6251 -
0.0362 3620 1.6148 -
0.0363 3630 1.6348 -
0.0364 3640 1.6421 -
0.0365 3650 1.5937 -
0.0366 3660 1.6469 -
0.0367 3670 1.6259 -
0.0368 3680 1.6216 -
0.0369 3690 1.6206 -
0.037 3700 1.6159 0.5422
0.0371 3710 1.6185 -
0.0372 3720 1.6414 -
0.0373 3730 1.6179 -
0.0374 3740 1.6002 -
0.0375 3750 1.6226 -
0.0376 3760 1.6305 -
0.0377 3770 1.6198 -
0.0378 3780 1.6184 -
0.0379 3790 1.6445 -
0.038 3800 1.6291 0.5396
0.0381 3810 1.6029 -
0.0382 3820 1.6039 -
0.0383 3830 1.6351 -
0.0384 3840 1.6238 -
0.0385 3850 1.6086 -
0.0386 3860 1.6435 -
0.0387 3870 1.5971 -
0.0388 3880 1.6114 -
0.0389 3890 1.6077 -
0.039 3900 1.584 0.5387
0.0391 3910 1.6086 -
0.0392 3920 1.6165 -
0.0393 3930 1.611 -
0.0394 3940 1.6102 -
0.0395 3950 1.5823 -
0.0396 3960 1.6146 -
0.0397 3970 1.5876 -
0.0398 3980 1.605 -
0.0399 3990 1.629 -
0.04 4000 1.6349 0.5434
0.0401 4010 1.6273 -
0.0402 4020 1.6252 -
0.0403 4030 1.6426 -
0.0404 4040 1.6003 -
0.0405 4050 1.6135 -
0.0406 4060 1.5962 -
0.0407 4070 1.6133 -
0.0408 4080 1.5951 -
0.0409 4090 1.6001 -
0.041 4100 1.5997 0.5432
0.0411 4110 1.6117 -
0.0412 4120 1.6178 -
0.0413 4130 1.6151 -
0.0414 4140 1.6253 -
0.0415 4150 1.613 -
0.0416 4160 1.6197 -
0.0417 4170 1.5976 -
0.0418 4180 1.6213 -
0.0419 4190 1.5855 -
0.042 4200 1.6117 0.5443
0.0421 4210 1.6309 -
0.0422 4220 1.6047 -
0.0423 4230 1.6154 -
0.0424 4240 1.6001 -
0.0425 4250 1.619 -
0.0426 4260 1.6137 -
0.0427 4270 1.5896 -
0.0428 4280 1.622 -
0.0429 4290 1.6047 -
0.043 4300 1.6177 0.5408
0.0431 4310 1.6021 -
0.0432 4320 1.6219 -
0.0433 4330 1.5891 -
0.0434 4340 1.6268 -
0.0435 4350 1.6135 -
0.0436 4360 1.5985 -
0.0437 4370 1.6017 -
0.0438 4380 1.6061 -
0.0439 4390 1.619 -
0.044 4400 1.5897 0.5449
0.0441 4410 1.6126 -
0.0442 4420 1.6143 -
0.0443 4430 1.6213 -
0.0444 4440 1.6026 -
0.0445 4450 1.5946 -
0.0446 4460 1.6191 -
0.0447 4470 1.5861 -
0.0448 4480 1.6075 -
0.0449 4490 1.5854 -
0.045 4500 1.5773 0.5399
0.0451 4510 1.6214 -
0.0452 4520 1.6005 -
0.0453 4530 1.579 -
0.0454 4540 1.6039 -
0.0455 4550 1.5778 -
0.0456 4560 1.6141 -
0.0457 4570 1.6088 -
0.0458 4580 1.5981 -
0.0459 4590 1.5863 -
0.046 4600 1.5873 0.5438
0.0461 4610 1.595 -
0.0462 4620 1.5976 -
0.0463 4630 1.5772 -
0.0464 4640 1.5724 -
0.0465 4650 1.5905 -
0.0466 4660 1.6146 -
0.0467 4670 1.6097 -
0.0468 4680 1.6069 -
0.0469 4690 1.5915 -
0.047 4700 1.5934 0.5478
0.0471 4710 1.6011 -
0.0472 4720 1.6017 -
0.0473 4730 1.6104 -
0.0474 4740 1.5903 -
0.0475 4750 1.6011 -
0.0476 4760 1.5801 -
0.0477 4770 1.5784 -
0.0478 4780 1.6015 -
0.0479 4790 1.591 -
0.048 4800 1.6159 0.5480
0.0481 4810 1.5994 -
0.0482 4820 1.5687 -
0.0483 4830 1.5986 -
0.0484 4840 1.6035 -
0.0485 4850 1.5855 -
0.0486 4860 1.5647 -
0.0487 4870 1.602 -
0.0488 4880 1.5805 -
0.0489 4890 1.5876 -
0.049 4900 1.6101 0.5447
0.0491 4910 1.5817 -
0.0492 4920 1.5869 -
0.0493 4930 1.5883 -
0.0494 4940 1.5646 -
0.0495 4950 1.571 -
0.0496 4960 1.5814 -
0.0497 4970 1.5863 -
0.0498 4980 1.5599 -
0.0499 4990 1.6061 -
0.05 5000 1.5672 0.5471
0.0501 5010 1.5785 -
0.0502 5020 1.5813 -
0.0503 5030 1.5932 -
0.0504 5040 1.5807 -
0.0505 5050 1.5704 -
0.0506 5060 1.5986 -
0.0507 5070 1.5812 -
0.0508 5080 1.5922 -
0.0509 5090 1.5789 -
0.051 5100 1.5829 0.5490
0.0511 5110 1.5634 -
0.0512 5120 1.5989 -
0.0513 5130 1.5928 -
0.0514 5140 1.5686 -
0.0515 5150 1.5548 -
0.0516 5160 1.599 -
0.0517 5170 1.5904 -
0.0518 5180 1.5898 -
0.0519 5190 1.578 -
0.052 5200 1.5721 0.5484
0.0521 5210 1.5729 -
0.0522 5220 1.5905 -
0.0523 5230 1.5629 -
0.0524 5240 1.5498 -
0.0525 5250 1.577 -
0.0526 5260 1.5995 -
0.0527 5270 1.5591 -
0.0528 5280 1.5772 -
0.0529 5290 1.5927 -
0.053 5300 1.5812 0.5467
0.0531 5310 1.5703 -
0.0532 5320 1.5885 -
0.0533 5330 1.5879 -
0.0534 5340 1.5884 -
0.0535 5350 1.5692 -
0.0536 5360 1.6 -
0.0537 5370 1.5907 -
0.0538 5380 1.5728 -
0.0539 5390 1.5876 -
0.054 5400 1.5661 0.5485
0.0541 5410 1.5525 -
0.0542 5420 1.5855 -
0.0543 5430 1.5752 -
0.0544 5440 1.5785 -
0.0545 5450 1.5825 -
0.0546 5460 1.5828 -
0.0547 5470 1.5743 -
0.0548 5480 1.5873 -
0.0549 5490 1.5909 -
0.055 5500 1.5811 0.5538
0.0551 5510 1.5871 -
0.0552 5520 1.5569 -
0.0553 5530 1.5748 -
0.0554 5540 1.6094 -
0.0555 5550 1.5623 -
0.0556 5560 1.5732 -
0.0557 5570 1.5724 -
0.0558 5580 1.5731 -
0.0559 5590 1.5873 -
0.056 5600 1.5913 0.5512
0.0561 5610 1.5916 -
0.0562 5620 1.5684 -
0.0563 5630 1.5526 -
0.0564 5640 1.5699 -
0.0565 5650 1.5732 -
0.0566 5660 1.5797 -
0.0567 5670 1.5723 -
0.0568 5680 1.5719 -
0.0569 5690 1.572 -
0.057 5700 1.5484 0.5491
0.0571 5710 1.5927 -
0.0572 5720 1.5614 -
0.0573 5730 1.5723 -
0.0574 5740 1.5675 -
0.0575 5750 1.5577 -
0.0576 5760 1.5801 -
0.0577 5770 1.5685 -
0.0578 5780 1.5802 -
0.0579 5790 1.5707 -
0.058 5800 1.588 0.5508
0.0581 5810 1.5642 -
0.0582 5820 1.5649 -
0.0583 5830 1.5724 -
0.0584 5840 1.5701 -
0.0585 5850 1.5731 -
0.0586 5860 1.5625 -
0.0587 5870 1.5822 -
0.0588 5880 1.5668 -
0.0589 5890 1.5742 -
0.059 5900 1.5413 0.5498
0.0591 5910 1.5535 -
0.0592 5920 1.5653 -
0.0593 5930 1.5604 -
0.0594 5940 1.5905 -
0.0595 5950 1.5689 -
0.0596 5960 1.5578 -
0.0597 5970 1.5616 -
0.0598 5980 1.5945 -
0.0599 5990 1.5721 -
0.06 6000 1.5701 0.5506
0.0601 6010 1.5486 -
0.0602 6020 1.5581 -
0.0603 6030 1.5801 -
0.0604 6040 1.5684 -
0.0605 6050 1.583 -
0.0606 6060 1.5885 -
0.0607 6070 1.5647 -
0.0608 6080 1.5826 -
0.0609 6090 1.5858 -
0.061 6100 1.5618 0.5501
0.0611 6110 1.5538 -
0.0612 6120 1.5721 -
0.0613 6130 1.5919 -
0.0614 6140 1.5693 -
0.0615 6150 1.5578 -
0.0616 6160 1.552 -
0.0617 6170 1.5584 -
0.0618 6180 1.5455 -
0.0619 6190 1.5421 -
0.062 6200 1.5551 0.5552
0.0621 6210 1.5706 -
0.0622 6220 1.5732 -
0.0623 6230 1.5619 -
0.0624 6240 1.5732 -
0.0625 6250 1.5554 -
0.0626 6260 1.5734 -
0.0627 6270 1.5747 -
0.0628 6280 1.571 -
0.0629 6290 1.5616 -
0.063 6300 1.5575 0.5546
0.0631 6310 1.5586 -
0.0632 6320 1.5857 -
0.0633 6330 1.5593 -
0.0634 6340 1.556 -
0.0635 6350 1.5558 -
0.0636 6360 1.5799 -
0.0637 6370 1.5772 -
0.0638 6380 1.5477 -
0.0639 6390 1.5823 -
0.064 6400 1.5462 0.5497
0.0641 6410 1.57 -
0.0642 6420 1.5551 -
0.0643 6430 1.5687 -
0.0644 6440 1.5703 -
0.0645 6450 1.579 -
0.0646 6460 1.5426 -
0.0647 6470 1.5816 -
0.0648 6480 1.5712 -
0.0649 6490 1.5746 -
0.065 6500 1.5648 0.5503
0.0651 6510 1.5443 -
0.0652 6520 1.5499 -
0.0653 6530 1.5648 -
0.0654 6540 1.5585 -
0.0655 6550 1.5694 -
0.0656 6560 1.5579 -
0.0657 6570 1.5762 -
0.0658 6580 1.5494 -
0.0659 6590 1.5752 -
0.066 6600 1.554 0.5493
0.0661 6610 1.5469 -
0.0662 6620 1.5532 -
0.0663 6630 1.535 -
0.0664 6640 1.5146 -
0.0665 6650 1.5528 -
0.0666 6660 1.5626 -
0.0667 6670 1.5476 -
0.0668 6680 1.5594 -
0.0669 6690 1.5441 -
0.067 6700 1.562 0.5504
0.0671 6710 1.5468 -
0.0672 6720 1.5462 -
0.0673 6730 1.5711 -
0.0674 6740 1.5659 -
0.0675 6750 1.5413 -
0.0676 6760 1.5497 -
0.0677 6770 1.5477 -
0.0678 6780 1.548 -
0.0679 6790 1.5639 -
0.068 6800 1.5676 0.5500
0.0681 6810 1.5647 -
0.0682 6820 1.5663 -
0.0683 6830 1.5462 -
0.0684 6840 1.5675 -
0.0685 6850 1.5563 -
0.0686 6860 1.5754 -
0.0687 6870 1.5513 -
0.0688 6880 1.5629 -
0.0689 6890 1.5558 -
0.069 6900 1.5536 0.5476
0.0691 6910 1.5618 -
0.0692 6920 1.5559 -
0.0693 6930 1.5661 -
0.0694 6940 1.5554 -
0.0695 6950 1.5501 -
0.0696 6960 1.581 -
0.0697 6970 1.5303 -
0.0698 6980 1.5432 -
0.0699 6990 1.5486 -
0.07 7000 1.5517 0.5495
0.0701 7010 1.5765 -
0.0702 7020 1.5536 -
0.0703 7030 1.5425 -
0.0704 7040 1.5593 -
0.0705 7050 1.548 -
0.0706 7060 1.5501 -
0.0707 7070 1.5331 -
0.0708 7080 1.567 -
0.0709 7090 1.5906 -
0.071 7100 1.5586 0.5474
0.0711 7110 1.5468 -
0.0712 7120 1.5503 -
0.0713 7130 1.5514 -
0.0714 7140 1.5584 -
0.0715 7150 1.5471 -
0.0716 7160 1.5462 -
0.0717 7170 1.5498 -
0.0718 7180 1.558 -
0.0719 7190 1.5495 -
0.072 7200 1.5585 0.5486
0.0721 7210 1.536 -
0.0722 7220 1.5493 -
0.0723 7230 1.5442 -
0.0724 7240 1.5409 -
0.0725 7250 1.536 -
0.0726 7260 1.5674 -
0.0727 7270 1.5595 -
0.0728 7280 1.5184 -
0.0729 7290 1.5694 -
0.073 7300 1.5593 0.5513
0.0731 7310 1.5766 -
0.0732 7320 1.5562 -
0.0733 7330 1.5872 -
0.0734 7340 1.5705 -
0.0735 7350 1.5622 -
0.0736 7360 1.5577 -
0.0737 7370 1.5422 -
0.0738 7380 1.5643 -
0.0739 7390 1.5713 -
0.074 7400 1.533 0.5500
0.0741 7410 1.5204 -
0.0742 7420 1.5445 -
0.0743 7430 1.5503 -
0.0744 7440 1.5243 -
0.0745 7450 1.5465 -
0.0746 7460 1.559 -
0.0747 7470 1.5493 -
0.0748 7480 1.5589 -
0.0749 7490 1.5637 -
0.075 7500 1.5547 0.5484
0.0751 7510 1.5296 -
0.0752 7520 1.5466 -
0.0753 7530 1.5259 -
0.0754 7540 1.5496 -
0.0755 7550 1.5273 -
0.0756 7560 1.5606 -
0.0757 7570 1.5408 -
0.0758 7580 1.5529 -
0.0759 7590 1.5481 -
0.076 7600 1.5571 0.5541
0.0761 7610 1.5322 -
0.0762 7620 1.567 -
0.0763 7630 1.5387 -
0.0764 7640 1.554 -
0.0765 7650 1.5284 -
0.0766 7660 1.5255 -
0.0767 7670 1.5421 -
0.0768 7680 1.5538 -
0.0769 7690 1.5309 -
0.077 7700 1.549 0.5519
0.0771 7710 1.5373 -
0.0772 7720 1.5315 -
0.0773 7730 1.5345 -
0.0774 7740 1.56 -
0.0775 7750 1.5678 -
0.0776 7760 1.5653 -
0.0777 7770 1.521 -
0.0778 7780 1.5377 -
0.0779 7790 1.5518 -
0.078 7800 1.5454 0.5490
0.0781 7810 1.5227 -
0.0782 7820 1.5604 -
0.0783 7830 1.5283 -
0.0784 7840 1.5448 -
0.0785 7850 1.5116 -
0.0786 7860 1.5223 -
0.0787 7870 1.5497 -
0.0788 7880 1.5417 -
0.0789 7890 1.5358 -
0.079 7900 1.5504 0.5544
0.0791 7910 1.516 -
0.0792 7920 1.5422 -
0.0793 7930 1.537 -
0.0794 7940 1.5479 -
0.0795 7950 1.5579 -
0.0796 7960 1.5369 -
0.0797 7970 1.5321 -
0.0798 7980 1.522 -
0.0799 7990 1.5389 -
0.08 8000 1.5274 0.5526
0.0801 8010 1.5537 -
0.0802 8020 1.5397 -
0.0803 8030 1.5671 -
0.0804 8040 1.5349 -
0.0805 8050 1.5407 -
0.0806 8060 1.5563 -
0.0807 8070 1.5581 -
0.0808 8080 1.5423 -
0.0809 8090 1.5148 -
0.081 8100 1.5557 0.5544
0.0811 8110 1.5404 -
0.0812 8120 1.5368 -
0.0813 8130 1.5161 -
0.0814 8140 1.5595 -
0.0815 8150 1.5493 -
0.0816 8160 1.5312 -
0.0817 8170 1.5326 -
0.0818 8180 1.5424 -
0.0819 8190 1.5325 -
0.082 8200 1.5458 0.5561
0.0821 8210 1.5397 -
0.0822 8220 1.5438 -
0.0823 8230 1.5237 -
0.0824 8240 1.5396 -
0.0825 8250 1.5365 -
0.0826 8260 1.5609 -
0.0827 8270 1.533 -
0.0828 8280 1.5367 -
0.0829 8290 1.5316 -
0.083 8300 1.5386 0.5528
0.0831 8310 1.5259 -
0.0832 8320 1.5205 -
0.0833 8330 1.5561 -
0.0834 8340 1.533 -
0.0835 8350 1.5684 -
0.0836 8360 1.5475 -
0.0837 8370 1.5195 -
0.0838 8380 1.5388 -
0.0839 8390 1.564 -
0.084 8400 1.5572 0.5490
0.0841 8410 1.5567 -
0.0842 8420 1.5383 -
0.0843 8430 1.5645 -
0.0844 8440 1.5499 -
0.0845 8450 1.5267 -
0.0846 8460 1.5538 -
0.0847 8470 1.5635 -
0.0848 8480 1.5365 -
0.0849 8490 1.5374 -
0.085 8500 1.5453 0.5507
0.0851 8510 1.5155 -
0.0852 8520 1.5505 -
0.0853 8530 1.5381 -
0.0854 8540 1.5337 -
0.0855 8550 1.5475 -
0.0856 8560 1.5421 -
0.0857 8570 1.5318 -
0.0858 8580 1.5404 -
0.0859 8590 1.5227 -
0.086 8600 1.5323 0.5498
0.0861 8610 1.5245 -
0.0862 8620 1.5435 -
0.0863 8630 1.5516 -
0.0864 8640 1.5394 -
0.0865 8650 1.5141 -
0.0866 8660 1.5289 -
0.0867 8670 1.5191 -
0.0868 8680 1.5349 -
0.0869 8690 1.5507 -
0.087 8700 1.5337 0.5532
0.0871 8710 1.5471 -
0.0872 8720 1.5267 -
0.0873 8730 1.5308 -
0.0874 8740 1.5576 -
0.0875 8750 1.5424 -
0.0876 8760 1.5518 -
0.0877 8770 1.5316 -
0.0878 8780 1.5369 -
0.0879 8790 1.5412 -
0.088 8800 1.5407 0.5487
0.0881 8810 1.5257 -
0.0882 8820 1.5318 -
0.0883 8830 1.5214 -
0.0884 8840 1.5321 -
0.0885 8850 1.5282 -
0.0886 8860 1.5262 -
0.0887 8870 1.5545 -
0.0888 8880 1.5407 -
0.0889 8890 1.564 -
0.089 8900 1.5287 0.5518
0.0891 8910 1.5353 -
0.0892 8920 1.5155 -
0.0893 8930 1.5416 -
0.0894 8940 1.546 -
0.0895 8950 1.5349 -
0.0896 8960 1.5203 -
0.0897 8970 1.5282 -
0.0898 8980 1.5111 -
0.0899 8990 1.5121 -
0.09 9000 1.5209 0.5519
0.0901 9010 1.5333 -
0.0902 9020 1.5305 -
0.0903 9030 1.5397 -
0.0904 9040 1.523 -
0.0905 9050 1.5446 -
0.0906 9060 1.5378 -
0.0907 9070 1.533 -
0.0908 9080 1.5271 -
0.0909 9090 1.5201 -
0.091 9100 1.526 0.5524
0.0911 9110 1.5307 -
0.0912 9120 1.572 -
0.0913 9130 1.5016 -
0.0914 9140 1.526 -
0.0915 9150 1.5326 -
0.0916 9160 1.5189 -
0.0917 9170 1.5298 -
0.0918 9180 1.5211 -
0.0919 9190 1.5237 -
0.092 9200 1.5121 0.5497
0.0921 9210 1.4938 -
0.0922 9220 1.5094 -
0.0923 9230 1.5265 -
0.0924 9240 1.5278 -
0.0925 9250 1.5255 -
0.0926 9260 1.4975 -
0.0927 9270 1.5117 -
0.0928 9280 1.5378 -
0.0929 9290 1.5248 -
0.093 9300 1.5222 0.5531
0.0931 9310 1.5056 -
0.0932 9320 1.5361 -
0.0933 9330 1.5426 -
0.0934 9340 1.5023 -
0.0935 9350 1.5056 -
0.0936 9360 1.5058 -
0.0937 9370 1.5299 -
0.0938 9380 1.5178 -
0.0939 9390 1.532 -
0.094 9400 1.5248 0.5577
0.0941 9410 1.5374 -
0.0942 9420 1.518 -
0.0943 9430 1.5299 -
0.0944 9440 1.5432 -
0.0945 9450 1.5164 -
0.0946 9460 1.5252 -
0.0947 9470 1.5327 -
0.0948 9480 1.5519 -
0.0949 9490 1.5077 -
0.095 9500 1.5322 0.5550
0.0951 9510 1.5358 -
0.0952 9520 1.5362 -
0.0953 9530 1.5262 -
0.0954 9540 1.5286 -
0.0955 9550 1.5205 -
0.0956 9560 1.5372 -
0.0957 9570 1.5248 -
0.0958 9580 1.5457 -
0.0959 9590 1.5087 -
0.096 9600 1.531 0.5523
0.0961 9610 1.5057 -
0.0962 9620 1.5295 -
0.0963 9630 1.52 -
0.0964 9640 1.5131 -
0.0965 9650 1.5272 -
0.0966 9660 1.5161 -
0.0967 9670 1.5178 -
0.0968 9680 1.5452 -
0.0969 9690 1.5216 -
0.097 9700 1.5471 0.5541
0.0971 9710 1.5233 -
0.0972 9720 1.5388 -
0.0973 9730 1.5173 -
0.0974 9740 1.5223 -
0.0975 9750 1.5193 -
0.0976 9760 1.5143 -
0.0977 9770 1.5245 -
0.0978 9780 1.5368 -
0.0979 9790 1.5237 -
0.098 9800 1.5077 0.5545
0.0981 9810 1.5276 -
0.0982 9820 1.5117 -
0.0983 9830 1.5174 -
0.0984 9840 1.5359 -
0.0985 9850 1.5145 -
0.0986 9860 1.5355 -
0.0987 9870 1.4959 -
0.0988 9880 1.5106 -
0.0989 9890 1.5567 -
0.099 9900 1.5102 0.5508
0.0991 9910 1.5255 -
0.0992 9920 1.4878 -
0.0993 9930 1.522 -
0.0994 9940 1.5296 -
0.0995 9950 1.4935 -
0.0996 9960 1.5081 -
0.0997 9970 1.5163 -
0.0998 9980 1.5267 -
0.0999 9990 1.5361 -
0.1 10000 1.5067 0.5510
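
The rows above follow a fixed four-column layout (epoch, step, training loss, validation NDCG@100), with "-" where no evaluation was run. A short sketch of how such rows could be parsed and queried is shown below. The function name is hypothetical, and the sample contains only four rows copied from the log, so the result applies to that sample rather than to the full table.

```python
def parse_log_rows(text):
    """Parse whitespace-separated rows: epoch, step, loss, ndcg@100 or '-'."""
    rows = []
    for line in text.strip().splitlines():
        epoch, step, loss, metric = line.split()
        rows.append(
            {
                "epoch": float(epoch),
                "step": int(step),
                "loss": float(loss),
                "ndcg@100": None if metric == "-" else float(metric),
            }
        )
    return rows


# Four rows copied from the training log above.
sample = """
1e-05 1 5.2053 -
0.001 100 2.1797 0.4941
0.094 9400 1.5248 0.5577
0.1 10000 1.5067 0.5510
"""

rows = parse_log_rows(sample)
evaluated = [row for row in rows if row["ndcg@100"] is not None]
best = max(evaluated, key=lambda row: row["ndcg@100"])
print(best["step"], best["ndcg@100"])  # 9400 0.5577 (best within this sample)
```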

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.2.0
  • Transformers: 4.57.3
  • PyTorch: 2.9.0+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.4.2
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Model Size

  • Format: Safetensors
  • Model size: 2B params
  • Tensor type: F32