metadata
language:
  - en
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:106628
  - loss:MultipleNegativesRankingLoss
base_model: Qwen/Qwen3-Embedding-0.6B
widget:
  - source_sentence: ace-v
    sentences:
      - >-
        The floor plan was drafted at 1/4 inch scale where each quarter inch
        equals one foot.
      - Fingerprint examiners follow the ACE-V methodology for identification.
      - Most modern streaming services offer content in 1080p full HD quality.
  - source_sentence: adult learner
    sentences:
      - The adult learner brings valuable life experience to the classroom.
      - Accounts payable represents money owed to suppliers and vendors.
      - The inspection confirmed all above grade work met code requirements.
  - source_sentence: 1/4 inch scale
    sentences:
      - Precise adjustments require accurate action gauge readings.
      - The quality inspector identified adhesion failure in the sample.
      - >-
        The architect created drawings at 1/4 inch scale for the client
        presentation.
  - source_sentence: acrylic paint
    sentences:
      - Artists prefer acrylic paint for its fast drying time.
      - The company reported strong adjusted EBITDA growth this quarter.
      - The clinic specializes in adolescent health services.
  - source_sentence: adult learning
    sentences:
      - Solar developers calculate AEP, or annual energy production.
      - The course was designed using adult learning best practices.
      - >-
        The wizard cast Abi-Dalzim's horrid wilting, draining moisture from
        enemies.
datasets:
  - electroglyph/technical
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on Qwen/Qwen3-Embedding-0.6B

This is a sentence-transformers model fine-tuned from Qwen/Qwen3-Embedding-0.6B on the technical dataset (electroglyph/technical). It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Qwen/Qwen3-Embedding-0.6B
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: technical
  • Language: en

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'PeftModelForFeatureExtraction'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
  (2): Normalize()
)
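
The Pooling and Normalize modules above can be illustrated with a small NumPy sketch. It is not the library's implementation: the function names are hypothetical, the toy hidden size is 3 rather than 1024, and it assumes right padding (real tokens first). With left padding, the last position of every sequence would be taken instead.

```python
import numpy as np

def last_token_pool(token_embeddings, attention_mask):
    """Pick the embedding of the last non-padding token of each sequence.

    Assumption: right padding, so the last real token sits at index
    (number of real tokens - 1).
    """
    last_idx = attention_mask.sum(axis=1) - 1
    batch_idx = np.arange(token_embeddings.shape[0])
    return token_embeddings[batch_idx, last_idx]

def l2_normalize(vectors, eps=1e-12):
    """Scale each row to unit length, as a Normalize step would."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.maximum(norms, eps)

# Toy batch: 2 sequences, 4 token positions, hidden size 3.
token_embeddings = np.arange(24, dtype=np.float64).reshape(2, 4, 3)
attention_mask = np.array([[1, 1, 1, 0],   # 3 real tokens, 1 padding
                           [1, 1, 1, 1]])  # 4 real tokens

pooled = last_token_pool(token_embeddings, attention_mask)
sentence_embeddings = l2_normalize(pooled)
```

Because every output vector has unit length, cosine similarity between two embeddings reduces to a plain dot product.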

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("reboo13/ad")
# Run inference
sentences = [
    'adult learning',
    'The course was designed using adult learning best practices.',
    'Solar developers calculate AEP, or annual energy production.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6213, 0.1227],
#         [0.6213, 1.0000, 0.1474],
#         [0.1227, 0.1474, 1.0000]])
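
For semantic search, the same similarity scores can be used to rank candidate sentences against a query. The sketch below uses small hand-written vectors in place of real `model.encode` output so that it runs without downloading the model; `rank_by_cosine` is a hypothetical helper, not part of the library.

```python
import numpy as np

def rank_by_cosine(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar, plus scores."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                 # cosine similarity of each document to the query
    order = np.argsort(-scores)    # descending by score
    return order, scores

# Toy stand-ins for embeddings (the real ones have 1024 dimensions).
query = np.array([1.0, 0.0, 0.0])
docs = np.array([
    [0.1, 0.9, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.0, 1.0],
])

order, scores = rank_by_cosine(query, docs)
best_match = int(order[0])
```

With real embeddings, `query` would be the encoded search term and `docs` the encoded candidate sentences.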

Training Details

Training Dataset

technical

  • Dataset: technical at 05eeb90
  • Size: 106,628 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
             anchor         positive
    type     string         string
    min      2 tokens       8 tokens
    mean     3.83 tokens    12.66 tokens
    max      11 tokens      23 tokens
  • Samples:
    anchor    positive
    .308      The .308 Winchester is a popular rifle cartridge used for hunting and target shooting.
    .308      Many precision rifles are chambered in .308 for its excellent long-range accuracy.
    .308      The sniper selected a .308 caliber round for the mission.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
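
To make the loss parameters concrete, here is a minimal NumPy sketch of a multiple-negatives ranking objective with cosine similarity and `scale = 20.0`. It is an illustration under simplifying assumptions (single device, no gradient computation), and `mnrl_loss` is a hypothetical name rather than the library's function. Each anchor treats its own positive as the correct class and every other positive in the batch as a negative.

```python
import numpy as np

def mnrl_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities with in-batch negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                            # shape (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The matching positive for anchor i sits on the diagonal (column i).
    return float(-np.mean(np.diag(log_probs)))

vectors = np.eye(3)
loss_aligned = mnrl_loss(vectors, vectors)              # every anchor matches its positive
loss_shuffled = mnrl_loss(vectors, vectors[[1, 2, 0]])  # every anchor prefers a wrong positive
```

In this toy case the aligned loss is close to zero and the shuffled loss is close to the scale value, which shows how the scale sharpens the penalty for ranking a negative above the true positive.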
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 256
  • learning_rate: 3e-05
  • max_steps: 60
  • lr_scheduler_type: constant_with_warmup
  • warmup_ratio: 0.03
  • bf16: True
  • batch_sampler: no_duplicates
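
The interaction of `learning_rate`, `max_steps`, `lr_scheduler_type`, and `warmup_ratio` can be sketched as below. This is a simplified model of a constant-with-warmup schedule, not the trainer's code: the function name is hypothetical, and rounding the warmup length up to a whole step is an assumption.

```python
import math

def constant_with_warmup_lr(step, base_lr=3e-05, max_steps=60, warmup_ratio=0.03):
    """Learning rate at a given optimizer step: linear ramp, then constant."""
    # Assumption: the warmup length is the ratio times max_steps, rounded up.
    warmup_steps = math.ceil(max_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr

warmup_steps = math.ceil(60 * 0.03)
schedule = [constant_with_warmup_lr(s) for s in range(5)]
```

Under these assumptions the warmup covers only the first two of the 60 steps, so nearly the whole run uses the full learning rate of 3e-05.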

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 3e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3.0
  • max_steps: 60
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.03
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}
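
The `no_duplicates` batch sampler matters for this dataset because one anchor (for example `.308`) appears with several positives, and with in-batch negatives a repeated anchor would make a valid positive count as a negative. The sketch below is a simplified greedy version of the idea, not the library's sampler: `no_duplicate_batches` is a hypothetical helper and it skips shuffling.

```python
def no_duplicate_batches(pairs, batch_size):
    """Greedily build batches of indices in which no text occurs twice."""
    remaining = list(range(len(pairs)))
    batches = []
    while remaining:
        batch, seen, leftover = [], set(), []
        for idx in remaining:
            texts = set(pairs[idx])
            if len(batch) < batch_size and not (texts & seen):
                batch.append(idx)
                seen |= texts
            else:
                leftover.append(idx)   # retry this pair in a later batch
        batches.append(batch)
        remaining = leftover
    return batches

pairs = [
    (".308", "The .308 Winchester is a popular rifle cartridge used for hunting and target shooting."),
    (".308", "Many precision rifles are chambered in .308 for its excellent long-range accuracy."),
    ("acrylic paint", "Artists prefer acrylic paint for its fast drying time."),
    ("adult learner", "The adult learner brings valuable life experience to the classroom."),
]
batches = no_duplicate_batches(pairs, batch_size=2)
```

The two `.308` pairs end up in different batches, so neither positive is used as a negative for the other.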

Training Logs

Epoch Step Training Loss
0.0024 1 2.9285
0.0048 2 2.9415
0.0072 3 2.7433
0.0096 4 2.8367
0.0120 5 2.7583
0.0144 6 2.8774
0.0168 7 2.7791
0.0192 8 2.5914
0.0216 9 2.5369
0.0240 10 2.5583
0.0264 11 2.428
0.0288 12 2.2281
0.0312 13 2.3207
0.0336 14 2.3152
0.0360 15 2.3222
0.0384 16 1.9328
0.0408 17 2.0254
0.0432 18 2.2076
0.0456 19 1.9551
0.0480 20 2.0753
0.0504 21 1.9028
0.0528 22 1.8977
0.0552 23 1.8852
0.0576 24 1.8288
0.0600 25 1.7363
0.0624 26 1.8455
0.0647 27 1.7129
0.0671 28 1.9365
0.0695 29 2.0386
0.0719 30 1.8644
0.0743 31 1.481
0.0767 32 1.8281
0.0791 33 1.5593
0.0815 34 1.7088
0.0839 35 1.7356
0.0863 36 1.6223
0.0887 37 1.6218
0.0911 38 1.4948
0.0935 39 1.6253
0.0959 40 1.553
0.0983 41 1.565
0.1007 42 1.6852
0.1031 43 1.4419
0.1055 44 1.4839
0.1079 45 1.4249
0.1103 46 1.4301
0.1127 47 1.5504
0.1151 48 1.4154
0.1175 49 1.3868
0.1199 50 1.601
0.1223 51 1.468
0.1247 52 1.4715
0.1271 53 1.6019
0.1295 54 1.4216
0.1319 55 1.3206
0.1343 56 1.4081
0.1367 57 1.2969
0.1391 58 1.5933
0.1415 59 1.4106
0.1439 60 1.7639
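
The Epoch column can be reproduced from the dataset size and batch size reported above. The arithmetic below assumes a single device and that the number of steps per epoch is rounded up; `gradient_accumulation_steps` is 1 according to the hyperparameter list.

```python
import math

dataset_size = 106_628   # training samples
batch_size = 256         # per_device_train_batch_size
max_steps = 60

# Assumption: one device, and a partial final batch still counts as a step.
steps_per_epoch = math.ceil(dataset_size / batch_size)

def epoch_at(step):
    """Fraction of an epoch completed after the given step, as shown in the log."""
    return round(step / steps_per_epoch, 4)

samples_seen = max_steps * batch_size
fraction_of_dataset = samples_seen / dataset_size
```

Under these assumptions the run stops after roughly 14% of one pass over the data, which is why `max_steps: 60` rather than `num_train_epochs: 3.0` determines the training length.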

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.2.0
  • Transformers: 4.56.2
  • PyTorch: 2.9.0+cu126
  • Accelerate: 1.12.0
  • Datasets: 4.3.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}