SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-mpnet-base-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: ≈0.1B parameters (F32 safetensors)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False, 'architecture': 'MPNetModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
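
Because the pipeline ends in a Normalize() module, every embedding is scaled to unit L2 norm, so a plain dot product between two embeddings equals their cosine similarity. A minimal NumPy sketch of that equivalence, using made-up 768-dimensional vectors rather than real model outputs:

```python
import numpy as np

# Two hypothetical 768-dimensional embeddings before normalization
rng = np.random.default_rng(0)
a, b = rng.normal(size=768), rng.normal(size=768)

# What the Normalize() module does: rescale each vector to unit L2 norm
a_n = a / np.linalg.norm(a)
b_n = b / np.linalg.norm(b)

# Cosine similarity of the raw vectors...
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# ...equals the plain dot product of the normalized vectors
assert np.isclose(np.dot(a_n, b_n), cosine)
```

This is why dot-product and cosine retrieval are interchangeable for this model.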

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("stel42/bs-faq-mpnet-basev2-finetuned")
# Run inference
queries = [
    "RSUSW itu singkatan dari apa?",
]
documents = [
    'RSU Syubbanul Wathon adalah kepanjangan dari RSUSW',
    'Lokasinya di Jl. Magelang - Kopeng km 08 Tegalrejo Kabupaten Magelang 56192',
    'Nama lain Siloam Hospitals Bangka Belitung adalah SHBB atau Siloam Pangkalan Baru',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.8722, 0.2129, 0.0839]])
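
Retrieving the best-matching FAQ answer from a similarity matrix like the one above is a row-wise argmax. A dependency-light sketch over the printed scores (hard-coded here rather than recomputed with the model):

```python
import numpy as np

# Similarity scores printed above: 1 query x 3 documents
similarities = np.array([[0.8722, 0.2129, 0.0839]])

documents = [
    'RSU Syubbanul Wathon adalah kepanjangan dari RSUSW',
    'Lokasinya di Jl. Magelang - Kopeng km 08 Tegalrejo Kabupaten Magelang 56192',
    'Nama lain Siloam Hospitals Bangka Belitung adalah SHBB atau Siloam Pangkalan Baru',
]

# Pick the highest-scoring document for each query
best = similarities.argmax(axis=1)
print(documents[best[0]])
# → RSU Syubbanul Wathon adalah kepanjangan dari RSUSW
```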

Evaluation

Metrics

Triplet

  • Evaluated on the faq-test triplet set

Metric            Value
cosine_accuracy   1.0

Training Details

Training Dataset

Unnamed Dataset

  • Size: 2,718 training samples
  • Columns: query, answer_positive, and answer_negative
  • Approximate statistics based on the first 1000 samples:

                   query   answer_positive   answer_negative
    type           string  string            string
    min (tokens)   6       10                10
    mean (tokens)  14.47   48.05             44.12
    max (tokens)   31      384               216
  • Samples:
    1. query: RS Jantung Diagram Cinere punya sebutan lain?
       answer_positive: Nama lain RS Jantung Diagram Cinere adalah SHCN atau Siloam Cinere
       answer_negative: Lokasinya di Jl. Kompol Maksum No.296, Peterongan, Kec. Semarang Sel., Kota Semarang, Jawa Tengah 50242
    2. query: aplikasi mysiloam itu fungsinya apa
       answer_positive: Aplikasi MySiloam adalah aplikasi mobile untuk membantu pengguna menikmati kemudahan layanan kesehatan yang disediakan oleh Rumah Sakit Siloam. Beberapa fitur yang sudah membantu banyak orang adalah, pembuatan janji temu dokter, pemesanan tes dan layanan kesehatan, emergency, dan ambulance call.
       answer_negative: Lokasinya di JL. Mampang Prapatan XVI, Kel. Duren Tiga, Kec. Pancoran, Kota Adm. Jakarta Selatan, Prov. DKI Jakarta
    3. query: Mau ke Siloam Surabaya, alamatnya apa?
       answer_positive: Lokasinya di Jl. Raya Gubeng No.70, Gubeng, Kec. Gubeng, Kota SBY, Jawa Timur 60281
       answer_negative: Ya, Anda dapat menanyakan kepada staff registrasi kami untuk mendapatkan perkiraan biaya sebelum perawatan Anda. Harap dicatat bahwa ada kemungkinan bahwa perkiraan biaya yang diberikan berbeda dari biaya akhir.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
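
MultipleNegativesRankingLoss treats every other positive in the batch (plus the explicit answer_negative, when provided) as a negative for each query, and applies a softmax cross-entropy over scaled cosine similarities. A toy NumPy sketch of the cos_sim formulation with scale=20, covering in-batch negatives only (the explicit-negatives extension is omitted for brevity):

```python
import numpy as np

def mnr_loss(anchors, positives, scale=20.0):
    """Toy NumPy version of MultipleNegativesRankingLoss with cos_sim:
    each anchor should rank its own positive above the positives of
    every other example in the batch."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)  # (batch, batch) scaled cosine similarities
    # Cross-entropy where the correct "class" for row i is column i
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Perfectly aligned positives (positive == anchor) give a near-zero loss,
# since each anchor's own positive dominates the softmax:
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 16))
print(mnr_loss(emb, emb))
```

The scale of 20 sharpens the softmax so that small cosine gaps still produce a strong training signal.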
    

Evaluation Dataset

Unnamed Dataset

  • Size: 302 evaluation samples
  • Columns: query, answer_positive, and answer_negative
  • Approximate statistics based on the first 302 samples:

                   query   answer_positive   answer_negative
    type           string  string            string
    min (tokens)   6       10                17
    mean (tokens)  14.56   48.08             41.42
    max (tokens)   26      260               111
  • Samples:
    1. query: SHPL ada di jalan apa?
       answer_positive: Lokasinya di Jl. POM IX, Lorok Pakjo, Kec. Ilir Bar. I, Kota Palembang, Sumatera Selatan 30137
       answer_negative: Pendaftaran MCU / Medical Check Up melalui Pembayaran pada pendaftaran MCU / Medical Check Up dapat dilakukan dengan transfer ke rekening Bank, Credit Card, dan pembayaran cash pada saat melakukan registrasi.
    2. query: Dimana Siloam TB Simatupang?
       answer_positive: Lokasinya di Jl. RA. Kartini No. 8, Cilandak Jakarta Selatan
       answer_negative: Lama waktu yang dibutuhkan untuk menjalankan MCU / Medical Check Up sekitar 2-6 jam tergantung dari jenis paket pemeriksaan yang dijalankan.
    3. query: siloam putera bahagia disebut apa lagi?
       answer_positive: Nama lain Siloam Hospitals Putera Bahagia adalah SHCB atau Siloam Harjamukti
       answer_negative: Lokasinya di Jl. Kelapa Dua Raya No.1001, Klp. Dua, Kec. Klp. Dua, Kabupaten Tangerang, Banten 15810
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 10
  • warmup_ratio: 0.1
  • batch_sampler: no_duplicates
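
Under the Sentence Transformers v3+ trainer API, these non-default values translate into roughly the following configuration. This is a sketch: the output_dir is a placeholder, not taken from this card.

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="outputs/bs-faq-mpnet",          # placeholder path
    eval_strategy="steps",
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    num_train_epochs=10,
    warmup_ratio=0.1,
    # Avoids duplicate texts in a batch, which would act as false negatives
    # for MultipleNegativesRankingLoss
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```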

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss faq-test_cosine_accuracy
0.2941 100 0.9634 - -
0.5882 200 0.2638 0.1916 -
0.8824 300 0.1286 - -
1.1765 400 0.0794 0.0689 -
1.4706 500 0.0314 - -
1.7647 600 0.0236 0.0228 -
2.0588 700 0.0178 - -
2.3529 800 0.0127 0.0207 -
2.6471 900 0.0112 - -
2.9412 1000 0.0061 0.0268 -
0.5882 100 0.0189 - -
1.1765 200 0.0173 0.0289 -
1.7647 300 0.0235 - -
2.3529 400 0.0148 0.0212 -
2.9412 500 0.0111 - -
3.5294 600 0.0099 0.0209 -
4.1176 700 0.0093 - -
4.7059 800 0.0102 0.0247 -
5.2941 900 0.0076 - -
5.8824 1000 0.0038 0.0148 -
6.4706 1100 0.0073 - -
7.0588 1200 0.0031 0.0262 -
7.6471 1300 0.0045 - -
8.2353 1400 0.0048 0.0235 -
8.8235 1500 0.0042 - -
9.4118 1600 0.0052 0.0216 -
10.0 1700 0.0086 - -
-1 -1 - - 1.0

Framework Versions

  • Python: 3.11.6
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.1
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.11.0
  • Datasets: 4.4.1
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}