SentenceTransformer based on sanganaka/bge-m3-sanskritFT

This is a sentence-transformers model finetuned from sanganaka/bge-m3-sanskritFT. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for retrieval.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sanganaka/bge-m3-sanskritFT
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Supported Modality: Text

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'XLMRobertaModel'})
  (1): Pooling({'embedding_dimension': 1024, 'pooling_mode': 'cls', 'include_prompt': True})
  (2): Normalize({})
)
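
The three modules above amount to: run the transformer, keep the vector of the first token (`<s>` in XLM-RoBERTa, used as the CLS position), and scale it to unit length. The following is a minimal pure-Python sketch of the last two steps only, assuming the token embeddings are already available as lists of floats. The function name `cls_pool_and_normalize` and the toy 4-dimensional input are illustrative and not part of the library.

```python
import math


def cls_pool_and_normalize(token_embeddings):
    """Keep the first token's vector and scale it to unit L2 norm.

    token_embeddings: one sequence as a list of per-token vectors.
    """
    cls_vector = token_embeddings[0]  # 'cls' pooling ignores all other tokens
    norm = math.sqrt(sum(x * x for x in cls_vector))
    if norm == 0.0:
        return list(cls_vector)  # avoid dividing by zero
    return [x / norm for x in cls_vector]


# Toy input: 3 tokens, 4 dimensions each (the real model outputs 1024).
tokens = [
    [3.0, 4.0, 0.0, 0.0],  # first token, used as the sentence vector
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
]
embedding = cls_pool_and_normalize(tokens)
print(embedding)  # [0.6, 0.8, 0.0, 0.0]
```

Because the output has unit length, the dot product of two embeddings equals their cosine similarity.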

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Sathvik0101/srag-biencoder-hn")
# Run inference
sentences = [
    "I've recently suffered a great loss, and I feel abandoned and questioning why such pain exists if there's a benevolent force. Where is the solace in such suffering?",
    'paritrāṇāya sādhūnāṃ vināśāya ca duṣkṛtām | dharma-saṃsthāpanārthāya saṃbhavāmi yuge yuge ||8||',
    'tataḥ śvetair hayair yukte mahati syandane sthitau | mādhavaḥ pāṇḍavaś caiva divyau śaṅkhau pradadhmatuḥ ||14||',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 1.0000, 1.0000],
#         [1.0000, 1.0000, 1.0000],
#         [1.0000, 1.0000, 1.0000]])
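
For retrieval, the first sentence in the example plays the role of a query and the remaining ones are candidate verses, so the practical step is ranking candidates by their similarity to the query. The sketch below shows that ranking step in pure Python. The toy 3-dimensional vectors stand in for `model.encode(...)` output, and `cosine_similarity` and `rank_candidates` are hypothetical helper names, not library functions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(query_embedding, candidate_embeddings):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [
        (index, cosine_similarity(query_embedding, candidate))
        for index, candidate in enumerate(candidate_embeddings)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real 1024-dimensional embeddings.
query = [1.0, 0.0, 0.0]
candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # close to the query
    [0.5, 0.5, 0.0],  # in between
]
ranking = rank_candidates(query, candidates)
print([index for index, _ in ranking])  # [1, 2, 0]
```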

Training Details

Training Dataset

Unnamed Dataset

  • Size: 4,858 training samples
  • Columns: sentence_0, sentence_1, sentence_2, sentence_3, sentence_4, sentence_5, sentence_6, sentence_7, and sentence_8
  • Approximate statistics based on the first 100 samples:
    column      type    modality  min tokens  mean tokens  max tokens
    sentence_0  string  text      25          45.6         76
    sentence_1  string  text      37          66.34        256
    sentence_2  string  text      41          64.54        242
    sentence_3  string  text      34          58.37        242
    sentence_4  string  text      34          60.14        242
    sentence_5  string  text      34          63.36        165
    sentence_6  string  text      39          63.16        256
    sentence_7  string  text      34          67.22        242
    sentence_8  string  text      37          60.1         242
  • Samples:
    Sample 1
      sentence_0: I'm constantly anxious about what the future holds—will I succeed, will my loved ones be okay, will things fall apart? I can't seem to just live in the present.
      sentence_1: ā brahma-bhuvanāl lokāḥ punar-āvartino 'rjuna | mām upetya tu kaunteya punar-janma na vidyate ||16||
      sentence_2: manuṣyāṇāṃ sahasreṣu kaś cid yatati siddhaye | yatatām api siddhānāṃ kaś cin māṃ vetti tattvataḥ ||3||
      sentence_3: tasmāt sarveṣu kāleṣu mām anusmara yudhya ca | mayy arpitamanobuddhir mām evaiṣyasy asaṃśayaḥ ||7||
      sentence_4: samaḥ śatrau ca mitre ca tathā mānāpamānayoḥ | śītoṣṇa-sukha-duḥkheṣu samaḥ saṅga-vivarjitaḥ ||18|| tulya-nindā-stutir maunī saṃtuṣṭo yena kenacit | aniketaḥ sthira-matir bhaktimān me priyo naraḥ ||19||
      sentence_5: na hi prapaśyāmi mamāpanudyād yac chokam ucchoṣaṇam indriyāṇām | avāpya bhūmāv asapatnam ṛddhaṃ rājyaṃ surāṇām api cādhipatyam ||8||
      sentence_6: anudvega-karaṃ vākyaṃ satyaṃ priya-hitaṃ ca yat | svādhyāyābhyasanaṃ caiva vāṅ-mayaṃ tapa ucyate ||15||
      sentence_7: traividyā māṃ somapāḥ pūta-pāpā yajñair iṣṭvā svar-gatiṃ prārthayante | te puṇyam āsādya surendra-lokam aśnanti divyān divi deva-bhogān ||20||
      sentence_8: āyuḥ-sattva-balārogya-sukha-prīti-vivardhanāḥ | rasyāḥ snigdhāḥ sthirā hṛdyā āhārāḥ sāttvika-priyāḥ ||8||
    Sample 2
      sentence_0: I feel so much anger towards certain people who have wronged me or situations that have gone completely against my desires. It feels like they or life itself is conspiring against me. Why does this happen, and how can I let go of this rage?
      sentence_1: na kartṛtvaṃ na karmāṇi lokasya sṛjati prabhuḥ | na karma-phala-saṃyogaṃ svabhāvas tu pravartate ||14||
      sentence_2: ye yathā māṃ prapadyante tāṃs tathaiva bhajāmy aham | mama vartmānuvartante manuṣyāḥ pārtha sarvaśaḥ ||11||
      sentence_3: kleśo 'dhikataras teṣām avyaktāsakta-cetasām | avyaktā hi gatir duḥkhaṃ dehavadbhir avāpyate ||5||
      sentence_4: manuṣyāṇāṃ sahasreṣu kaś cid yatati siddhaye | yatatām api siddhānāṃ kaś cin māṃ vetti tattvataḥ ||3||
      sentence_5: yadā yadā hi dharmasya glānir bhavati bhārata | abhyutthānam adharmasya tadātmānaṃ sṛjāmy aham ||7||
      sentence_6: mac-cittā mad-gata-prāṇā bodhayantaḥ parasparam | kathayantaś ca māṃ nityaṃ tuṣyanti ca ramanti ca ||9||
      sentence_7: aśāstra-vihitaṃ ghoraṃ tapyante ye tapo janāḥ | dambhāhaṃkāra-saṃyuktāḥ kāma-rāga-balānvitāḥ ||5|| karśayantaḥ śarīra-sthaṃ bhūta-grāmam acetasaḥ | māṃ caivāntaḥ-śarīra-sthaṃ tān viddhy āsura-niścayān ||6||
      sentence_8: vedāvināśinaṃ nityaṃ ya enam ajam avyayam | kathaṃ sa puruṣaḥ pārtha kaṃ ghātayati hanti kam ||21||
    Sample 3
      sentence_0: I've lost someone incredibly dear to me, and the pain is unbearable. I feel like a part of me is gone forever. How can I heal and find meaning amidst this sorrow?
      sentence_1: ye tu sarvāṇi karmāṇi mayi saṃnyasya matparaḥ | ananyenaiva yogena māṃ dhyāyanta upāsate |
      sentence_2: āyuḥ-sattva-balārogya-sukha-prīti-vivardhanāḥ | rasyāḥ snigdhāḥ sthirā hṛdyā āhārāḥ sāttvika-priyāḥ ||8||
      sentence_3: samaḥ śatrau ca mitre ca tathā mānāpamānayoḥ | śītoṣṇa-sukha-duḥkheṣu samaḥ saṅga-vivarjitaḥ ||18|| tulya-nindā-stutir maunī saṃtuṣṭo yena kenacit | aniketaḥ sthira-matir bhaktimān me priyo naraḥ ||19||
      sentence_4: avyakto 'kṣara ity uktas tam āhuḥ paramāṃ gatim | yaṃ prāpya na nivartante tad dhāma paramaṃ mama ||21||
      sentence_5: hṛṣīkeśaṃ tadā vākyam idam āha mahīpate | senayor ubhayor madhye rathaṃ sthāpaya me 'cyuta ||21|| yāvad etān nirīkṣe 'haṃ yoddhukāmān avasthitān | kair mayā saha yoddhavyam asmin raṇasamudyame ||22|| yotsyamānān avekṣe 'haṃ ya ete 'tra samāgatāḥ | dhārtarāṣṭrasya durbuddher yuddhe priyacikīrṣavaḥ ||23||
      sentence_6: yasmān nodvijate loko lokān nodvijate ca yaḥ | harṣāmarṣa-bhayodvegair mukto yaḥ sa ca me priyaḥ ||15||
      sentence_7: mām upetya punar janma duḥkhālayam aśāśvatam | nāpnuvanti mahātmānaḥ saṃsiddhiṃ paramāṃ gatāḥ ||15||
      sentence_8: jitātmanaḥ praśāntasya paramātmā samāhitaḥ | śītoṣṇa-sukha-duḥkheṣu tathā mānāpamānayoḥ ||7||
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false,
        "directions": [
            "query_to_doc"
        ],
        "partition_mode": "joint",
        "hardness_mode": null,
        "hardness_strength": 0.0
    }
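
To make the parameters above concrete, here is a rough pure-Python sketch of what a ranking loss of this kind computes in the `query_to_doc` direction: for each anchor, cosine similarities to all candidates are multiplied by `scale` and fed to a softmax cross-entropy whose target is the anchor's own positive. This is an illustration under simplifying assumptions, not the library's implementation. The names `cos_sim`, `ranking_loss`, and `positive_index` are hypothetical, and the remaining parameters (`partition_mode`, `hardness_mode`, and so on) are not modelled.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def ranking_loss(anchors, candidates, positive_index, scale=20.0):
    """Mean cross-entropy over anchors.

    positive_index[i] is the position in `candidates` of anchor i's positive;
    every other candidate acts as a negative for that anchor.
    """
    total = 0.0
    for anchor, pos in zip(anchors, positive_index):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_sum_exp - logits[pos]
    return total / len(anchors)


# Well-separated toy case: each anchor matches exactly one candidate.
separated = ranking_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]], [0, 1])

# Degenerate toy case: all vectors identical, so every candidate scores the same.
same = [1.0, 2.0, 3.0]
uniform = ranking_loss([same] * 4, [same] * 32, [0, 8, 16, 24])

print(separated, uniform, math.log(32))
```

When every candidate receives the same score, the loss equals ln(N) for N candidates regardless of `scale`. The demo uses N = 32, which is what a batch of 4 rows with 8 candidate columns each would give if sentence_1 through sentence_8 all serve as candidates (an assumption about how the columns are consumed); ln 32 is about 3.466.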
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 4
  • num_train_epochs: 2
  • per_device_eval_batch_size: 4
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • per_device_train_batch_size: 4
  • num_train_epochs: 2
  • max_steps: -1
  • learning_rate: 5e-05
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_steps: 0
  • optim: adamw_torch_fused
  • optim_args: None
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • optim_target_modules: None
  • gradient_accumulation_steps: 1
  • average_tokens_across_devices: True
  • max_grad_norm: 1
  • label_smoothing_factor: 0.0
  • bf16: False
  • fp16: False
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • use_liger_kernel: False
  • liger_kernel_config: None
  • use_cache: False
  • neftune_noise_alpha: None
  • torch_empty_cache_steps: None
  • auto_find_batch_size: False
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • include_num_input_tokens_seen: no
  • log_level: passive
  • log_level_replica: warning
  • disable_tqdm: False
  • project: huggingface
  • trackio_space_id: None
  • trackio_bucket_id: None
  • trackio_static_space_id: None
  • per_device_eval_batch_size: 4
  • prediction_loss_only: True
  • eval_on_start: False
  • eval_do_concat_batches: True
  • eval_use_gather_object: False
  • eval_accumulation_steps: None
  • include_for_metrics: []
  • batch_eval_metrics: False
  • save_only_model: False
  • save_on_each_node: False
  • enable_jit_checkpoint: False
  • push_to_hub: False
  • hub_private_repo: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_always_push: False
  • hub_revision: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • restore_callback_states_from_checkpoint: False
  • full_determinism: False
  • seed: 42
  • data_seed: None
  • use_cpu: False
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • dataloader_prefetch_factor: None
  • remove_unused_columns: True
  • label_names: None
  • train_sampling_strategy: random
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • ddp_static_graph: None
  • ddp_backend: None
  • ddp_timeout: 1800
  • fsdp: None
  • fsdp_config: None
  • deepspeed: None
  • debug: []
  • skip_memory_metrics: True
  • do_predict: False
  • resume_from_checkpoint: None
  • warmup_ratio: None
  • local_rank: -1
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
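
With `lr_scheduler_type: linear`, `warmup_steps: 0`, and `learning_rate: 5e-05`, the learning rate would be expected to fall in a straight line from 5e-05 to 0 over the run. The sketch below reproduces that shape in plain Python. The function `linear_lr` is a hypothetical helper rather than a library call, and the total step count assumes a single device with 4,858 samples, batch size 4, and 2 epochs.

```python
import math


def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: one device, no gradient accumulation, last partial batch kept.
total_steps = math.ceil(4858 / 4) * 2  # 1215 steps per epoch, 2 epochs

for step in (0, 500, 1000, 1500, 2000, total_steps):
    print(step, linear_lr(step, total_steps))
```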

Training Logs

Epoch   Step  Training Loss
0.4115  500   3.6841
0.8230  1000  3.5072
1.2346  1500  3.4757
1.6461  2000  3.4740
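
The Epoch column can be cross-checked against the dataset size and batch size reported earlier. This short calculation assumes a single device, `gradient_accumulation_steps: 1`, and that the final partial batch is kept (`dataloader_drop_last: False`); the variable names are illustrative.

```python
import math

num_samples = 4858
batch_size = 4

# The last batch holds only 2 samples but still counts as a step.
steps_per_epoch = math.ceil(num_samples / batch_size)

logged_steps = [500, 1000, 1500, 2000]
epochs = [round(step / steps_per_epoch, 4) for step in logged_steps]
print(steps_per_epoch, epochs)
```

Under these assumptions the computed fractions agree with the logged Epoch values to four decimal places.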

Training Time

  • Training: 25.7 minutes

Framework Versions

  • Python: 3.11.12
  • Sentence Transformers: 5.5.1
  • Transformers: 5.12.1
  • PyTorch: 2.12.0+cu130
  • Accelerate: 1.14.0
  • Datasets: 5.0.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{oord2019representationlearningcontrastivepredictive,
      title={Representation Learning with Contrastive Predictive Coding},
      author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
      year={2019},
      eprint={1807.03748},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/1807.03748},
}

Safetensors

  • Model size: 0.6B params
  • Tensor type: F32

Model tree for Sathvik0101/srag-biencoder-hn

  • Base model: BAAI/bge-m3
  • Finetuned (2): this model
