Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Paper: [arXiv:1908.10084](https://arxiv.org/abs/1908.10084)
This is a sentence-transformers model fine-tuned from jiwonyou0420/MNLP_M2_document_encoder. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
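The Pooling module above is configured with `pooling_mode_cls_token: True`, so the sentence embedding is the transformer's [CLS] token vector, which Normalize() then scales to unit L2 length. A rough NumPy sketch of those two steps (the token embeddings below are random stand-ins, not real model output):

```python
import numpy as np

# Hypothetical transformer output: (batch, seq_len, hidden_dim) token embeddings
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 5, 384))

# Pooling with pooling_mode_cls_token=True: keep only the first ([CLS]) token
cls_embeddings = token_embeddings[:, 0, :]          # shape (2, 384)

# Normalize(): scale each vector to unit L2 norm
norms = np.linalg.norm(cls_embeddings, axis=1, keepdims=True)
sentence_embeddings = cls_embeddings / norms

print(sentence_embeddings.shape)                    # (2, 384)
print(np.linalg.norm(sentence_embeddings, axis=1))  # [1. 1.]
```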
First install the Sentence Transformers library:
```shell
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("jiwonyou0420/encoder-qa-finetuned-full")
# Run inference
sentences = [
    "A physics student wants to know how to calculate the trajectory of a relativistic particle moving at a constant velocity. The particle's initial position is (0, 0) and its velocity is v = 0.8c, where c is the speed of light. The student also knows that the particle experiences no forces during its motion. What is the equation for the particle's trajectory as a function of time in terms of its initial position and velocity? How long does it take for the particle to reach a point where x = 2 meters?",
    'where x ( t ) is the position of the particle at time t, x0 is the initial position ( 0 in this case ), and v is the velocity ( 0. 8c ). now, we want to find the time it takes for the particle to reach a point where x = 2 meters. to do this, we can set x ( t ) = 2 and solve for t : 2 = 0 + ( 0. 8c ) * t t = 2 / ( 0. 8c ) since c is the speed of light, which is approximately 3 x 10 ^ 8 meters per second, we can substitute this value',
    'the ferroelectric behavior of a crystal is highly dependent on external factors such as pressure and temperature. ferroelectric materials exhibit spontaneous electric polarization that can be reversed by an external electric field. this behavior arises due to the displacement of ions within the crystal lattice, leading to the formation of electric dipoles. the ferroelectric properties of a crystal can be altered by changes in external pressure and temperature, which affect the crystal lattice and the stability of the polarized state. 1. effect of temperature : the ferroelectric behavior of a crystal is strongly influenced by temperature. as the temperature increases, the thermal energy causes',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
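Because the model's final module is Normalize(), the embeddings it returns are unit-length, so cosine similarity reduces to a plain dot product. A minimal self-contained sketch using random NumPy vectors as stand-ins for real `model.encode` output:

```python
import numpy as np

# Random unit vectors standing in for model.encode output (3 sentences, 384 dims)
rng = np.random.default_rng(42)
embeddings = rng.normal(size=(3, 384))
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

# For unit-length vectors, cosine similarity is just the dot product,
# so the full similarity matrix is a single matrix multiply
similarities = embeddings @ embeddings.T

print(similarities.shape)                        # (3, 3)
print(np.allclose(np.diag(similarities), 1.0))   # True: each sentence vs. itself
```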
The training dataset has three columns: sentence_0, sentence_1, and label.

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |
| sentence_0 | sentence_1 | label |
|---|---|---|
| What is the effect of dark matter on the temperature anisotropies in the cosmic microwave background radiation? Provide a quantitative analysis to determine the magnitude of this effect and explain how it can be detected through observations of the CMB. | planck satellite have provided high - resolution measurements of the cmb temperature anisotropies, allowing for precise constraints on the dark matter density and other cosmological parameters. in summary, dark matter affects the temperature anisotropies in the cmb by influencing the formation of large - scale structures and altering the power spectrum of the temperature fluctuations. the magnitude of this effect can be determined through a quantitative analysis of the power spectrum, and the presence of dark matter can be detected through observations of the cmb temperature anisotropies. | 1.0 |
| How can the presence of monopoles affect the behavior of strings in string theory, and how can this be used to explain some unexplained phenomena in particle physics? | spin states of individual electrons in quantum dots, paving the way for the development of scalable quantum computing architectures. 2. quantum dot arrays : researchers have successfully created arrays of quantum dots that can be used to perform quantum operations. these arrays can be used as a platform for implementing quantum error correction codes, which are essential for building fault - tolerant quantum computers. 3. coherent coupling : coherent coupling between quantum dots has been demonstrated, allowing for the transfer of quantum information between qubits. this is a crucial step towards building large - scale quantum computers, as it enables the creation of quantum gates and entangled states. 4. integration with | 0.0 |
| A physics student is investigating the flow properties of a non-Newtonian fluid. The student has data on the fluid's shear stress vs. shear rate behavior, and wants to understand its rheological properties. Can the student determine if the fluid is shear-thinning, shear-thickening, or a Bingham plastic? If so, what is the fluid's behavior index or yield stress? | yes, the student can determine if the fluid is shear - thinning, shear - thickening, or a bingham plastic by analyzing the data on the fluid ' s shear stress ( τ ) vs. shear rate ( γ ) behavior. 1. shear - thinning fluid : if the fluid ' s viscosity decreases with increasing shear rate, it is a shear - thinning fluid ( also known as pseudoplastic fluid ). in this case, the fluid ' s behavior index ( n ) will be less than 1. the relationship between shear stress and shear rate can be described by the power - law model : τ = k * | 1.0 |
Loss: CosineSimilarityLoss with these parameters:

```json
{
    "loss_fct": "torch.nn.modules.loss.MSELoss"
}
```
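CosineSimilarityLoss computes the cosine similarity between the two sentence embeddings and compares it to the gold label with the configured loss_fct, here MSELoss. A self-contained NumPy sketch of that computation, with made-up embedding pairs and labels mirroring the (sentence_0, sentence_1, label) rows above:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical embedding pairs and gold labels (illustrative only)
rng = np.random.default_rng(0)
pairs = [(rng.normal(size=384), rng.normal(size=384)) for _ in range(3)]
labels = np.array([1.0, 0.0, 1.0])

# CosineSimilarityLoss with loss_fct=MSELoss: mean squared error between
# the predicted cosine similarity and the gold label
preds = np.array([cosine(u, v) for u, v in pairs])
loss = float(np.mean((preds - labels) ** 2))
print(loss)
```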
Non-default hyperparameters:

- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- num_train_epochs: 1
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- tp_size: 0
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin

| Epoch | Step | Training Loss |
|---|---|---|
| 0.0923 | 500 | 0.0651 |
| 0.1845 | 1000 | 0.0457 |
| 0.2768 | 1500 | 0.0388 |
| 0.3690 | 2000 | 0.0374 |
| 0.4613 | 2500 | 0.0375 |
| 0.5535 | 3000 | 0.0342 |
| 0.6458 | 3500 | 0.0345 |
| 0.7380 | 4000 | 0.0330 |
| 0.8303 | 4500 | 0.0313 |
| 0.9225 | 5000 | 0.0325 |
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Base model: jiwonyou0420/MNLP_M2_document_encoder