How to use lfsolis/ArxvBert-ST_v2 with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("lfsolis/ArxvBert-ST_v2")
sentences = [
"Entanglement increase from local interactions with\n not-completely-positive maps",
" Simple examples are constructed that show the entanglement of two qubits\nbeing both increased and decreased by interactions on just one of them. One of\nthe two qubits interacts with a third qubit, a control, that is never entangled\nor correlated with either of the two entangled qubits and is never entangled,\nbut becomes correlated, with the system of those two qubits. The two entangled\nqubits do not interact, but their state can change from maximally entangled to\nseparable or from separable to maximally entangled. Similar changes for the two\nqubits are made with a swap operation between one of the qubits and a control;\nthen there are compensating changes of entanglement that involve the control.\nWhen the entanglement increases, the map that describes the change of the state\nof the two entangled qubits is not completely positive. Combination of two\nindependent interactions that individually give exponential decay of the\nentanglement can cause the entanglement to not decay exponentially but,\ninstead, go to zero at a finite time.\n",
" Many extra-solar planets discovered over the past decade are gas giants in\ntight orbits around their host stars. Due to the difficulties of forming these\n`hot Jupiters' in situ, they are generally assumed to have migrated to their\npresent orbits through interactions with their nascent discs. In this paper, we\npresent a systematic study of giant planet migration in power law discs. We\nfind that the planetary migration rate is proportional to the disc surface\ndensity. This is inconsistent with the assumption that the migration rate is\nsimply the viscous drift speed of the disc. However, this result can be\nobtained by balancing the angular momentum of the planet with the viscous\ntorque in the disc. We have verified that this result is not affected by\nadjusting the resolution of the grid, the smoothing length used, or the time at\nwhich the planet is released to migrate.\n",
" We investigate the evolution of binary fractions in star clusters using\nN-body models of up to 100000 stars. Primordial binary frequencies in these\nmodels range from 5% to 50%. Simulations are performed with the NBODY4 code and\ninclude a full mass spectrum of stars, stellar evolution, binary evolution and\nthe tidal field of the Galaxy. We find that the overall binary fraction of a\ncluster almost always remains close to the primordial value, except at late\ntimes when a cluster is near dissolution. A critical exception occurs in the\ncentral regions where we observe a marked increase in binary fraction with time\n-- a simulation starting with 100000 stars and 5% binaries reached a core\nbinary frequency as high as 40% at the end of the core-collapse phase\n(occurring at 16 Gyr with ~20000 stars remaining). Binaries are destroyed in\nthe core by a variety of processes as a cluster evolves, but the combination of\nmass-segregation and creation of new binaries in exchange interactions produces\nthe observed increase in relative number. We also find that binaries are cycled\ninto and out of cluster cores in a manner that is analogous to convection in\nstars. For models of 100000 stars we show that the evolution of the core-radius\nup to the end of the initial phase of core-collapse is not affected by the\nexact value of the primordial binary frequency (for frequencies of 10% or\nless). We discuss the ramifications of our results for the likely primordial\nbinary content of globular clusters.\n"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from lufercho/ArxBert-MLM. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
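The Pooling module above has only `pooling_mode_mean_tokens` enabled, so the sentence embedding is the average of the token vectors. The following is a minimal pure-Python sketch of that idea, not the library's implementation: the function name `mean_pool`, the toy 2-dimensional vectors, and the explicit attention mask are all assumptions made for illustration (the real model uses 768 dimensions).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the vectors of real tokens (mask == 1) and ignore padding (mask == 0)."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                totals[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [total / count for total in totals]


# Hypothetical toy input: two real tokens followed by one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

The padding vector is excluded by the mask, so it does not distort the average.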
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("lfsolis/ArxvBert-ST_v2")
# Run inference
sentences = [
'Approximation of the distribution of a stationary Markov process with\n application to option pricing',
" We build a sequence of empirical measures on the space D(R_+,R^d) of\nR^d-valued c\\`adl\\`ag functions on R_+ in order to approximate the law of a\nstationary R^d-valued Markov and Feller process (X_t). We obtain some general\nresults of convergence of this sequence. Then, we apply them to Brownian\ndiffusions and solutions to L\\'evy driven SDE's under some Lyapunov-type\nstability assumptions. As a numerical application of this work, we show that\nthis procedure gives an efficient way of option pricing in stochastic\nvolatility models.\n",
" We provide a new estimate of the local supermassive black hole mass function\nusing (i) the empirical relation between supermassive black hole mass and the\nSersic index of the host spheroidal stellar system and (ii) the measured\n(spheroid) Sersic indices drawn from 10k galaxies in the Millennium Galaxy\nCatalogue. The observational simplicity of our approach, and the direct\nmeasurements of the black hole predictor quantity, i.e. the Sersic index, for\nboth elliptical galaxies and the bulges of disc galaxies makes it\nstraightforward to estimate accurate black hole masses in early- and late-type\ngalaxies alike. We have parameterised the supermassive black hole mass function\nwith a Schechter function and find, at the low-mass end, a logarithmic slope\n(1+alpha) of ~0.7 for the full galaxy sample and ~1.0 for the early-type galaxy\nsample. Considering spheroidal stellar systems brighter than M_B = -18 mag, and\nintegrating down to black hole masses of 10^6 M_sun, we find that the local\nmass density of supermassive black holes in early-type galaxies rho_{bh,\nearly-type} = (3.5+/-1.2) x 10^5 h^3_{70} M_sun Mpc^{-3}, and in late-type\ngalaxies rho_{bh, late-type} = (1.0+/-0.5) x 10^5 h^3_{70} M_sun Mpc^{-3}. The\nuncertainties are derived from Monte Carlo simulations which include\nuncertainties in the M_bh-n relation, the catalogue of Sersic indices, the\ngalaxy weights and Malmquist bias. The combined, cosmological, supermassive\nblack hole mass density is thus Omega_{bh, total} = (3.2+/-1.2) x 10^{-6} h_70.\nThat is, using a new and independent method, we conclude that (0.007+/-0.003)\nh^3_{70} per cent of the universe's baryons are presently locked up in\nsupermassive black holes at the centres of galaxies.\n",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
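
To make the shape of the similarity output concrete, here is a small pure-Python sketch of a pairwise similarity matrix. It assumes cosine similarity, which is consistent with the `cos_sim` setting used for the training loss, but the model's configured similarity function is not stated in this card. The helper names and the toy 2-dimensional vectors are hypothetical.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(embeddings: List[List[float]]) -> List[List[float]]:
    """Compare every embedding with every other one, giving an n x n matrix."""
    return [[cosine_similarity(a, b) for b in embeddings] for a in embeddings]


# Hypothetical toy embeddings standing in for the 768-dimensional model output.
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matrix = similarity_matrix(toy_embeddings)
print(len(matrix), len(matrix[0]))  # 3 3
```

Three inputs give a 3 x 3 matrix, with each diagonal entry comparing an embedding to itself.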
Columns: sentence_0 and sentence_1

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | | |
| sentence_0 | sentence_1 |
|---|---|
| Lifetime of doubly charmed baryons | In this work, we evaluate the lifetimes of the doubly charmed baryons |
| Broadening the Higgs Boson with Right-Handed Neutrinos and a Higher | The existence of certain TeV suppressed higher-dimension operators may open |
| Infrared Evolution Equations: Method and Applications | It is a brief review on composing and solving Infrared Evolution Equations. |
MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
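
As a rough illustration of what these two parameters do, the sketch below computes an in-batch ranking loss in pure Python: each sentence_0 is scored against every sentence_1 in the batch with cosine similarity multiplied by `scale`, and cross-entropy rewards the matching pair. This is a simplified sketch under those assumptions, not the library's code; `cos_sim`, `mnr_loss`, and the toy vectors are hypothetical names and data.

```python
import math
from typing import List


def cos_sim(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors: List[List[float]], positives: List[List[float]], scale: float = 20.0) -> float:
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, positive) for positive in positives]
        # Numerically stable log-sum-exp over the batch.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(value - top) for value in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Hypothetical toy batch: aligned pairs versus deliberately swapped pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned_loss < swapped_loss)  # True
```

A larger `scale` sharpens the softmax, so small differences in cosine similarity produce larger differences in loss.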
Non-default hyperparameters:
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 10
- multi_dataset_batch_sampler: round_robin

All hyperparameters:
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin

@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}