metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:5489
- loss:MultipleNegativesRankingLoss
base_model: zacbrld/MNLP_M2_document_encoder
widget:
- source_sentence: >-
Military activity affects the physical geology. This was first noted
through the intensive shelling on the Western Front during World War I,
which caused the shattering of the bedrock and changed the rocks'
permeability. New minerals, rocks, and land-forms are also a byproduct of
nuclear testing.
sentences:
- >-
Silicon can form sigma bonds to other silicon atoms (and disilane is the
parent of this class of compounds). However, it is difficult to prepare
and isolate SinH2n+2 (analogous to the saturated alkane hydrocarbons)
with n greater than about 8, as their thermal stability decreases with
increases in the number of silicon atoms. Silanes higher in molecular
weight than disilane decompose to polymeric polysilicon hydride and
hydrogen. But with a suitable pair of organic substituents in place of
hydrogen on each silicon it is possible to prepare polysilanes
(sometimes, erroneously called polysilenes) that are analogues of
alkanes. These long chain compounds have surprising electronic
properties - high electrical conductivity, for example - arising from
sigma delocalization of the electrons in the chain.
Even silicon–silicon pi bonds are possible. However, these bonds are
less stable than the carbon analogues. Disilane and longer silanes are
quite reactive compared to alkanes. Disilene and disilynes are quite
rare, unlike alkenes and alkynes. Examples of disilynes, long thought to
be too unstable to be isolated were reported in 2004.
- >-
The increasing sophistication of brain-reading technologies has led many
to investigate their potential applications for lie detection. Legally
required brain scans arguably violate “the guarantee against
self-incrimination” because they differ from acceptable forms of bodily
evidence, such as fingerprints or blood samples, in an important way:
they are not simply physical, hard evidence, but evidence that is
intimately linked to the defendant's mind. Under US law, brain-scanning
technologies might also raise implications for the Fourth Amendment,
calling into question whether they constitute an unreasonable search and
seizure.
- >-
Military activity affects the physical geology. This was first noted
through the intensive shelling on the Western Front during World War I,
which caused the shattering of the bedrock and changed the rocks'
permeability. New minerals, rocks, and land-forms are also a byproduct
of nuclear testing.
- source_sentence: >-
Right after a bombing in Moscow on September 6, 1999, several anti-nuclear
activists were detained under suspicion. Vladimir Slivyak was one of the
three arrested under suspicion. He was an activist in the anti-nuclear
movement and a Voronezh action camp organizer. After the bombing Slivyak
was pushed into a car by several men who claimed to be Moscow police. The
police interrogated and threatened Slivyak for around ninety minutes
before letting him go. The Moscow police thought environmentalists from
the anti-nuclear movement were associated with the bombing since an
earlier bombing occurred on August 31 at Manezh Palace in Moscow . After
the incident, on August 31, several more bombings occurred which agitated
many people, leading to the racially profiled arrest of dark-skinned
Muscovites and visitors to the Russian capital.
sentences:
- >-
The technique works backwards from the target to identify a precursor
molecule and an enzyme that converts it into the target, and then a
second precursor that can produce the first and so on until a simple,
inexpensive molecule becomes the beginning of the series. For each
precursor, the enzyme is evolved using induced mutations and natural
selection to produce a more productive version. The evolutionary process
can be repeated over multiple generations until acceptable productivity
is achieved. The process does not require high temperature, high
pressure, the use of exotic catalysts or other elements that can
increase costs. The enzyme "optimizations" that increase the production
of one precursor from another are cumulative in that the same precursor
productivity improvements can potentially be leveraged across multiple
target molecules.
- >-
Right after a bombing in Moscow on September 6, 1999, several
anti-nuclear activists were detained under suspicion. Vladimir Slivyak
was one of the three arrested under suspicion. He was an activist in the
anti-nuclear movement and a Voronezh action camp organizer. After the
bombing Slivyak was pushed into a car by several men who claimed to be
Moscow police. The police interrogated and threatened Slivyak for around
ninety minutes before letting him go. The Moscow police thought
environmentalists from the anti-nuclear movement were associated with
the bombing since an earlier bombing occurred on August 31 at Manezh
Palace in Moscow . After the incident, on August 31, several more
bombings occurred which agitated many people, leading to the racially
profiled arrest of dark-skinned Muscovites and visitors to the Russian
capital.
- >-
One of the main sources of information about the Earth's composition
comes from understanding the relationship between peridotite and basalt
melting. Peridotite makes up most of Earth's mantle. Basalt, which is
highly concentrated in the Earth's oceanic crust, is formed when magma
reaches the Earth's surface and cools down at a very fast rate. When
magma cools, different minerals crystallize at different times depending
on the cooling temperature of that respective mineral. This ultimately
changes the chemical composition of the melt as different minerals begin
to crystallize. Fractional crystallization of elements in basaltic
liquids has also been studied to observe the composition of lava in the
upper mantle. This concept can be applied by scientists to give insight
on the evolution of Earth's mantle and how concentrations of lithophile
trace elements have varied over the last 3.5 billion years.
- source_sentence: >-
The group designs numerous structural concepts such as frameworks and
floors like Dalle O'Portune and D-Dalle.
The timber design office of excellence is an entity specializing in the
design and optimization of wood construction projects. It stands out for
its ability to meet the highest demands in terms of performance,
durability and aesthetics, and is thus recognized for its contribution to
the realization of ambitious projects in the field of timber construction.
sentences:
- >-
The group designs numerous structural concepts such as frameworks and
floors like Dalle O'Portune and D-Dalle.
The timber design office of excellence is an entity specializing in the
design and optimization of wood construction projects. It stands out for
its ability to meet the highest demands in terms of performance,
durability and aesthetics, and is thus recognized for its contribution
to the realization of ambitious projects in the field of timber
construction.
- >-
In waterways, the term bridge strike may be used when a water vessel
collides with a bridge. This may include a collision to the bridge span
or a collision to the bridge support structure such as a pier. Bridge
protection systems are used to mitigate the effects of a ship strike.
In 2014, the United States Coast Guard published statistics that it
investigated 205 bridge strikes in the eleven years prior to the
publication. All of those collisions involved involved a fixed, swing,
lift or draw bridge. That number was 1.2% of all vessel collision
incidents investigated by the Coast Guard. The primary causal factor was
the lack of accurate air draft data, the distance between water surface
to the top most part of the vessel.
- >-
Post, Stephen Garrard. Encyclopedia of bioethics. Third edition.
Macmillan Reference USA, 2003. ISBN 0028657748. ISSN 0950-4125;
DOI:10.1108/09504120510573477. (5-Volume Set; 3062 pages).
Reich, Warren Thomas Encyclopedia of Bioethics. First edition. New
York: Free Press, 1978. ISBN 0029261805. ISBN 978-0029261804.
(4-Volume Set; 1933 pages)
Reich, Warren Thomas Encyclopedia of Bioethics. Second edition. New
York: Free Press, 1982. (5-Volume Set; 2950 pages)
Reich, Warren Thomas Encyclopedia of Bioethics. Third edition. New
York: Simon & Schuster Macmillan, 1995; London: Simon and Schuster and
Prentice Hall International, c1995. Rev. ed. (5-Volume Set; 2950 pages;
464 articles) ISBN 0028973550. ISBN 978-0028973555.
- source_sentence: >-
Regression is used to make predictions based on the retrieved data through
statistical trends and statistical modeling. Different uses of this
technique are used for fetching Photometric redshifts and measurements of
physical parameters of stars. The approaches are listed below:
Artificial neural network (ANN)
Support vector regression (SVR)
Decision tree
Random forest
k-nearest neighbors regression
Kernel regression
Principal component regression (PCR)
Gaussian process
Least squared regression (LSR)
Partial least squares regression
sentences:
- >-
Regression is used to make predictions based on the retrieved data
through statistical trends and statistical modeling. Different uses of
this technique are used for fetching Photometric redshifts and
measurements of physical parameters of stars. The approaches are listed
below:
Artificial neural network (ANN)
Support vector regression (SVR)
Decision tree
Random forest
k-nearest neighbors regression
Kernel regression
Principal component regression (PCR)
Gaussian process
Least squared regression (LSR)
Partial least squares regression
- >-
Clandestine chemistry is not limited to drugs; it is also associated
with explosives, and other illegal chemicals. Of the explosives
manufactured illegally, nitroglycerin and acetone peroxide are easiest
to produce due to the ease with which the precursors can be acquired.
Uncle Fester is a writer who commonly writes about different aspects of
clandestine chemistry. Secrets of Methamphetamine Manufacture is among
his most popular books, and is considered required reading for DEA
agents. More of his books deal with other aspects of clandestine
chemistry, including explosives, and poisons. Fester is, however,
considered by many to be a faulty and unreliable source for information
in regard to the clandestine manufacture of chemicals.
- >-
A novel input representation has been developed consisting of a
combination of sparse encoding, Blosum encoding, and input derived from
hidden Markov models. this method predicts T-cell epitopes for the
genome of hepatitis C virus and discuss possible applications of the
prediction method to guide the process of rational vaccine design.
- source_sentence: >-
Burray and The Barriers
Undiscovered Scotland: The Churchill Barriers
Our Past History: The Churchill Barriers Archived 17 December 2006 at the
Wayback Machine
Okneypics.com: photos of the barrier Archived 15 May 2008 at the Wayback
Machine
sentences:
- |-
For a neuron, in the limit of
b
=
0
{\displaystyle b=0}
, the map becomes 1D, since
y
{\displaystyle y}
converges to a constant. If the parameter
b
{\displaystyle b}
is scanned in a range, different orbits will be seen, some periodic, others chaotic, that appear between two fixed points, one at
x
=
1
{\displaystyle x=1}
;
y
=
1
{\displaystyle y=1}
and the other close to the value of
k
{\displaystyle k}
(which would be the regime excitable).
== References ==
- >-
Cerebellar Purkinje neurons have been proposed to have two distinct
bursting modes: dendritically driven, by dendritic Ca2+ spikes, and
somatically driven, wherein the persistent Na+ current is the burst
initiator and the SK K+ current is the burst terminator. Purkinje
neurons may utilise these bursting forms in information coding to the
deep cerebellar nuclei.
- >-
Burray and The Barriers
Undiscovered Scotland: The Churchill Barriers
Our Past History: The Churchill Barriers Archived 17 December 2006 at
the Wayback Machine
Okneypics.com: photos of the barrier Archived 15 May 2008 at the Wayback
Machine
pipeline_tag: sentence-similarity
library_name: sentence-transformers
SentenceTransformer based on zacbrld/MNLP_M2_document_encoder
This is a sentence-transformers model finetuned from zacbrld/MNLP_M2_document_encoder. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: zacbrld/MNLP_M2_document_encoder
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
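The Pooling and Normalize modules listed above can be sketched in plain Python as follows. This is only an illustration of mean pooling followed by L2 normalization, not the library's implementation; the function name `mean_pool_and_normalize` and the toy 4-dimensional vectors are assumptions made for the example (the model itself produces 384-dimensional token vectors), and the sketch assumes at least one non-padding token.

```python
import math

def mean_pool_and_normalize(token_vectors, attention_mask):
    """Average the vectors of non-padding tokens, then scale to unit length."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    dim = len(token_vectors[0])
    # Mean over real tokens only; padding positions (mask 0) are ignored.
    pooled = [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]
    # L2 normalization, so a dot product of two outputs equals their cosine similarity.
    norm = math.sqrt(sum(x * x for x in pooled))
    return [x / norm for x in pooled]

# Toy input: 3 token positions with 4-dimensional vectors (hypothetical values).
token_vectors = [[1.0, 0.0, 2.0, 0.0], [3.0, 0.0, 2.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
attention_mask = [1, 1, 0]  # the last position is padding

sentence_vector = mean_pool_and_normalize(token_vectors, attention_mask)
print(sentence_vector)
```

Because the output has unit length, cosine similarity between two sentence vectors reduces to a dot product.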
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("zacbrld/MNLP_M2_document_encoder")
# Run inference
sentences = [
'Burray and The Barriers\nUndiscovered Scotland: The Churchill Barriers\nOur Past History: The Churchill Barriers Archived 17 December 2006 at the Wayback Machine\nOkneypics.com: photos of the barrier Archived 15 May 2008 at the Wayback Machine',
'Burray and The Barriers\nUndiscovered Scotland: The Churchill Barriers\nOur Past History: The Churchill Barriers Archived 17 December 2006 at the Wayback Machine\nOkneypics.com: photos of the barrier Archived 15 May 2008 at the Wayback Machine',
'Cerebellar Purkinje neurons have been proposed to have two distinct bursting modes: dendritically driven, by dendritic Ca2+ spikes, and somatically driven, wherein the persistent Na+ current is the burst initiator and the SK K+ current is the burst terminator. Purkinje neurons may utilise these bursting forms in information coding to the deep cerebellar nuclei.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
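Since the card lists cosine similarity as the similarity function, the [3, 3] matrix above holds pairwise cosine scores. The dependency-free sketch below shows what such a matrix looks like; `cosine_similarity_matrix` and the toy vectors are hypothetical stand-ins for the model's 384-dimensional output, not part of the library.

```python
import math

def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between all rows of `vectors`."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    return [
        [sum(a * b for a, b in zip(u, v)) / (nu * nv) for v, nv in zip(vectors, norms)]
        for u, nu in zip(vectors, norms)
    ]

# Toy stand-ins for embeddings: rows 0 and 1 are identical, mirroring the
# duplicated first two sentences in the usage example.
toy_embeddings = [[0.6, 0.8, 0.0], [0.6, 0.8, 0.0], [0.0, 0.6, 0.8]]

scores = cosine_similarity_matrix(toy_embeddings)
for row in scores:
    print([round(s, 3) for s in row])
```

Identical inputs score (approximately) 1.0 against each other, and the matrix is symmetric.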
Training Details
Training Dataset
Unnamed Dataset
- Size: 5,489 training samples
- Columns: sentence_0 and sentence_1
- Approximate statistics based on the first 1000 samples:
| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | min: 34 tokens, mean: 144.23 tokens, max: 256 tokens | min: 34 tokens, mean: 144.23 tokens, max: 256 tokens |
- Samples:
| sentence_0 | sentence_1 |
|---|---|
| In related work, Smoller, Temple, and Vogler propose that this shockwave may have resulted in our part of the universe having a lower density than that surrounding it, causing the accelerated expansion normally attributed to dark energy.<br>They also propose that this related theory could be tested: a universe with dark energy should give a figure for the cubic correction to redshift versus luminosity C = −0.180 at a = a whereas for Smoller, Temple, and Vogler's alternative C should be positive rather than negative. They give a more precise calculation for their wave model alternative as: the cubic correction to redshift versus luminosity at a = a is C = 0.359. | In related work, Smoller, Temple, and Vogler propose that this shockwave may have resulted in our part of the universe having a lower density than that surrounding it, causing the accelerated expansion normally attributed to dark energy.<br>They also propose that this related theory could be tested: a universe with dark energy should give a figure for the cubic correction to redshift versus luminosity C = −0.180 at a = a whereas for Smoller, Temple, and Vogler's alternative C should be positive rather than negative. They give a more precise calculation for their wave model alternative as: the cubic correction to redshift versus luminosity at a = a is C = 0.359. |
| Evolution is a central organizing concept in biology. It is the change in heritable characteristics of populations over successive generations. In artificial selection, animals were selectively bred for specific traits.<br>Given that traits are inherited, populations contain a varied mix of traits, and reproduction is able to increase any population, Darwin argued that in the natural world, it was nature that played the role of humans in selecting for specific traits. Darwin inferred that individuals who possessed heritable traits better adapted to their environments are more likely to survive and produce more offspring than other individuals. He further inferred that this would lead to the accumulation of favorable traits over successive generations, thereby increasing the match between the organisms and their environment. | Evolution is a central organizing concept in biology. It is the change in heritable characteristics of populations over successive generations. In artificial selection, animals were selectively bred for specific traits.<br>Given that traits are inherited, populations contain a varied mix of traits, and reproduction is able to increase any population, Darwin argued that in the natural world, it was nature that played the role of humans in selecting for specific traits. Darwin inferred that individuals who possessed heritable traits better adapted to their environments are more likely to survive and produce more offspring than other individuals. He further inferred that this would lead to the accumulation of favorable traits over successive generations, thereby increasing the match between the organisms and their environment. |
| The total number of engineers employed in the U.S. in 2015 was roughly 1.6 million. Of these, 278,340 were mechanical engineers (17.28%), the largest discipline by size. In 2012, the median annual income of mechanical engineers in the U.S. workforce was $80,580. The median income was highest when working for the government ($92,030), and lowest in education ($57,090). In 2014, the total number of mechanical engineering jobs was projected to grow 5% over the next decade. As of 2009, the average starting salary was $58,800 with a bachelor's degree. | The total number of engineers employed in the U.S. in 2015 was roughly 1.6 million. Of these, 278,340 were mechanical engineers (17.28%), the largest discipline by size. In 2012, the median annual income of mechanical engineers in the U.S. workforce was $80,580. The median income was highest when working for the government ($92,030), and lowest in education ($57,090). In 2014, the total number of mechanical engineering jobs was projected to grow 5% over the next decade. As of 2009, the average starting salary was $58,800 with a bachelor's degree. |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
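To make the loss configuration concrete, here is a minimal sketch of a multiple-negatives ranking objective with scale 20.0 and cosine similarity, written in plain Python. It assumes the usual formulation (cross-entropy over scaled similarity scores, with the other positives in the batch acting as negatives) and is not the library's code; the function names and the toy 3-dimensional vectors are hypothetical.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every other positives[j] in the batch serves as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch of three pairs; each anchor equals its positive, like the samples shown above.
anchors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
positives = [row[:] for row in anchors]
matched = multiple_negatives_ranking_loss(anchors, positives)

# Misaligned pairs for contrast: every anchor's true match sits at the wrong index.
shuffled = multiple_negatives_ranking_loss(anchors, positives[1:] + positives[:1])
print(matched, shuffled)
```

In this toy setting, identical and well-separated pairs give a loss very close to zero, while misaligned pairs give a loss near the scale value.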
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 5
- multi_dataset_batch_sampler: round_robin
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- tp_size: 0
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
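The combination of learning_rate 5e-05, lr_scheduler_type linear and warmup_steps 0 implies a learning rate that decays linearly to zero over training. The sketch below illustrates that schedule under the assumption of a single training device (so 344 optimizer steps per epoch); `linear_lr` is a hypothetical helper, not a function from the training library.

```python
import math

def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# 5,489 samples at batch size 16 for 5 epochs; one device is an assumption.
total_steps = math.ceil(5489 / 16) * 5

for step in (0, 500, 1000, 1500, total_steps):
    print(step, linear_lr(step, total_steps))
```

With more than one device or a different effective batch size, the total step count (and therefore the slope) would differ.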
Training Logs
| Epoch | Step | Training Loss |
|---|---|---|
| 1.4535 | 500 | 0.0002 |
| 2.9070 | 1000 | 0.0 |
| 4.3605 | 1500 | 0.0007 |
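The fractional epoch values in the table follow from the dataset size and batch size. As a quick check, assuming a single training device and using the listed settings (dataloader_drop_last False, gradient_accumulation_steps 1):

```python
import math

dataset_size = 5489  # training samples, from the card
batch_size = 16      # per_device_train_batch_size; one device is assumed here

# The last partial batch is kept because dataloader_drop_last is False.
steps_per_epoch = math.ceil(dataset_size / batch_size)

epochs_at_step = {step: f"{step / steps_per_epoch:.4f}" for step in (500, 1000, 1500)}
for step, epoch in epochs_at_step.items():
    print(step, epoch)
# 500 1.4535
# 1000 2.9070
# 1500 4.3605
```

This gives 344 steps per epoch, which reproduces the Epoch column of the log.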
Framework Versions
- Python: 3.10.11
- Sentence Transformers: 3.4.1
- Transformers: 4.51.3
- PyTorch: 2.6.0
- Accelerate: 1.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}