metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:5700
- loss:TripletLoss
base_model: thenlper/gte-small
widget:
- source_sentence: >-
Perchloric acid (HClO4) is considered one of the stronger acids in
existence. Which of the following statements corresponds most accurately
with strong acids?
sentences:
- >-
Who argued that if an organization did not affect a public then there
was no need for a practitioner to consider that public in its
communications?
- 'Glycogen breakdown in exercising muscle is activated by:'
- The collision theory of reaction rates does not include
- source_sentence: 'In Aristotle’s terminology, incontinence is when:'
sentences:
- >-
This question refers to the following information.
The pair of excerpts below is written by explorer Christopher Columbus
and the Dominican Bishop of Chiapas, Mexico, Bartholomew de las Casas.
Source 1
Indians would give whatever the seller required. . . . Thus they
bartered, like idiots, cotton and gold for fragments of bows, glasses,
bottles, and jars; which I forbad as being unjust, and myself gave them
many beautiful and acceptable articles which I had brought with me,
taking nothing from them in return; I did this in order that I might the
more easily conciliate them, that they might be led to become
Christians, and be inclined to entertain a regard for the King and
Queen, our Princes and all Spaniards, and that I might induce them to
take an interest in seeking out, and collecting and delivering to us
such things as they possessed in abundance, but which we greatly needed.
—Christopher Columbus: letter to Raphael Sanchez, 1493
Source 2
It was upon these gentle lambs . . . that from the very first day they
clapped eyes on them the Spanish fell like ravening wolves upon the
fold, or like tigers and savage lions who have not eaten meat for days.
The pattern established at the outset has remained unchanged to this
day, and the Spaniards still do nothing save tear the natives to shreds,
murder them and inflict upon them untold misery, suffering and distress,
tormenting, harrying and persecuting them mercilessly. We shall in due
course describe some of the many ingenious methods of torture they have
invented and refined for this purpose, but one can get some idea of the
effectiveness of their methods from the figures alone. When the Spanish
first journeyed there, the indigenous population of the island of
Hispaniola stood at some three million; today only two hundred survive.
Their reason for killing and destroying such an infinite number of souls
is that the Christians have an ultimate aim, which is to acquire gold,
and to swell themselves with riches in a very brief time and thus rise
to a high estate disproportionate to their merits.
—Bartholomew de las Casas: A Short Account of the Destruction of the
Indies, 1542
Which of the following would best account for the differences between
the interactions of the Spaniards and the natives as described in the
two accounts?
- >-
For Plato, ordinary sensible objects exist and are knowable as examples
or instances of Ideas or "Forms" that do not exist in our ordinary
sensible world. Forms do not exist in the sensible world because:
- >-
A solution of a weak base is titrated with a solution of a standard
strong acid. The progress of the titration is followed with a pH meter.
Which of the following observations would occur?
- source_sentence: Which of the following causes more deaths globally each year (as of 2017)?
sentences:
- >-
About what percentage of survey respondents from China report having
paid a bribe in the last year to access public services (such as
education; judiciary; medical and health; police; registry and permit
services; utilities; tax revenue and customs; and land service) as of
2017?
- ' In response to the objection that it would be wrong to prohibit the manufacture and sale of fatty foods and tobacco products, de Marneffe argues that'
- Which of the following about meiosis is NOT true?
- source_sentence: What is 'unilateral acts'?
sentences:
- >-
Which of the following statements is true concerning the population
regression function (PRF) and sample regression function (SRF)?
- Find the maximum possible order for some element of Z_8 x Z_10 x Z_24.
- What is jus cogens?
- source_sentence: >-
This question refers to the following information.
"Those whose condition is such that their function is the use of their
bodies and nothing better can be expected of them, those, I say, are
slaves of nature. It is better for them to be ruled thus."
Juan de Sepulveda, Politics, 1522
"When Latin American nations gained independence in the 19th century,
those two strains converged, and merged with an older, more universalist,
natural law tradition. The result was a distinctively Latin American form
of rights discourse. Paolo Carozza traces the roots of that discourse to a
distinctive application, and extension, of Thomistic moral philosophy to
the injustices of Spanish conquests in the New World. The key figure in
that development seems to have been Bartolomé de Las Casas, a 16th-century
Spanish bishop who condemned slavery and championed the cause of Indians
on the basis of a natural right to liberty grounded in their membership in
a single common humanity. 'All the peoples of the world are humans,' Las
Casas wrote, and 'all the races of humankind are one.' According to Brian
Tierney, Las Casas and other Spanish Dominican philosophers laid the
groundwork for a doctrine of natural rights that was independent of
religious revelation 'by drawing on a juridical tradition that derived
natural rights and natural law from human rationality and free will, and
by appealing to Aristotelian philosophy.'"
Mary Ann Glendon, "The Forgotten Crucible: The Latin American Influence on
the Universal Human Rights Idea,” 2003
Which one of the following statements about the Spanish conquest of the
Americas is most accurate?
sentences:
- >-
Statement 1 | If T: V -> W is a linear transformation and dim(V ) <
dim(W) < 1, then T must be injective. Statement 2 | Let dim(V) = n and
suppose that T: V -> V is linear. If T is injective, then it is a
bijection.
- >-
If the finite group G contains a subgroup of order seven but no element
(other than the identity) is its own inverse, then the order of G could
be
- >-
This question refers to the following information.
"One-half of the people of this nation to-day are utterly powerless to
blot from the statute books an unjust law, or to write there a new and a
just one. The women, dissatisfied as they are with this form of
government, that enforces taxation without representation,—that compels
them to obey laws to which they have never given their consent,—that
imprisons and hangs them without a trial by a jury of their peers, that
robs them, in marriage, of the custody of their own persons, wages and
children,—are this half of the people left wholly at the mercy of the
other half, in direct violation of the spirit and letter of the
declarations of the framers of this government, every one of which was
based on the immutable principle of equal rights to all."
—Susan B. Anthony, "I Stand Before You Under Indictment" (speech), 1873
Which of the following statements best represents the criticism of
Andrew Carnegie found in this cartoon?
pipeline_tag: sentence-similarity
library_name: sentence-transformers
SentenceTransformer based on thenlper/gte-small
This is a sentence-transformers model fine-tuned from thenlper/gte-small. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: thenlper/gte-small
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
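The Pooling module above has `pooling_mode_mean_tokens` set to True, so each sentence vector is the average of its token embeddings. The following is a minimal pure-Python sketch of masked mean pooling, not the library's implementation (which works on batched tensors); the function name `mean_pool` and the toy 2-dimensional vectors are illustrative assumptions.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [total / count for total in summed]


# Toy example: three token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```

In the real model the same averaging runs over 384-dimensional token vectors, which is why the output dimensionality matches `word_embedding_dimension`.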
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Alexhuou/embedder_model_FT")
# Run inference
sentences = [
'This question refers to the following information.\n"Those whose condition is such that their function is the use of their bodies and nothing better can be expected of them, those, I say, are slaves of nature. It is better for them to be ruled thus."\nJuan de Sepulveda, Politics, 1522\n"When Latin American nations gained independence in the 19th century, those two strains converged, and merged with an older, more universalist, natural law tradition. The result was a distinctively Latin American form of rights discourse. Paolo Carozza traces the roots of that discourse to a distinctive application, and extension, of Thomistic moral philosophy to the injustices of Spanish conquests in the New World. The key figure in that development seems to have been Bartolomé de Las Casas, a 16th-century Spanish bishop who condemned slavery and championed the cause of Indians on the basis of a natural right to liberty grounded in their membership in a single common humanity. \'All the peoples of the world are humans,\' Las Casas wrote, and \'all the races of humankind are one.\' According to Brian Tierney, Las Casas and other Spanish Dominican philosophers laid the groundwork for a doctrine of natural rights that was independent of religious revelation \'by drawing on a juridical tradition that derived natural rights and natural law from human rationality and free will, and by appealing to Aristotelian philosophy.\'"\nMary Ann Glendon, "The Forgotten Crucible: The Latin American Influence on the Universal Human Rights Idea,” 2003\nWhich one of the following statements about the Spanish conquest of the Americas is most accurate?',
'This question refers to the following information.\n"One-half of the people of this nation to-day are utterly powerless to blot from the statute books an unjust law, or to write there a new and a just one. The women, dissatisfied as they are with this form of government, that enforces taxation without representation,—that compels them to obey laws to which they have never given their consent,—that imprisons and hangs them without a trial by a jury of their peers, that robs them, in marriage, of the custody of their own persons, wages and children,—are this half of the people left wholly at the mercy of the other half, in direct violation of the spirit and letter of the declarations of the framers of this government, every one of which was based on the immutable principle of equal rights to all."\n—Susan B. Anthony, "I Stand Before You Under Indictment" (speech), 1873\nWhich of the following statements best represents the criticism of Andrew Carnegie found in this cartoon?',
'If the finite group G contains a subgroup of order seven but no element (other than the identity) is its own inverse, then the order of G could be',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
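The card lists cosine similarity as the similarity function, so the 3 x 3 result above should hold pairwise cosine scores. As a rough illustration of what such a matrix contains, here is a pure-Python sketch on toy 2-dimensional vectors; the helper names are hypothetical and the real embeddings are 384-dimensional.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """All-pairs cosine similarity, analogous in shape to the [3, 3] result."""
    return [[cosine_similarity(a, b) for b in vectors] for a in vectors]


toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matrix = similarity_matrix(toy_embeddings)
for row in matrix:
    print([round(score, 4) for score in row])
```

The diagonal is 1 because every vector is compared with itself, and the matrix is symmetric.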
Training Details
Training Dataset
Unnamed Dataset
- Size: 5,700 training samples
- Columns: sentence_0, sentence_1, and sentence_2
- Approximate statistics based on the first 1000 samples:

| | sentence_0 | sentence_1 | sentence_2 |
|---|---|---|---|
| type | string | string | string |
| details | min: 3 tokens, mean: 44.65 tokens, max: 512 tokens | min: 3 tokens, mean: 44.72 tokens, max: 512 tokens | min: 3 tokens, mean: 47.14 tokens, max: 512 tokens |
- Samples:

| sentence_0 | sentence_1 | sentence_2 |
|---|---|---|
| This question refers to the following information.<br>"When the Portuguese go from Macao in China to Japan, they carry much white silk, gold, musk, and porcelain: and they bring from Japan nothing but silver. They have a great carrack which goes there every year and she brings from there every year about six hundred coins: and all this silver of Japan, and two hundred thousand coins more in silver which they bring yearly out of India, they employ to their great advantage in China: and they bring from there gold, musk, silk, copper, porcelains, and many other things very costly and gilded.<br>When the Portuguese come to Canton in China to traffic, they must remain there but certain days: and when they come in at the gate of the city, they must enter their names in a book, and when they go out at night they must put out their names. They may not lie in the town all night, but must lie in their boats outside of the town. And, their time expired, if any man remains there, he is imprisoned."<br>Ralp... | This question refers to the following information.<br>Although in Protestant Europe, [Peter the Great] was surrounded by evidence of the new civil and political rights of individual men embodied in constitutions, bills of rights and parliaments, he did not return to Russia determined to share power with his people. On the contrary, he returned not only determined to change his country but also convinced that if Russia was to be transformed, it was he who must provide both the direction and the motive force. He would try to lead; but where education and persuasion were not enough, he could drive—and if necessary flog—the backward nation forward.<br>—Robert K. Massie, Peter the Great: His Life and World<br>Based on the above passage, what kinds of reforms did Peter the Great embrace? | This question refers to the following information.<br>Now, we have organized a society, and we call it "Share Our Wealth Society," a society with the motto "Every Man a King."…<br>We propose to limit the wealth of big men in the country. There is an average of $15,000 in wealth to every family in America. That is right here today.<br>We do not propose to divide it up equally. We do not propose a division of wealth, but we do propose to limit poverty that we will allow to be inflicted on any man's family. We will not say we are going to try to guarantee any equality … but we do say that one third of the average is low enough for any one family to hold, that there should be a guarantee of a family wealth of around $5,000; enough for a home, an automobile, a radio, and the ordinary conveniences, and the opportunity to educate their children.…<br>We will have to limit fortunes. Our present plan is that we will allow no man to own more than $50,000,000. We think that with that limit we will be able to ... |
| This question refers to the following information.<br>An Act to place certain restrictions on Immigration and to provide for the removal from the Commonwealth of Prohibited Immigrants.<br>…<br>3. The immigration into the Commonwealth of the persons described in any of the following paragraphs in this section (hereinafter called "prohibited immigrants") is prohibited, namely<br>(a) Any person who when asked to do so by an officer fails to write out at dictation and sign in the presence of the officer a passage of fifty words in length in a European language directed by the officer;<br>(b) Any person in the opinion of the Minister or of an officer to become a charge upon the public or upon any public or charitable organisation;<br>…<br>(g) Any persons under a contract or agreement to perform manual labour within the Commonwealth: Provided that this paragraph shall not apply to workmen exempted by the Minister for special skill required by Australia…<br>Immigration Restriction Act of 1901 (Australia)<br>Whereas in ... | This question refers to the following information.<br>"My little homestead in the city, which I recently insured for £2,000 would no doubt have shared the common fate, as the insurance companies will not make good that which is destroyed by the Queen's enemies. And although I have a farm of 50 acres close to the town, no doubt the crops and premises would have been destroyed. In fact, this has already partly been the case, and I am now suing the Government for damages done by a contingent of 1,500 natives that have recently encamped not many hundred yards from the place, who have done much damage all around."<br>Letter from a British citizen to his sister during the Anglo-Zulu War, South Africa, 1879<br>Incidents such as those described by the author of the letter were used by the British government to do which of the following? | If the price of a product decreases with the price of a substitute product remaining constant such that the consumer buys more of this product, this is called the |
| The term 'marketing mix' describes: | ____________________ are those who address the same target market but provide a different offering to satisfy the market need, for example Spotify, Sony, and Apple's iPod. | The anatomic location of the spinal canal is |

- Loss: TripletLoss with these parameters:
  { "distance_metric": "TripletDistanceMetric.EUCLIDEAN", "triplet_margin": 5 }
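To make the loss parameters concrete: with a Euclidean distance metric and a margin of 5, a triplet contributes zero loss only once the negative is at least 5 units farther from the anchor than the positive. The sketch below is a per-triplet illustration in pure Python, not the library's batched implementation. It assumes the three columns are ordered anchor, positive, negative (the card does not state this explicitly), and the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=5.0):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0) for one triplet."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)


anchor = [0.0, 0.0]
positive = [3.0, 4.0]       # distance 5 from the anchor
far_negative = [6.0, 8.0]   # distance 10 from the anchor
near_negative = [0.0, 1.0]  # distance 1 from the anchor

print(triplet_loss(anchor, positive, far_negative))   # 0.0
print(triplet_loss(anchor, positive, near_negative))  # 9.0
```

The second call shows the penalized case: the negative sits closer to the anchor than the positive, so the loss is the distance gap plus the margin.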
Training Hyperparameters
Non-Default Hyperparameters
multi_dataset_batch_sampler: round_robin
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
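With `lr_scheduler_type: linear`, `warmup_steps: 0`, and `learning_rate: 5e-05`, the learning rate should fall in a straight line from its initial value to zero over the run. The sketch below illustrates that shape in pure Python. The function name `linear_lr` is hypothetical, and the total of 2139 steps is an assumption derived from 5,700 samples, batch size 8, 3 epochs, and a single device, none of which the card states as a step count.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumed total: ceil(5700 / 8) = 713 steps per epoch, times 3 epochs.
TOTAL_STEPS = 713 * 3

for step in (0, 500, 1000, 1500, 2000, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS))
```

With no warmup, the first optimizer step already uses the full 5e-05 rate.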
Training Logs
| Epoch | Step | Training Loss |
|---|---|---|
| 1.4006 | 500 | 1.7059 |
| 2.8011 | 1000 | 0.88 |
| 0.7013 | 500 | 0.7526 |
| 1.4025 | 1000 | 0.6176 |
| 2.1038 | 1500 | 0.4602 |
| 2.8050 | 2000 | 0.3472 |
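The Epoch column can be cross-checked against the dataset size and batch size listed earlier. The sketch below assumes a single device, `gradient_accumulation_steps` of 1, and a final partial batch that is kept (`dataloader_drop_last: False`); the helper names are hypothetical.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Optimizer steps in one epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)


def epoch_at_step(step, num_samples, batch_size):
    """Fractional epoch reached after a given number of steps."""
    return step / steps_per_epoch(num_samples, batch_size)


print(steps_per_epoch(5700, 8))  # 713
for step in (500, 1000, 1500, 2000):
    print(step, f"{epoch_at_step(step, 5700, 8):.4f}")
```

Under these assumptions the last four rows of the table (0.7013, 1.4025, 2.1038, 2.8050) line up with 5,700 samples at batch size 8. The first two rows (1.4006 at step 500, 2.8011 at step 1000) would instead correspond to roughly 357 steps per epoch, which suggests they may come from a separate run on a smaller dataset; the card does not say.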
Framework Versions
- Python: 3.11.13
- Sentence Transformers: 4.1.0
- Transformers: 4.52.4
- PyTorch: 2.6.0+cu124
- Accelerate: 1.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
TripletLoss
@misc{hermans2017defense,
title={In Defense of the Triplet Loss for Person Re-Identification},
author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
year={2017},
eprint={1703.07737},
archivePrefix={arXiv},
primaryClass={cs.CV}
}