	---
	tags:
	- sentence-transformers
	- sentence-similarity
	- feature-extraction
	- generated_from_trainer
	- dataset_size:5700
	- loss:TripletLoss
	base_model: thenlper/gte-small
	widget:
	- source_sentence: Suppose there is a correlation of r = 0.9 between number of hours
	per day students study and GPAs. Which of the following is a reasonable conclusion?
	sentences:
	- 'Ulcerative Colitis

	'
	- Given that the sample has a standard deviation of zero, which of the following
	is a true statement?
	- Which of the following items is not subject to the application of intraperiod
	income tax allocation?
	- source_sentence: The natural law fallacy is
	sentences:
	- The Theory of _________ posits that 3 three levels of moral reasoning exist which
	an individual can engage in to assess ethical issues, dependant on their cognitive
	capacity.
	- Which of the following is another name for the fallacy of amphiboly?
	- 'Which of the following are plausible approaches to dealing with a model that
	exhibits heteroscedasticity?


	i) Take logarithms of each of the variables


	ii) Use suitably modified standard errors


	iii) Use a generalised least squares procedure


	iv) Add lagged values of the variables to the regression equation.'
	- source_sentence: When the ratio of brain size to body size is compared, which species
	has a proportionally larger brain?
	sentences:
	- 'A proposed explanation for some phenomenon that may be derived initially from
	empirical observation through a process called induction is a:'
	- 'Let R be a ring and let U and V be (two-sided) ideals of R. Which of the following
	must also be ideals of R ?

	I. {u + v : u \in and v \in V}

	II. {uv : u \in U and v \in V}

	III. {x : x \in U and x \in V}'
	- Find 3 over 4 − 1 over 8.
	- source_sentence: The AH Protocol provides source authentication and data integrity,
	but not
	sentences:
	- 'Ethnographic research produces qualitative data because:'
	- Let V be the real vector space of all real 2 x 3 matrices, and let W be the real
	vector space of all real 4 x 1 column vectors. If T is a linear transformation
	from V onto W, what is the dimension of the subspace kernel of T?
	- Which of the following is not a block cipher operating mode?
	- source_sentence: 'This question refers to the following information.

	Gunpowder Weaponry: Europe vs. China

	In Western Europe during the 1200s through the 1400s, early cannons, as heavy
	and as slow to fire as they were, proved useful enough in the protracted sieges
	that dominated warfare during this period that governments found it sufficiently
	worthwhile to pay for them and for the experimentation that eventually produced
	gunpowder weapons that were both more powerful and easier to move. By contrast,
	China, especially after the mid-1300s, was threatened mainly by highly mobile
	steppe nomads, against whom early gunpowder weapons, with their unwieldiness,
	proved of little utility. It therefore devoted its efforts to the improvement
	of horse archer units who could effectively combat the country''s deadliest foe.

	According to this passage, why did the Chinese, despite inventing gunpowder, fail
	to lead in the innovation of gunpowder weaponry?'
	sentences:
	- Statement 1| Maximizing the likelihood of logistic regression model yields multiple
	local optimums. Statement 2| No classifier can do better than a naive Bayes classifier
	if the distribution of the data is known.
	- What is the term for decisions limited by human capacity to absorb and analyse
	information?
	- 'This question refers to the following information.

	By what principle of reason then, should these foreigners send in return a poisonous
	drug? Without meaning to say that the foreigners harbor such destructive intentions
	in their hearts, we yet positively assert that from their inordinate thirst after
	gain, they are perfectly careless about the injuries they inflict upon us! And
	such being the case, we should like to ask what has become of that conscience
	which heaven has implanted in the breasts of all men? We have heard that in your
	own country opium is prohibited with the utmost strictness and severity. This
	is a strong proof that you know full well how hurtful it is to mankind. Since
	you do not permit it to injure your own country, you ought not to have this injurious
	drug transferred to another country, and above all others, how much less to the
	Inner Land! Of the products which China exports to your foreign countries, there
	is not one which is not beneficial to mankind in some shape or other.

	Lin Zexu, Chinese trade commissioner, letter to Queen Victoria, 1839

	On which of the following arguments does the author of the passage principally
	base his appeal?'
	pipeline_tag: sentence-similarity
	library_name: sentence-transformers
	---

	# SentenceTransformer based on thenlper/gte-small

	This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [thenlper/gte-small](https://huggingface.co/thenlper/gte-small). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

	## Model Details

	### Model Description
	- Model Type: Sentence Transformer
	- Base model: [thenlper/gte-small](https://huggingface.co/thenlper/gte-small) <!-- at revision 17e1f347d17fe144873b1201da91788898c639cd -->
	- Maximum Sequence Length: 512 tokens
	- Output Dimensionality: 384 dimensions
	- Similarity Function: Cosine Similarity
	<!-- - Training Dataset: Unknown -->
	<!-- - Language: Unknown -->
	<!-- - License: Unknown -->

	### Model Sources

	- Documentation: [Sentence Transformers Documentation](https://sbert.net)
	- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
	- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

	### Full Model Architecture

	```
	SentenceTransformer(
	  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	)
	```
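The `Pooling` module above has only `pooling_mode_mean_tokens` enabled, so each sentence vector is the average of its non-padding token vectors. The following is a minimal pure-Python sketch of that idea, not the library's implementation; `mean_pool` and the toy inputs are hypothetical names chosen for illustration.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1 (hypothetical helper)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy example: three token vectors of width 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

In the real model the token vectors are 384-dimensional and come from the `BertModel` layer; the masking step is what keeps padding tokens from diluting the average.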

	## Usage

	### Direct Usage (Sentence Transformers)

	First install the Sentence Transformers library:

	```bash
	pip install -U sentence-transformers
	```

	Then you can load this model and run inference.
	```python
	from sentence_transformers import SentenceTransformer

	# Download from the 🤗 Hub
	model = SentenceTransformer("Alexhuou/embedder_model_STxmmluV3")
	# Run inference
	sentences = [
	"This question refers to the following information.\nGunpowder Weaponry: Europe vs. China\nIn Western Europe during the 1200s through the 1400s, early cannons, as heavy and as slow to fire as they were, proved useful enough in the protracted sieges that dominated warfare during this period that governments found it sufficiently worthwhile to pay for them and for the experimentation that eventually produced gunpowder weapons that were both more powerful and easier to move. By contrast, China, especially after the mid-1300s, was threatened mainly by highly mobile steppe nomads, against whom early gunpowder weapons, with their unwieldiness, proved of little utility. It therefore devoted its efforts to the improvement of horse archer units who could effectively combat the country's deadliest foe.\nAccording to this passage, why did the Chinese, despite inventing gunpowder, fail to lead in the innovation of gunpowder weaponry?",
	'This question refers to the following information.\nBy what principle of reason then, should these foreigners send in return a poisonous drug? Without meaning to say that the foreigners harbor such destructive intentions in their hearts, we yet positively assert that from their inordinate thirst after gain, they are perfectly careless about the injuries they inflict upon us! And such being the case, we should like to ask what has become of that conscience which heaven has implanted in the breasts of all men? We have heard that in your own country opium is prohibited with the utmost strictness and severity. This is a strong proof that you know full well how hurtful it is to mankind. Since you do not permit it to injure your own country, you ought not to have this injurious drug transferred to another country, and above all others, how much less to the Inner Land! Of the products which China exports to your foreign countries, there is not one which is not beneficial to mankind in some shape or other.\nLin Zexu, Chinese trade commissioner, letter to Queen Victoria, 1839\nOn which of the following arguments does the author of the passage principally base his appeal?',
	'What is the term for decisions limited by human capacity to absorb and analyse information?',
	]
	embeddings = model.encode(sentences)
	print(embeddings.shape)
	# [3, 384]

	# Get the similarity scores for the embeddings
	similarities = model.similarity(embeddings, embeddings)
	print(similarities.shape)
	# [3, 3]
	```
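Because the similarity function listed for this model is cosine similarity, the scores returned by `model.similarity` can be used to rank candidates against a query. The sketch below illustrates that ranking step with plain Python lists standing in for embeddings; `cosine_similarity`, `rank_by_similarity`, and the toy vectors are hypothetical and not part of the library.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], candidates: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (candidate index, score) pairs, most similar first."""
    scores = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 2-dimensional vectors standing in for 384-dimensional embeddings.
query = [1.0, 0.0]
candidates = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
ranking = rank_by_similarity(query, candidates)
print([index for index, _ in ranking])  # [1, 2, 0]
```

Note that candidate 1 ranks first even though it is twice as long as the query: cosine similarity compares direction only, not magnitude.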

	<!--
	### Direct Usage (Transformers)

	<details><summary>Click to see the direct usage in Transformers</summary>

	</details>
	-->

	<!--
	### Downstream Usage (Sentence Transformers)

	You can finetune this model on your own dataset.

	<details><summary>Click to expand</summary>

	</details>
	-->

	<!--
	### Out-of-Scope Use

	List how the model may foreseeably be misused and address what users ought not to do with the model.
	-->

	<!--
	## Bias, Risks and Limitations

	What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.
	-->

	<!--
	### Recommendations

	What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.
	-->

	## Training Details

	### Training Dataset

	#### Unnamed Dataset

	* Size: 5,700 training samples
	* Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>sentence_2</code>
	* Approximate statistics based on the first 1000 samples:
	|         | sentence_0 | sentence_1 | sentence_2 |
	|:--------|:-----------|:-----------|:-----------|
	| type    | string | string | string |
	| details | <ul><li>min: 4 tokens</li><li>mean: 47.47 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 51.06 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 47.92 tokens</li><li>max: 512 tokens</li></ul> |
	* Samples:
	| sentence_0 | sentence_1 | sentence_2 |
	|:-----------|:-----------|:-----------|
	| <code>This question refers to the following information.<br>Let us not, I beseech you sir, deceive ourselves. Sir, we have done everything that could be done, to avert the storm which is now coming on. We have petitioned; we have remonstrated; we have supplicated; we have prostrated ourselves before the throne, and have implored its interposition to arrest the tyrannical hands of the ministry and Parliament. Our petitions have been slighted; our remonstrances have produced additional violence and insult; our supplications have been disregarded; and we have been spurned, with contempt, from the foot of the throne. In vain, after these things, may we indulge the fond hope of peace and reconciliation. There is no longer any room for hope.… It is in vain, sir, to extenuate the matter. Gentlemen may cry, Peace, Peace, but there is no peace. The war is actually begun! The next gale that sweeps from the north will bring to our ears the clash of resounding arms! Our brethren are already in the field! W...</code> | <code>This question refers to the following information.<br>"In one view the slaveholders have a decided advantage over all opposition. It is well to notice this advantage—the advantage of complete organization. They are organized; and yet were not at the pains of creating their organizations. The State governments, where the system of slavery exists, are complete slavery organizations. The church organizations in those States are equally at the service of slavery; while the Federal Government, with its army and navy, from the chief magistracy in Washington, to the Supreme Court, and thence to the chief marshalship at New York, is pledged to support, defend, and propagate the crying curse of human bondage. The pen, the purse, and the sword, are united against the simple truth, preached by humble men in obscure places."<br>Frederick Douglass, 1857<br>Frederick Douglass was most influenced by which of the following social movements?</code> | <code>Replacing supply chains with _______ enhances the importance of product _______as well as a fundamental redesign of every activity a firm engages in that produces _______.</code> |
	| <code>Which of the following is a true statement about program documentation?</code> | <code>The boolean expression a[i] == max \|\| !(max != a[i]) can be simplified to</code> | <code>The insurance program for poor people of all ages is called</code> |
	| <code>If both parents are affected with the same autosomal recessive disorder then the probability that each of their children will be affected equals ___.</code> | <code>Which of the following conditions shows anticipation in paternal transmission?</code> | <code>From 1988 to 1990 among heterosexuals in the US, the number of unmarried adults aged 20 to 45 who report having multiple partners has:</code> |
	* Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters:
	```json
	{
	    "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
	    "triplet_margin": 5
	}
	```
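With a Euclidean distance metric and a margin of 5, the per-triplet loss is `max(d(anchor, positive) - d(anchor, negative) + 5, 0)`. The sketch below works through that formula on toy vectors. It assumes the usual column convention (`sentence_0` as anchor, `sentence_1` as positive, `sentence_2` as negative), which this card does not state explicitly; `euclidean` and `triplet_loss` are hypothetical helper names.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 5.0,  # matches "triplet_margin": 5 above
) -> float:
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


anchor = [0.0, 0.0]
positive = [3.0, 4.0]        # distance 5 from the anchor
far_negative = [6.0, 8.0]    # distance 10 from the anchor
near_negative = [0.0, 1.0]   # distance 1 from the anchor

print(triplet_loss(anchor, positive, far_negative))   # 0.0 (5 - 10 + 5)
print(triplet_loss(anchor, positive, near_negative))  # 9.0 (5 - 1 + 5)
```

The loss reaches zero only once the negative is at least 5 units farther from the anchor than the positive, which is why a training loss well below the margin indicates that many triplets are already separated.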

	### Training Hyperparameters
	#### Non-Default Hyperparameters

	- `num_train_epochs`: 5
	- `multi_dataset_batch_sampler`: round_robin

	#### All Hyperparameters
	<details><summary>Click to expand</summary>

	- `overwrite_output_dir`: False
	- `do_predict`: False
	- `eval_strategy`: no
	- `prediction_loss_only`: True
	- `per_device_train_batch_size`: 8
	- `per_device_eval_batch_size`: 8
	- `per_gpu_train_batch_size`: None
	- `per_gpu_eval_batch_size`: None
	- `gradient_accumulation_steps`: 1
	- `eval_accumulation_steps`: None
	- `torch_empty_cache_steps`: None
	- `learning_rate`: 5e-05
	- `weight_decay`: 0.0
	- `adam_beta1`: 0.9
	- `adam_beta2`: 0.999
	- `adam_epsilon`: 1e-08
	- `max_grad_norm`: 1
	- `num_train_epochs`: 5
	- `max_steps`: -1
	- `lr_scheduler_type`: linear
	- `lr_scheduler_kwargs`: {}
	- `warmup_ratio`: 0.0
	- `warmup_steps`: 0
	- `log_level`: passive
	- `log_level_replica`: warning
	- `log_on_each_node`: True
	- `logging_nan_inf_filter`: True
	- `save_safetensors`: True
	- `save_on_each_node`: False
	- `save_only_model`: False
	- `restore_callback_states_from_checkpoint`: False
	- `no_cuda`: False
	- `use_cpu`: False
	- `use_mps_device`: False
	- `seed`: 42
	- `data_seed`: None
	- `jit_mode_eval`: False
	- `use_ipex`: False
	- `bf16`: False
	- `fp16`: False
	- `fp16_opt_level`: O1
	- `half_precision_backend`: auto
	- `bf16_full_eval`: False
	- `fp16_full_eval`: False
	- `tf32`: None
	- `local_rank`: 0
	- `ddp_backend`: None
	- `tpu_num_cores`: None
	- `tpu_metrics_debug`: False
	- `debug`: []
	- `dataloader_drop_last`: False
	- `dataloader_num_workers`: 0
	- `dataloader_prefetch_factor`: None
	- `past_index`: -1
	- `disable_tqdm`: False
	- `remove_unused_columns`: True
	- `label_names`: None
	- `load_best_model_at_end`: False
	- `ignore_data_skip`: False
	- `fsdp`: []
	- `fsdp_min_num_params`: 0
	- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
	- `fsdp_transformer_layer_cls_to_wrap`: None
	- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
	- `deepspeed`: None
	- `label_smoothing_factor`: 0.0
	- `optim`: adamw_torch
	- `optim_args`: None
	- `adafactor`: False
	- `group_by_length`: False
	- `length_column_name`: length
	- `ddp_find_unused_parameters`: None
	- `ddp_bucket_cap_mb`: None
	- `ddp_broadcast_buffers`: False
	- `dataloader_pin_memory`: True
	- `dataloader_persistent_workers`: False
	- `skip_memory_metrics`: True
	- `use_legacy_prediction_loop`: False
	- `push_to_hub`: False
	- `resume_from_checkpoint`: None
	- `hub_model_id`: None
	- `hub_strategy`: every_save
	- `hub_private_repo`: None
	- `hub_always_push`: False
	- `gradient_checkpointing`: False
	- `gradient_checkpointing_kwargs`: None
	- `include_inputs_for_metrics`: False
	- `include_for_metrics`: []
	- `eval_do_concat_batches`: True
	- `fp16_backend`: auto
	- `push_to_hub_model_id`: None
	- `push_to_hub_organization`: None
	- `mp_parameters`:
	- `auto_find_batch_size`: False
	- `full_determinism`: False
	- `torchdynamo`: None
	- `ray_scope`: last
	- `ddp_timeout`: 1800
	- `torch_compile`: False
	- `torch_compile_backend`: None
	- `torch_compile_mode`: None
	- `include_tokens_per_second`: False
	- `include_num_input_tokens_seen`: False
	- `neftune_noise_alpha`: None
	- `optim_target_modules`: None
	- `batch_eval_metrics`: False
	- `eval_on_start`: False
	- `use_liger_kernel`: False
	- `eval_use_gather_object`: False
	- `average_tokens_across_devices`: False
	- `prompts`: None
	- `batch_sampler`: batch_sampler
	- `multi_dataset_batch_sampler`: round_robin

	</details>

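The settings above combine `lr_scheduler_type: linear`, `learning_rate: 5e-05`, and zero warmup, which means the learning rate decays in a straight line from its initial value to zero over the run. Below is a minimal sketch of that schedule; `linear_lr` is a hypothetical helper, and the total of 3565 optimizer steps is an assumption derived from 5,700 samples, a batch size of 8 on a single device, and 5 epochs.

```python
def linear_lr(
    step: int,
    total_steps: int,
    base_lr: float = 5e-5,
    warmup_steps: int = 0,
) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


TOTAL_STEPS = 3565  # assumption: ceil(5700 / 8) * 5 on one device

print(linear_lr(0, TOTAL_STEPS))            # 5e-05
print(linear_lr(TOTAL_STEPS, TOTAL_STEPS))  # 0.0
print(linear_lr(1000, TOTAL_STEPS))         # somewhere between the two
```

If training ran on several devices, the number of optimizer steps and therefore the slope of the decay would differ from this sketch.
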
	### Training Logs
	| Epoch  | Step | Training Loss |
	|:------:|:----:|:-------------:|
	| 0.7013 | 500  | 1.9382        |
	| 1.4025 | 1000 | 1.0882        |
	| 2.1038 | 1500 | 0.8478        |
	| 2.8050 | 2000 | 0.5961        |
	| 3.5063 | 2500 | 0.5179        |
	| 4.2076 | 3000 | 0.3774        |
	| 4.9088 | 3500 | 0.3646        |

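The fractional epoch values in the log follow from the dataset size and batch size reported earlier. The short check below reproduces them, assuming a single training device so that the effective batch size equals `per_device_train_batch_size` (8); the variable names are illustrative.

```python
import math

NUM_SAMPLES = 5700   # training samples reported above
BATCH_SIZE = 8       # per_device_train_batch_size, assuming one device
NUM_EPOCHS = 5

# dataloader_drop_last is False, so the final partial batch counts as a step.
steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS
print(steps_per_epoch, total_steps)  # 713 3565

logged_steps = [500, 1000, 1500, 2000, 2500, 3000, 3500]
epochs = [round(step / steps_per_epoch, 4) for step in logged_steps]
print(epochs)  # [0.7013, 1.4025, 2.1038, 2.805, 3.5063, 4.2076, 4.9088]
```

The computed values match the Epoch column, which is consistent with the single-device assumption.
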

	### Framework Versions
	- Python: 3.11.13
	- Sentence Transformers: 4.1.0
	- Transformers: 4.52.4
	- PyTorch: 2.6.0+cu124
	- Accelerate: 1.7.0
	- Datasets: 3.6.0
	- Tokenizers: 0.21.1

	## Citation

	### BibTeX

	#### Sentence Transformers
	```bibtex
	@inproceedings{reimers-2019-sentence-bert,
	    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
	    author = "Reimers, Nils and Gurevych, Iryna",
	    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
	    month = "11",
	    year = "2019",
	    publisher = "Association for Computational Linguistics",
	    url = "https://arxiv.org/abs/1908.10084",
	}
	```

	#### TripletLoss
	```bibtex
	@misc{hermans2017defense,
	    title={In Defense of the Triplet Loss for Person Re-Identification},
	    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
	    year={2017},
	    eprint={1703.07737},
	    archivePrefix={arXiv},
	    primaryClass={cs.CV}
	}
	```

	<!--
	## Glossary

	Clearly define terms in order to be accessible across audiences.
	-->

	<!--
	## Model Card Authors

	Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.
	-->

	<!--
	## Model Card Contact

	Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.
	-->