---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:1136292
- loss:CachedMultipleNegativesRankingLoss
base_model: answerdotai/ModernBERT-base
widget:
- source_sentence: During the 1960s Willard Cochrane was U.S. Department of Agriculture's
    head agricultural economist under U.S. Secretary of Agriculture Orville Freeman.
  sentences:
  - Cosmic Smash publisher Sega, platform Dreamcast.
  - Willard Cochrane occupation Economist.
  - Willard Cochrane educated at Harvard University, educated at Montana State University,
    date of birth 15 May 1914.
- source_sentence: Four Moons stars Antonio Velázquez, Alejandro de la Madrid, César
    Ramos, Gustavo Egelhaaf, Alonso Echánove, Alejandro Belmonte, Karina Gidi and
    Juan Manuel Bernal.
  sentences:
  - Four Moons cast member Juan Manuel Bernal, cast member Antonio Velázquez, cast
    member Alejandro de la Madrid, RTC film rating C.
  - Leukotriene C4 synthase ortholog Ltc4s, ortholog Ltc4s, instance of Gene.
  - Four Moons publication date 27 April 2015.
- source_sentence: James B. Kirby (September 28, 1884 - June 9, 1971) was an American
    inventor and self-taught electrical engineer who focused Jim Kirby's career on
    "eliminating the drudgery of housework".
  sentences:
  - Jim Kirby sex or gender male.
  - Kimberlé Williams Crenshaw notable work Intersectionality, field of work Intersectionality.
  - Jim Kirby date of death 09 June 1971, occupation Inventor, date of birth 28 September
    1884.
- source_sentence: Isabel Montero de la Cámara began work in the Foreign Office on
    June 18, 1974. and was appointed ambassador on April 9, 1996.
  sentences:
  - Back in Baby 's Arms publication date 00 1969, instance of Album.
  - Isabel Montero de la Cámara occupation Diplomat, country of citizenship Costa
    Rica, date of birth 01 January 1942.
  - Isabel Montero de la Cámara position held Ambassador.
- source_sentence: In 1842 Alvars married the harpist Melanie Lewy, a member of a
    Vienna-based family of musicians with whom Alvars frequently performed.
  sentences:
  - Elias Parish Alvars place of birth Teignmouth.
  - Olivia of Palermo date of death 10 June 0463, sex or gender female, feast day
    June 10.
  - Elias Parish Alvars spouse Melanie Lewy, place of death Vienna.
datasets:
- YesaOuO/TEKGEN-CTSP
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy
model-index:
- name: SentenceTransformer based on answerdotai/ModernBERT-base
  results:
  - task:
      type: triplet
      name: Triplet
    dataset:
      name: YesaOuO/TEKGEN CTSP
      type: YesaOuO/TEKGEN-CTSP
    metrics:
    - type: cosine_accuracy
      value: 0.916620671749115
      name: Cosine Accuracy
---
# SentenceTransformer based on answerdotai/ModernBERT-base
This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the [tekgen-ctsp](https://huggingface.co/datasets/YesaOuO/TEKGEN-CTSP) dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)
- **Maximum Sequence Length:** 8192 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
- [tekgen-ctsp](https://huggingface.co/datasets/YesaOuO/TEKGEN-CTSP)
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
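The `Pooling` module above is configured with `pooling_mode_mean_tokens: True`, so the sentence embedding is the average of the token embeddings. The following plain-Python sketch illustrates that idea on toy vectors. It is not the library's implementation, and the function name `mean_pool` is a hypothetical helper introduced here for illustration.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding sequence to avoid division by zero.
    count = max(count, 1)
    return [value / count for value in summed]


# Toy example with 2-dimensional vectors instead of 768; the last token is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

Masking matters because padded positions would otherwise pull the average toward arbitrary values.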
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("YesaOuO/ModernBERT-base-CTSP")

# Run inference
sentences = [
    'In 1842 Alvars married the harpist Melanie Lewy, a member of a Vienna-based family of musicians with whom Alvars frequently performed.',
    'Elias Parish Alvars spouse Melanie Lewy, place of death Vienna.',
    'Elias Parish Alvars place of birth Teignmouth.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
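Since the model's similarity function is cosine similarity, ranking candidate sentences against a query comes down to sorting by the cosine of their embeddings. Below is a minimal sketch of that computation in plain Python. The vectors are made-up toy values standing in for real 768-dimensional embeddings, and `cosine` and `rank` are hypothetical helper names, not part of the Sentence Transformers API.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank(query, candidates):
    """Return candidate indices sorted from most to least similar to the query."""
    scores = [cosine(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Toy 3-dimensional vectors standing in for 768-dimensional embeddings.
query = [1.0, 0.0, 0.0]
candidates = [[0.0, 1.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
order = rank(query, candidates)
print(order)  # [1, 2, 0]
```

Cosine similarity ignores vector length, which is why the second candidate scores highest even though its magnitude differs from the query's.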
## Evaluation
### Metrics
#### Triplet
* Dataset: `YesaOuO/TEKGEN-CTSP`
* Evaluated with [TripletEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| **cosine_accuracy** | **0.9166** |
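
As a rough guide to reading this number: triplet cosine accuracy is commonly defined as the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. The sketch below illustrates that definition on toy vectors, assuming a zero margin. It is not the `TripletEvaluator` code itself, and the helper names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_cosine_accuracy(triplets):
    """Fraction of (anchor, positive, negative) triplets where the positive is closer."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine(anchor, positive) > cosine(anchor, negative)
    )
    return correct / len(triplets)


# Toy 2-dimensional vectors; real embeddings have 768 dimensions.
triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),  # positive is closer: counted as correct
    ([0.0, 1.0], [1.0, 0.0], [0.1, 0.9]),  # negative is closer: counted as wrong
]
accuracy = triplet_cosine_accuracy(triplets)
print(accuracy)  # 0.5
```

Under this reading, a value of 0.9166 means the positive outranked the negative in roughly 92% of the evaluation triplets.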
## Training Details
### Training Dataset
#### tekgen-ctsp
* Dataset: [tekgen-ctsp](https://huggingface.co/datasets/YesaOuO/TEKGEN-CTSP) at [8d091eb](https://huggingface.co/datasets/YesaOuO/TEKGEN-CTSP/tree/8d091ebc57b429b55add63e77a0408fa8dc3732b)
* Size: 1,136,292 training samples
* Columns: anchor, positive, and negative
* Approximate statistics based on the first 1000 samples:
| | anchor | positive | negative |
|:--------|:-------|:---------|:---------|
| type | string | string | string |
* Samples:
| anchor | positive | negative |
|:-------|:---------|:---------|
| 1976 Swedish Grand Prix was the seventh round of the 1976 Formula One season and the ninth Swedish Grand Prix. | 1976 Swedish Grand Prix point in time 13 June 1976, part of 1976 Formula One season. | 1976 Swedish Grand Prix pole position Jody Scheckter, winner Jody Scheckter. |
| 1976 Swedish Grand Prix was the seventh round of the 1976 Formula One season and the ninth Swedish Grand Prix. | 1976 Swedish Grand Prix point in time 13 June 1976, part of 1976 Formula One season. | 1976 Swedish Grand Prix point in time 13 June 1976, country Sweden. |
| 1976 Swedish Grand Prix was the seventh round of the 1976 Formula One season and the ninth Swedish Grand Prix. | 1976 Swedish Grand Prix point in time 13 June 1976, part of 1976 Formula One season. | 1976 Swedish Grand Prix point in time 13 June 1976. |
* Loss: [CachedMultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativesrankingloss) with these parameters:
```json
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
```
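To make the two parameters concrete: a multiple-negatives ranking loss scores each anchor against every candidate in the batch with `cos_sim`, multiplies the scores by `scale`, and applies cross-entropy with the anchor's own positive as the target. The sketch below is a plain-Python illustration of that idea for (anchor, positive, negative) batches, under the assumption that all positives and negatives in the batch serve as candidates. It ignores the gradient caching that the "Cached" variant adds to save memory, and `mnr_loss` is a hypothetical name, not the library's function.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy where anchor i must pick positive i among all candidates."""
    candidates = positives + negatives
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_norm - logits[i]
    return total / len(anchors)


# Toy 2-dimensional batch of two triplets.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
negatives = [[0.5, 0.5], [0.4, 0.6]]

aligned = mnr_loss(anchors, positives, negatives)
# Swapping the positives pairs each anchor with the wrong target, so the loss rises.
swapped = mnr_loss(anchors, positives[::-1], negatives)
print(aligned < swapped)  # True
```

A larger `scale` sharpens the softmax over candidates, so small differences in cosine similarity produce larger differences in the loss.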
### Evaluation Dataset
#### tekgen-ctsp
* Dataset: [tekgen-ctsp](https://huggingface.co/datasets/YesaOuO/TEKGEN-CTSP) at [8d091eb](https://huggingface.co/datasets/YesaOuO/TEKGEN-CTSP/tree/8d091ebc57b429b55add63e77a0408fa8dc3732b)
* Size: 10,866 evaluation samples
* Columns: anchor, positive, and negative
* Approximate statistics based on the first 1000 samples:
| | anchor | positive | negative |
|:--------|:-------|:---------|:---------|
| type | string | string | string |
* Samples:
| anchor | positive | negative |
|:-------|:---------|:---------|
| Two men with prior criminal records, Dieter Degowski and Hans-Jürgen Rösner, went on the run for two days through Germany and the Netherlands. | Gladbeck hostage crisis country Netherlands, country Germany, participant Hans-Jürgen Rösner, participant Dieter Degowski. | Gladbeck hostage crisis end time 18 August 1988, point in time 18 August 1988, country Germany, start time 16 August 1988. |
| The Gladbeck hostage crisis (known in Germany as the Gladbeck hostage drama) was a hostage-taking crisis that happened in August 1988 after an armed bank raid in Gladbeck, North Rhine-Westphalia, West Germany. | Gladbeck hostage crisis end time 18 August 1988, point in time 18 August 1988, country Germany, start time 16 August 1988. | Gladbeck hostage crisis country Netherlands, country Germany, participant Hans-Jürgen Rösner, participant Dieter Degowski. |
| The album was originally released only on cassette tape before later being made available for digital download on iTunes and similar digital media stores. | Vongole Fisarmonica instance of Album. | Vongole Fisarmonica performer Those Darn Accordions, publication date 01 January 1992, instance of Album. |
* Loss: [CachedMultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativesrankingloss) with these parameters:
```json
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
```
### Training Hyperparameters
#### Non-Default Hyperparameters
- `per_device_train_batch_size`: 512
- `per_device_eval_batch_size`: 512
- `learning_rate`: 8e-05
- `num_train_epochs`: 1
- `warmup_ratio`: 0.05
- `bf16`: True
- `batch_sampler`: no_duplicates
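
For readers who want to reproduce a similar setup, the non-default values above could be passed to the Sentence Transformers training arguments roughly as follows. This is a sketch, not the author's training script: the `output_dir` path is a placeholder assumption, and everything else about the run (model loading, dataset splits, trainer wiring) is left out.

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output/modernbert-base-ctsp",  # hypothetical path, not from the model card
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    learning_rate=8e-5,
    num_train_epochs=1,
    warmup_ratio=0.05,
    bf16=True,  # needs hardware with bfloat16 support
    # Avoid repeated texts within a batch, which would act as false in-batch negatives.
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```

The `no_duplicates` sampler is relevant here because the training samples repeat the same anchor and positive across several triplets, and the loss treats other items in the batch as negatives.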
#### All Hyperparameters