Add new SentenceTransformer model

Files changed:
- README.md (+167 -155)
- model.safetensors (+1 -1)

README.md (CHANGED)
Old version (removed lines are marked “-”; removed content the diff viewer truncated is noted in brackets):

@@ -7,77 +7,87 @@ tags:
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
- - dataset_size:
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
  base_model: nomic-ai/modernbert-embed-base
  widget:
- - source_sentence:
  sentences:
- [7 removed lines; content truncated in the diff view]
- lies not only in their size but also in the diversity of tasks they can perform
- with little to no task-specific training
- - . For example, integrating an LLM into a customer support chatbot might involve
- connecting it to a company’s internal knowledge base, enabling it to answer customer
- questions using accurate, up-to-date information.
- - source_sentence: What is one method mentioned for fine-tuning the LLM?
- sentences:
- - Furthermore, advanced integrations might include fine-tuning the LLM on domain-specific
- data, or pairing it with retrieval-augmented generation (RAG) pipelines. In RAG
- systems, the model first retrieves relevant documents from a database (like a
- knowledge base), then generates a response using that context—significantly improving
- the relevance and accuracy of the answers.
  - However, deploying LLMs effectively in real-world applications often requires
  LLM integration. This means embedding these models into systems, workflows, or
  products where they can interact with other components like databases, APIs, user
  interfaces, or even custom business logic
  - . As organizations increasingly adopt these technologies, the ability to understand
  and apply LLMs will be a critical skill in the AI-powered future.
- [1 removed line; content truncated]
  sentences:
- [12 removed lines; content truncated in the diff view]
  sentences:
  - LLMs work by learning statistical relationships between words and phrases, allowing
  them to predict and generate language that feels natural. The power of these models
  lies not only in their size but also in the diversity of tasks they can perform
  with little to no task-specific training
- - LLMs work by learning statistical relationships between words and phrases, allowing
- them to predict and generate language that feels natural. The power of these models
- lies not only in their size but also in the diversity of tasks they can perform
- with little to no task-specific training
- - . As organizations increasingly adopt these technologies, the ability to understand
- and apply LLMs will be a critical skill in the AI-powered future.
- - source_sentence: What does the use of RAG systems improve according to the text?
- sentences:
- - . For instance, a spam filter doesn’t just block emails with specific keywords—it
- learns from thousands of examples what spam typically looks like.
- - Furthermore, advanced integrations might include fine-tuning the LLM on domain-specific
- data, or pairing it with retrieval-augmented generation (RAG) pipelines. In RAG
- systems, the model first retrieves relevant documents from a database (like a
- knowledge base), then generates a response using that context—significantly improving
- the relevance and accuracy of the answers.
- - Furthermore, advanced integrations might include fine-tuning the LLM on domain-specific
- data, or pairing it with retrieval-augmented generation (RAG) pipelines. In RAG
- systems, the model first retrieves relevant documents from a database (like a
- knowledge base), then generates a response using that context—significantly improving
- the relevance and accuracy of the answers.
  pipeline_tag: sentence-similarity
  library_name: sentence-transformers
  metrics:
@@ -107,49 +117,49 @@ model-index:
  type: dim_768
  metrics:
  - type: cosine_accuracy@1
- value: 0.
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
- value: 0.
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
- value:
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
  value: 1.0
  name: Cosine Accuracy@10
  - type: cosine_precision@1
- value: 0.
  name: Cosine Precision@1
  - type: cosine_precision@3
- value: 0.
  name: Cosine Precision@3
  - type: cosine_precision@5
- value: 0.
  name: Cosine Precision@5
  - type: cosine_precision@10
- value: 0.
  name: Cosine Precision@10
  - type: cosine_recall@1
- value: 0.
  name: Cosine Recall@1
  - type: cosine_recall@3
- value: 0.
  name: Cosine Recall@3
  - type: cosine_recall@5
- value:
  name: Cosine Recall@5
  - type: cosine_recall@10
  value: 1.0
  name: Cosine Recall@10
  - type: cosine_ndcg@10
- value: 0.
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
- value: 0.
  name: Cosine Mrr@10
  - type: cosine_map@100
- value: 0.
  name: Cosine Map@100
  - task:
  type: information-retrieval
@@ -159,49 +169,49 @@ model-index:
  type: dim_512
  metrics:
  - type: cosine_accuracy@1
- value: 0.
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
- value:
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
- value:
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
  value: 1.0
  name: Cosine Accuracy@10
  - type: cosine_precision@1
- value: 0.
  name: Cosine Precision@1
  - type: cosine_precision@3
- value: 0.
  name: Cosine Precision@3
  - type: cosine_precision@5
- value: 0.
  name: Cosine Precision@5
  - type: cosine_precision@10
- value: 0.
  name: Cosine Precision@10
  - type: cosine_recall@1
- value: 0.
  name: Cosine Recall@1
  - type: cosine_recall@3
- value:
  name: Cosine Recall@3
  - type: cosine_recall@5
- value:
  name: Cosine Recall@5
  - type: cosine_recall@10
  value: 1.0
  name: Cosine Recall@10
  - type: cosine_ndcg@10
- value: 0.
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
- value: 0.
  name: Cosine Mrr@10
  - type: cosine_map@100
- value: 0.
  name: Cosine Map@100
  - task:
  type: information-retrieval
@@ -214,10 +224,10 @@ model-index:
  value: 0.6666666666666666
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
- value:
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
- value:
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
  value: 1.0
@@ -226,34 +236,34 @@ model-index:
  value: 0.6666666666666666
  name: Cosine Precision@1
  - type: cosine_precision@3
- value: 0.
  name: Cosine Precision@3
  - type: cosine_precision@5
- value: 0.
  name: Cosine Precision@5
  - type: cosine_precision@10
- value: 0.
  name: Cosine Precision@10
  - type: cosine_recall@1
  value: 0.6666666666666666
  name: Cosine Recall@1
  - type: cosine_recall@3
- value:
  name: Cosine Recall@3
  - type: cosine_recall@5
- value:
  name: Cosine Recall@5
  - type: cosine_recall@10
  value: 1.0
  name: Cosine Recall@10
  - type: cosine_ndcg@10
- value: 0.
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
- value: 0.
  name: Cosine Mrr@10
  - type: cosine_map@100
- value: 0.
  name: Cosine Map@100
  - task:
  type: information-retrieval
@@ -263,49 +273,49 @@ model-index:
  type: dim_128
  metrics:
  - type: cosine_accuracy@1
- value: 0.
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
- value: 0.
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
- value:
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
- value:
  name: Cosine Accuracy@10
  - type: cosine_precision@1
- value: 0.
  name: Cosine Precision@1
  - type: cosine_precision@3
- value: 0.
  name: Cosine Precision@3
  - type: cosine_precision@5
- value: 0.
  name: Cosine Precision@5
  - type: cosine_precision@10
- value: 0.
  name: Cosine Precision@10
  - type: cosine_recall@1
- value: 0.
  name: Cosine Recall@1
  - type: cosine_recall@3
- value: 0.
  name: Cosine Recall@3
  - type: cosine_recall@5
- value:
  name: Cosine Recall@5
  - type: cosine_recall@10
- value:
  name: Cosine Recall@10
  - type: cosine_ndcg@10
- value: 0.
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
- value: 0.
  name: Cosine Mrr@10
  - type: cosine_map@100
- value: 0.
  name: Cosine Map@100
  - task:
  type: information-retrieval
@@ -315,49 +325,49 @@ model-index:
  type: dim_64
  metrics:
  - type: cosine_accuracy@1
- value: 0.
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
- value:
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
- value:
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
- value:
  name: Cosine Accuracy@10
  - type: cosine_precision@1
- value: 0.
  name: Cosine Precision@1
  - type: cosine_precision@3
- value: 0.
  name: Cosine Precision@3
  - type: cosine_precision@5
- value: 0.
  name: Cosine Precision@5
  - type: cosine_precision@10
- value: 0.
  name: Cosine Precision@10
  - type: cosine_recall@1
- value: 0.
  name: Cosine Recall@1
  - type: cosine_recall@3
- value:
  name: Cosine Recall@3
  - type: cosine_recall@5
- value:
  name: Cosine Recall@5
  - type: cosine_recall@10
- value:
  name: Cosine Recall@10
  - type: cosine_ndcg@10
- value: 0.
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
- value: 0.
  name: Cosine Mrr@10
  - type: cosine_map@100
- value: 0.
  name: Cosine Map@100
  ---
@@ -411,9 +421,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("Nuf-hugginface/modernbert-embed-quickb")
  # Run inference
  sentences = [
- 'What
- '
- '
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
@@ -458,23 +468,23 @@ You can finetune this model on your own dataset.
  * Datasets: `dim_768`, `dim_512`, `dim_256`, `dim_128` and `dim_64`
  * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

- | Metric | dim_768 | dim_512
- |:--------------------|:-----------|:----------|:-----------|:-----------|:----------
- | cosine_accuracy@1 | 0.
- | cosine_accuracy@3 | 0.
- | cosine_accuracy@5 |
- | cosine_accuracy@10 | 1.0 | 1.0
- | cosine_precision@1 | 0.
- | cosine_precision@3 | 0.
- | cosine_precision@5 | 0.
- | cosine_precision@10 | 0.1 | 0.1
- | cosine_recall@1 | 0.
- | cosine_recall@3 | 0.
- | cosine_recall@5 |
- | cosine_recall@10 | 1.0 | 1.0
- | **cosine_ndcg@10** | **0.
- | cosine_mrr@10 | 0.
- | cosine_map@100 | 0.

  <!--
  ## Bias, Risks and Limitations
@@ -494,19 +504,19 @@ You can finetune this model on your own dataset.

  #### Unnamed Dataset

- * Size:
  * Columns: <code>anchor</code> and <code>positive</code>
- * Approximate statistics based on the first
- | | anchor
- |:--------|:---------------------------------------------------------------------------------
- | type | string
- | details | <ul><li>min:
  * Samples:
- | anchor
- |:-----------------------------------------------------------------------
- | <code>What
- | <code>
- | <code>What
  * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
@@ -665,10 +675,12 @@ You can finetune this model on your own dataset.
  </details>

  ### Training Logs
- | Epoch | Step
- |:-------:|:-----:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
- | 1.0 |
- [1 removed line; content truncated in the diff view]

  * The bold row denotes the saved checkpoint.
New version (added lines are marked “+”):

@@ -7,77 +7,87 @@ tags:
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
+ - dataset_size:129
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
  base_model: nomic-ai/modernbert-embed-base
  widget:
+ - source_sentence: In what contexts can LLMs be embedded according to the text?
  sentences:
+ - Artificial Intelligence (AI) is the broad field of computer science that focuses
+ on building systems capable of performing tasks that normally require human intelligence.
+ These tasks include learning from experience, understanding language, recognizing
+ patterns, and making decisions. AI powers everything from smart assistants like
+ Siri to recommendation systems on Netflix and self-driving cars.
+ - In software development, tools like GitHub Copilot integrate LLMs to assist programmers
+ by generating code, commenting on functions, and detecting bugs.
  - However, deploying LLMs effectively in real-world applications often requires
  LLM integration. This means embedding these models into systems, workflows, or
  products where they can interact with other components like databases, APIs, user
  interfaces, or even custom business logic
+ - source_sentence: What is one educational tool mentioned that uses LLMs?
+ sentences:
  - . As organizations increasingly adopt these technologies, the ability to understand
  and apply LLMs will be a critical skill in the AI-powered future.
+ - '5. Education and Learning Platforms
+
+ Educational tools like Khanmigo (from Khan Academy) and other tutoring platforms
+ are leveraging LLMs to provide real-time help to students. LLMs can break down
+ complex topics, provide feedback on writing, and simulate Socratic-style dialogues.'
+ - '7. Enterprise Integrations
+
+ In enterprises, LLMs are being tied into internal systems like SharePoint, Slack,
+ Jira, and Confluence to act as knowledge assistants. Employees can ask natural
+ language questions like “What’s the latest update on Project Delta?” and get context-rich
+ answers based on internal documents and discussions.'
+ - source_sentence: Can the system retrieve documents even if the exact words weren't
+ used?
  sentences:
+ - '7. Enterprise Integrations
+
+ In enterprises, LLMs are being tied into internal systems like SharePoint, Slack,
+ Jira, and Confluence to act as knowledge assistants. Employees can ask natural
+ language questions like “What’s the latest update on Project Delta?” and get context-rich
+ answers based on internal documents and discussions.'
+ - Companies are also experimenting with Retrieval-Augmented Generation (RAG)—a technique
+ where LLMs are paired with document databases (e.g., vector stores like Supabase,
+ Pinecone, or Weaviate) to answer questions with enterprise-specific knowledge.
+ - For instance, in a document management system, a user might type "policies about
+ sick leave", and the system—integrated with an LLM—could retrieve documents discussing
+ "medical leave", "employee absence", and "illness policies", even if those exact
+ words weren’t used.
+ - source_sentence: What are some techniques mentioned for mitigating challenges in
+ prompt engineering?
  sentences:
+ - . These include text generation, summarization, translation, question answering,
+ code generation, and more.
+ - . These models are trained on massive text datasets and are capable of generating
+ coherent, context-aware language, answering questions, summarizing documents,
+ writing code, and more.
+ - 'Prompt Engineering: Designing effective prompts and interactions is a new and
+ still-evolving skill.
+
+
+ Mitigating these challenges often involves techniques like prompt tuning, fine-tuning,
+ hybrid search, caching, and using smaller models for certain tasks.
+
+
+ The Future of LLM Integrations
+
+ As LLMs evolve, we’ll see deeper and more seamless integration into everyday tools.
+ The future points to:'
+ - source_sentence: What are these models trained on?
+ sentences:
+ - Ultimately, the integration of LLMs across platforms, tools, and workflows is
+ transforming how we interact with information and machines—making software more
+ conversational, intelligent, and context-aware.
+ - . These models are trained on massive text datasets and are capable of generating
+ coherent, context-aware language, answering questions, summarizing documents,
+ writing code, and more.
  - LLMs work by learning statistical relationships between words and phrases, allowing
  them to predict and generate language that feels natural. The power of these models
  lies not only in their size but also in the diversity of tasks they can perform
  with little to no task-specific training
  pipeline_tag: sentence-similarity
  library_name: sentence-transformers
  metrics:
@@ -107,49 +117,49 @@ model-index:
  type: dim_768
  metrics:
  - type: cosine_accuracy@1
+ value: 0.7333333333333333
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
+ value: 0.8
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
+ value: 0.8666666666666667
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
  value: 1.0
  name: Cosine Accuracy@10
  - type: cosine_precision@1
+ value: 0.7333333333333333
  name: Cosine Precision@1
  - type: cosine_precision@3
+ value: 0.26666666666666666
  name: Cosine Precision@3
  - type: cosine_precision@5
+ value: 0.17333333333333337
  name: Cosine Precision@5
  - type: cosine_precision@10
+ value: 0.10000000000000003
  name: Cosine Precision@10
  - type: cosine_recall@1
+ value: 0.7333333333333333
  name: Cosine Recall@1
  - type: cosine_recall@3
+ value: 0.8
  name: Cosine Recall@3
  - type: cosine_recall@5
+ value: 0.8666666666666667
  name: Cosine Recall@5
  - type: cosine_recall@10
  value: 1.0
  name: Cosine Recall@10
  - type: cosine_ndcg@10
+ value: 0.8434763926535543
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
+ value: 0.7969312169312168
  name: Cosine Mrr@10
  - type: cosine_map@100
+ value: 0.7969312169312168
  name: Cosine Map@100
  - task:
  type: information-retrieval
@@ -159,49 +169,49 @@ model-index:
  type: dim_512
  metrics:
  - type: cosine_accuracy@1
+ value: 0.7333333333333333
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
+ value: 0.8
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
+ value: 0.8666666666666667
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
  value: 1.0
  name: Cosine Accuracy@10
  - type: cosine_precision@1
+ value: 0.7333333333333333
  name: Cosine Precision@1
  - type: cosine_precision@3
+ value: 0.26666666666666666
  name: Cosine Precision@3
  - type: cosine_precision@5
+ value: 0.17333333333333337
  name: Cosine Precision@5
  - type: cosine_precision@10
+ value: 0.10000000000000003
  name: Cosine Precision@10
  - type: cosine_recall@1
+ value: 0.7333333333333333
  name: Cosine Recall@1
  - type: cosine_recall@3
+ value: 0.8
  name: Cosine Recall@3
  - type: cosine_recall@5
+ value: 0.8666666666666667
  name: Cosine Recall@5
  - type: cosine_recall@10
  value: 1.0
  name: Cosine Recall@10
  - type: cosine_ndcg@10
+ value: 0.8422851622170473
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
+ value: 0.7957407407407406
  name: Cosine Mrr@10
  - type: cosine_map@100
+ value: 0.7957407407407406
  name: Cosine Map@100
  - task:
  type: information-retrieval
@@ -214,10 +224,10 @@ model-index:
  value: 0.6666666666666666
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
+ value: 0.8
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
+ value: 0.8666666666666667
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
  value: 1.0

@@ -226,34 +236,34 @@ model-index:
  value: 0.6666666666666666
  name: Cosine Precision@1
  - type: cosine_precision@3
+ value: 0.26666666666666666
  name: Cosine Precision@3
  - type: cosine_precision@5
+ value: 0.17333333333333337
  name: Cosine Precision@5
  - type: cosine_precision@10
+ value: 0.10000000000000003
  name: Cosine Precision@10
  - type: cosine_recall@1
  value: 0.6666666666666666
  name: Cosine Recall@1
  - type: cosine_recall@3
+ value: 0.8
  name: Cosine Recall@3
  - type: cosine_recall@5
+ value: 0.8666666666666667
  name: Cosine Recall@5
  - type: cosine_recall@10
  value: 1.0
  name: Cosine Recall@10
  - type: cosine_ndcg@10
+ value: 0.810143059320221
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
+ value: 0.7524867724867724
  name: Cosine Mrr@10
  - type: cosine_map@100
+ value: 0.7524867724867724
  name: Cosine Map@100
  - task:
  type: information-retrieval
@@ -263,49 +273,49 @@ model-index:
  type: dim_128
  metrics:
  - type: cosine_accuracy@1
+ value: 0.5333333333333333
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
+ value: 0.7333333333333333
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
+ value: 0.8666666666666667
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
+ value: 0.9333333333333333
  name: Cosine Accuracy@10
  - type: cosine_precision@1
+ value: 0.5333333333333333
  name: Cosine Precision@1
  - type: cosine_precision@3
+ value: 0.2444444444444445
  name: Cosine Precision@3
  - type: cosine_precision@5
+ value: 0.17333333333333337
  name: Cosine Precision@5
  - type: cosine_precision@10
+ value: 0.09333333333333335
  name: Cosine Precision@10
  - type: cosine_recall@1
+ value: 0.5333333333333333
  name: Cosine Recall@1
  - type: cosine_recall@3
+ value: 0.7333333333333333
  name: Cosine Recall@3
  - type: cosine_recall@5
+ value: 0.8666666666666667
  name: Cosine Recall@5
  - type: cosine_recall@10
+ value: 0.9333333333333333
  name: Cosine Recall@10
  - type: cosine_ndcg@10
+ value: 0.7245635799179159
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
+ value: 0.6588888888888889
  name: Cosine Mrr@10
  - type: cosine_map@100
+ value: 0.6630555555555555
  name: Cosine Map@100
  - task:
  type: information-retrieval
@@ -315,49 +325,49 @@ model-index:
  type: dim_64
  metrics:
  - type: cosine_accuracy@1
+ value: 0.4666666666666667
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
+ value: 0.6
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
+ value: 0.8
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
+ value: 0.8666666666666667
  name: Cosine Accuracy@10
  - type: cosine_precision@1
+ value: 0.4666666666666667
  name: Cosine Precision@1
  - type: cosine_precision@3
+ value: 0.2
  name: Cosine Precision@3
  - type: cosine_precision@5
+ value: 0.16000000000000003
  name: Cosine Precision@5
  - type: cosine_precision@10
+ value: 0.08666666666666668
  name: Cosine Precision@10
  - type: cosine_recall@1
+ value: 0.4666666666666667
  name: Cosine Recall@1
  - type: cosine_recall@3
+ value: 0.6
  name: Cosine Recall@3
  - type: cosine_recall@5
+ value: 0.8
  name: Cosine Recall@5
  - type: cosine_recall@10
+ value: 0.8666666666666667
  name: Cosine Recall@10
  - type: cosine_ndcg@10
+ value: 0.6490228576040539
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
+ value: 0.58
  name: Cosine Mrr@10
  - type: cosine_map@100
+ value: 0.5892352092352092
  name: Cosine Map@100
  ---
@@ -411,9 +421,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("Nuf-hugginface/modernbert-embed-quickb")
  # Run inference
  sentences = [
+ 'What are these models trained on?',
+ '. These models are trained on massive text datasets and are capable of generating coherent, context-aware language, answering questions, summarizing documents, writing code, and more.',
+ 'Ultimately, the integration of LLMs across platforms, tools, and workflows is transforming how we interact with information and machines—making software more conversational, intelligent, and context-aware.',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
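As a quick follow-up to the inference snippet above, a minimal sketch of scoring the three sentences against each other (this assumes sentence-transformers >= 3.0, where `SentenceTransformer.similarity` is available):

```python
# Minimal sketch (assumes sentence-transformers >= 3.0): compute pairwise
# cosine similarities between the embeddings from the snippet above.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Nuf-hugginface/modernbert-embed-quickb")
sentences = [
    "What are these models trained on?",
    ". These models are trained on massive text datasets and are capable of generating coherent, context-aware language, answering questions, summarizing documents, writing code, and more.",
    "Ultimately, the integration of LLMs across platforms, tools, and workflows is transforming how we interact with information and machines—making software more conversational, intelligent, and context-aware.",
]
embeddings = model.encode(sentences)                      # shape: (3, 768)
similarities = model.similarity(embeddings, embeddings)   # (3, 3) similarity matrix
print(similarities)  # the query should score highest against the second sentence
```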
@@ -458,23 +468,23 @@ You can finetune this model on your own dataset.
  * Datasets: `dim_768`, `dim_512`, `dim_256`, `dim_128` and `dim_64`
  * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

+ | Metric              | dim_768    | dim_512    | dim_256    | dim_128    | dim_64    |
+ |:--------------------|:-----------|:-----------|:-----------|:-----------|:----------|
+ | cosine_accuracy@1   | 0.7333     | 0.7333     | 0.6667     | 0.5333     | 0.4667    |
+ | cosine_accuracy@3   | 0.8        | 0.8        | 0.8        | 0.7333     | 0.6       |
+ | cosine_accuracy@5   | 0.8667     | 0.8667     | 0.8667     | 0.8667     | 0.8       |
+ | cosine_accuracy@10  | 1.0        | 1.0        | 1.0        | 0.9333     | 0.8667    |
+ | cosine_precision@1  | 0.7333     | 0.7333     | 0.6667     | 0.5333     | 0.4667    |
+ | cosine_precision@3  | 0.2667     | 0.2667     | 0.2667     | 0.2444     | 0.2       |
+ | cosine_precision@5  | 0.1733     | 0.1733     | 0.1733     | 0.1733     | 0.16      |
+ | cosine_precision@10 | 0.1        | 0.1        | 0.1        | 0.0933     | 0.0867    |
+ | cosine_recall@1     | 0.7333     | 0.7333     | 0.6667     | 0.5333     | 0.4667    |
+ | cosine_recall@3     | 0.8        | 0.8        | 0.8        | 0.7333     | 0.6       |
+ | cosine_recall@5     | 0.8667     | 0.8667     | 0.8667     | 0.8667     | 0.8       |
+ | cosine_recall@10    | 1.0        | 1.0        | 1.0        | 0.9333     | 0.8667    |
+ | **cosine_ndcg@10**  | **0.8435** | **0.8423** | **0.8101** | **0.7246** | **0.649** |
+ | cosine_mrr@10       | 0.7969     | 0.7957     | 0.7525     | 0.6589     | 0.58      |
+ | cosine_map@100      | 0.7969     | 0.7957     | 0.7525     | 0.6631     | 0.5892    |
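The columns above correspond to Matryoshka truncation points of the same embedding. A minimal sketch of selecting one of the smaller dimensions at load time (this assumes the `truncate_dim` argument added in recent sentence-transformers releases; 256 here mirrors the dim_256 column):

```python
# Sketch (assumes sentence-transformers >= 2.7, which added truncate_dim):
# load the model so it emits 256-dimensional embeddings instead of 768,
# trading a little retrieval quality for smaller, faster vectors.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Nuf-hugginface/modernbert-embed-quickb", truncate_dim=256)
embeddings = model.encode(["policies about sick leave"])
print(embeddings.shape)  # (1, 256) instead of (1, 768)
```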

  <!--
  ## Bias, Risks and Limitations
@@ -494,19 +504,19 @@ You can finetune this model on your own dataset.

  #### Unnamed Dataset

+ * Size: 129 training samples
  * Columns: <code>anchor</code> and <code>positive</code>
+ * Approximate statistics based on the first 129 samples:
+ |         | anchor                                                                            | positive                                                                            |
+ |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
+ | type    | string                                                                            | string                                                                              |
+ | details | <ul><li>min: 8 tokens</li><li>mean: 13.8 tokens</li><li>max: 25 tokens</li></ul> | <ul><li>min: 13 tokens</li><li>mean: 53.68 tokens</li><li>max: 86 tokens</li></ul> |
  * Samples:
+ | anchor | positive |
+ |:-------|:---------|
+ | <code>What is the primary ability discussed in the text?</code> | <code>. This generalization ability makes them incredibly useful across industries—from customer service and education to software development and healthcare.</code> |
+ | <code>How many tasks are listed in the text?</code> | <code>. These include text generation, summarization, translation, question answering, code generation, and more.</code> |
+ | <code>What are examples of chatbot tools mentioned in the text?</code> | <code>1. Chatbots and Virtual Assistants<br>One of the most visible LLM integrations is in chatbots. Tools like ChatGPT, Claude, and Bard are themselves chatbot interfaces built on LLMs. Many businesses are now integrating these models into their websites and customer support systems.</code> |
  * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
@@ -665,10 +675,12 @@ You can finetune this model on your own dataset.
  </details>

  ### Training Logs
+ | Epoch   | Step   | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
+ |:-------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
+ | 1.0     | 5      | -             | 0.8450                 | 0.8003                 | 0.8117                 | 0.7009                 | 0.6370                |
+ | 2.0     | 10     | 12.0802       | 0.8427                 | 0.8222                 | 0.8055                 | 0.6979                 | 0.6608                |
+ | **3.0** | **15** | **-**         | **0.8435**             | **0.8423**             | **0.8101**             | **0.7246**             | **0.649**             |
+ | 3.2424  | 16     | -             | 0.8435                 | 0.8423                 | 0.8101                 | 0.7246                 | 0.6490                |

  * The bold row denotes the saved checkpoint.
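For readers reproducing a checkpoint like this one, a hedged sketch of the training setup the card implies (MatryoshkaLoss wrapped around MultipleNegativesRankingLoss over anchor/positive pairs); the one-pair dataset below is a toy stand-in for the real 129 training samples:

```python
# Illustrative sketch, not the exact training script: fine-tune the base
# model with MatryoshkaLoss over MultipleNegativesRankingLoss, matching
# the losses and evaluation dimensions listed in this model card.
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("nomic-ai/modernbert-embed-base")

# Toy stand-in for the real 129-sample anchor/positive dataset.
train_dataset = Dataset.from_dict({
    "anchor": ["What are these models trained on?"],
    "positive": ["These models are trained on massive text datasets."],
})

inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(model, inner_loss, matryoshka_dims=[768, 512, 256, 128, 64])

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```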
model.safetensors (CHANGED)

@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:
+ oid sha256:940b2d4ff23ed21583158f999073c662a4aef925a21b9ce34e0d4737565b9db3
  size 596070136