---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:3312
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: nomic-ai/modernbert-embed-base
widget:
- source_sentence: S3 buckets. Before creating a bucket, make sure that you choose
    the bucket type that best ﬁts your application and performance requirements. For
    more information about the various bucket types and the appropriate use cases
    for each, see Buckets. The following sections provide more information about general
    purpose buckets, including bucket naming rules, quotas, and bucket conﬁguration
    details. For a list of restriction and limitations related to Amazon S3 buckets
    see, General purpose bucket quotas, limitations, and restrictions. Topics • General
    purpose buckets overview • Common general purpose bucket patterns • Permissions
    • Managing public access to general purpose buckets • General purpose buckets
    conﬁguration options • General purpose buckets operations General purpose buckets
    overview API Version 2006-03-01 53 Amazon
  sentences:
  - What should you test for your DB instance?
  - Where can you find a list of restrictions and limitations related to Amazon S3
    buckets?
  - What does the 'Get started' section provide?
- source_sentence: geographies use DynamoDB to build modern, serverless applications
    that can start small and scale globally. DynamoDB scales to support tables of
    virtually any size while providing consistent single-digit millisecond performance
    and high availability. For events, such as Amazon Prime Day, DynamoDB powers multiple
    high-traﬃc Amazon properties and systems, including Alexa, Amazon.com sites, and
    all Amazon fulﬁllment centers. For such events, DynamoDB APIs have handled trillions
    of calls from Amazon properties and systems. DynamoDB continuously serves hundreds
    of customers with tables that have peak traﬃc of over half a million requests
    per second. It also serves hundreds of customers whose table sizes exceed 200
    TB, and processes over one billion requests per hour. Topics • Characteristics
    of DynamoDB • DynamoDB use
  sentences:
  - What ensures that tasks are always started on secure and patched infrastructure?
  - What state is the environment in while Elastic Beanstalk creates your AWS resources?
  - What is the peak traffic that DynamoDB serves for some customers?
- source_sentence: Amazon Bedrock? 1 Amazon Bedrock User Guide • Create applications
    that reason through how to help a customer – Build agents that use foundation
    models, make API calls, and (optionally) query knowledge bases in order to reason
    through and carry out tasks for your customers. • Adapt models to speciﬁc tasks
    and domains with training data – Customize an Amazon Bedrock foundation model
    by providing training data for ﬁne-tuning or continued-pretraining in order to
    adjust a model's parameters and improve its performance on speciﬁc tasks or in
    certain domains. • Improve your FM-based application's eﬃciency and output – Purchase
    Provisioned Throughput for a foundation model in order to run inference on models
    more eﬃciently and at discounted rates. • Determine
  sentences:
  - How can you access Amazon API Gateway?
  - What allocation strategy is recommended for Spot best practice?
  - What is the purpose of adapting models to specific tasks and domains?
- source_sentence: 'you create the example application, Elastic Beanstalk creates
    the following resources: • EC2 instance – An Amazon EC2 virtual machine conﬁgured
    to run web apps on the platform you selected. Every platform runs a diﬀerent set
    of software, conﬁguration ﬁles, and scripts to support a speciﬁc language version,
    framework, web container, or combination thereof. Most platforms use either Apache
    or nginx as a reverse proxy to forward web traﬃc to your web app, serve static
    assets, and generate access and error logs. You can connect to your Amazon EC2
    instances to view conﬁguration and logs. Step 2 - Deploy your application 10 AWS
    Elastic Beanstalk Developer Guide • Instance security group – An Amazon EC2 security
    group will be created'
  sentences:
  - What allows a client to securely access private API resources inside a VPC?
  - What resources does Elastic Beanstalk create when you create the example application?
  - Where can you find more information about using ACLs?
- source_sentence: change). Saved conﬁguration A saved conﬁguration is a template
    that you can use as a starting point for creating unique environment conﬁgurations.
    You can create and modify saved conﬁgurations, and apply them to environments,
    using the Elastic Beanstalk console, EB CLI, AWS CLI, or API. The API and the
    AWS CLI refer to saved conﬁgurations as conﬁguration templates. Platform A platform
    is a combination of an operating system, programming language runtime, web server,
    application server, and Elastic Beanstalk components. You design and target your
    web application to a platform. Elastic Beanstalk provides a variety of platforms
    on which you can build your applications. For details, see Elastic Beanstalk platforms.
    Elastic Beanstalk web server environments The following diagram shows an example
  sentences:
  - What can you grant other people permission to do in your AWS account?
  - How can the fleet request be deleted?
  - What do the API and the AWS CLI refer to saved configurations as?
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: Embed AWS Docs
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 768
      type: dim_768
    metrics:
    - type: cosine_accuracy@1
      value: 0.002717391304347826
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.22554347826086957
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.5081521739130435
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.6983695652173914
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.002717391304347826
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.07518115942028984
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.10163043478260869
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.06983695652173912
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.002717391304347826
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.22554347826086957
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.5081521739130435
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.6983695652173914
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.30319890292610013
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.18024823153899258
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.1931834404953386
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 512
      type: dim_512
    metrics:
    - type: cosine_accuracy@1
      value: 0.0
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.17119565217391305
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.49728260869565216
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.6766304347826086
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.0
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.057065217391304345
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.09945652173913044
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.06766304347826087
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.0
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.17119565217391305
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.49728260869565216
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.6766304347826086
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.2883913649143213
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.16803291062801948
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.18227351655190474
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.008152173913043478
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.1875
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.4945652173913043
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.6657608695652174
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.008152173913043478
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.06249999999999999
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.09891304347826087
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.06657608695652174
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.008152173913043478
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.1875
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.4945652173913043
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.6657608695652174
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.28990281751237307
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.17309459109730865
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.18770616923880445
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.002717391304347826
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.1875
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.44021739130434784
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.5842391304347826
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.002717391304347826
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.0625
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.08804347826086956
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.058423913043478264
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.002717391304347826
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.1875
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.44021739130434784
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.5842391304347826
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.25437162359674753
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.151576518288475
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.16929779832410816
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.008152173913043478
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.15760869565217392
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.33695652173913043
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.4782608695652174
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.008152173913043478
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.05253623188405797
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.0673913043478261
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.047826086956521734
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.008152173913043478
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.15760869565217392
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.33695652173913043
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.4782608695652174
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.2095240678369969
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.12627782091097317
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.14429296766748773
      name: Cosine Map@100
---

# Embed AWS Docs

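This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) on 3,312 passage and question pairs (the `json` dataset described under Training Details). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. It was trained with MatryoshkaLoss at 768, 512, 256, 128, and 64 dimensions, and the Evaluation section reports retrieval metrics at each of those sizes.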

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) <!-- at revision d556a88e332558790b210f7bdbe87da2fa94a8d8 -->
- **Maximum Sequence Length:** 8192 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - json
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
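
The `Pooling` and `Normalize` modules above can be read as: average the token embeddings that are not padding, then scale the result to unit length. The following is a minimal pure-Python sketch of that idea on toy 3-dimensional token vectors. `mean_pool` and `l2_normalize` are illustrative helper names, not library functions, and the numbers are made up.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    return [value / max(count, 1) for value in total]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

# Toy input: three real tokens followed by one padding position.
tokens = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0], [2.0, 0.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 1, 0]

sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the output has unit length, the cosine similarity between two such embeddings equals their dot product.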

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("CadenShokat/modernbert-embed-aws")
# Run inference
sentences = [
    'change). Saved conﬁguration A saved conﬁguration is a template that you can use as a starting point for creating unique environment conﬁgurations. You can create and modify saved conﬁgurations, and apply them to environments, using the Elastic Beanstalk console, EB CLI, AWS CLI, or API. The API and the AWS CLI refer to saved conﬁgurations as conﬁguration templates. Platform A platform is a combination of an operating system, programming language runtime, web server, application server, and Elastic Beanstalk components. You design and target your web application to a platform. Elastic Beanstalk provides a variety of platforms on which you can build your applications. For details, see Elastic Beanstalk platforms. Elastic Beanstalk web server environments The following diagram shows an example',
    'What do the API and the AWS CLI refer to saved configurations as?',
    'What can you grant other people permission to do in your AWS account?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5572, 0.1425],
#         [0.5572, 1.0000, 0.1790],
#         [0.1425, 0.1790, 1.0000]])
```
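
Since the model was trained with MatryoshkaLoss at 768, 512, 256, 128, and 64 dimensions, a shorter embedding can be obtained by keeping the leading components of the full vector and rescaling to unit length. The sketch below illustrates this in pure Python on made-up 8-dimensional vectors standing in for real embeddings. `truncate_and_normalize` and `cosine` are hypothetical helpers written for this example, not part of the library.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(v * v for v in head))
    return [v / norm for v in head] if norm > 0 else head

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 8-dimensional stand-ins for real 768-dimensional embeddings.
query = [0.6, 0.5, 0.4, 0.3, 0.2, 0.2, 0.1, 0.1]
passage = [0.5, 0.6, 0.3, 0.4, 0.1, 0.3, 0.2, 0.0]

# Compare the same pair at progressively smaller sizes.
for dim in (8, 4, 2):
    q = truncate_and_normalize(query, dim)
    p = truncate_and_normalize(passage, dim)
    print(dim, round(cosine(q, p), 4))
```

With real embeddings, the same slicing would be applied to the rows returned by `model.encode`. Smaller sizes reduce storage and search cost at some loss in retrieval quality, as the Evaluation tables suggest.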

## Evaluation

### Metrics

#### Information Retrieval

* Dataset: `dim_768`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
  ```json
  {
      "truncate_dim": 768
  }
  ```

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0027     |
| cosine_accuracy@3   | 0.2255     |
| cosine_accuracy@5   | 0.5082     |
| cosine_accuracy@10  | 0.6984     |
| cosine_precision@1  | 0.0027     |
| cosine_precision@3  | 0.0752     |
| cosine_precision@5  | 0.1016     |
| cosine_precision@10 | 0.0698     |
| cosine_recall@1     | 0.0027     |
| cosine_recall@3     | 0.2255     |
| cosine_recall@5     | 0.5082     |
| cosine_recall@10    | 0.6984     |
| **cosine_ndcg@10**  | **0.3032** |
| cosine_mrr@10       | 0.1802     |
| cosine_map@100      | 0.1932     |

#### Information Retrieval

* Dataset: `dim_512`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
  ```json
  {
      "truncate_dim": 512
  }
  ```

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0        |
| cosine_accuracy@3   | 0.1712     |
| cosine_accuracy@5   | 0.4973     |
| cosine_accuracy@10  | 0.6766     |
| cosine_precision@1  | 0.0        |
| cosine_precision@3  | 0.0571     |
| cosine_precision@5  | 0.0995     |
| cosine_precision@10 | 0.0677     |
| cosine_recall@1     | 0.0        |
| cosine_recall@3     | 0.1712     |
| cosine_recall@5     | 0.4973     |
| cosine_recall@10    | 0.6766     |
| **cosine_ndcg@10**  | **0.2884** |
| cosine_mrr@10       | 0.168      |
| cosine_map@100      | 0.1823     |

#### Information Retrieval

* Dataset: `dim_256`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
  ```json
  {
      "truncate_dim": 256
  }
  ```

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0082     |
| cosine_accuracy@3   | 0.1875     |
| cosine_accuracy@5   | 0.4946     |
| cosine_accuracy@10  | 0.6658     |
| cosine_precision@1  | 0.0082     |
| cosine_precision@3  | 0.0625     |
| cosine_precision@5  | 0.0989     |
| cosine_precision@10 | 0.0666     |
| cosine_recall@1     | 0.0082     |
| cosine_recall@3     | 0.1875     |
| cosine_recall@5     | 0.4946     |
| cosine_recall@10    | 0.6658     |
| **cosine_ndcg@10**  | **0.2899** |
| cosine_mrr@10       | 0.1731     |
| cosine_map@100      | 0.1877     |

#### Information Retrieval

* Dataset: `dim_128`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
  ```json
  {
      "truncate_dim": 128
  }
  ```

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0027     |
| cosine_accuracy@3   | 0.1875     |
| cosine_accuracy@5   | 0.4402     |
| cosine_accuracy@10  | 0.5842     |
| cosine_precision@1  | 0.0027     |
| cosine_precision@3  | 0.0625     |
| cosine_precision@5  | 0.088      |
| cosine_precision@10 | 0.0584     |
| cosine_recall@1     | 0.0027     |
| cosine_recall@3     | 0.1875     |
| cosine_recall@5     | 0.4402     |
| cosine_recall@10    | 0.5842     |
| **cosine_ndcg@10**  | **0.2544** |
| cosine_mrr@10       | 0.1516     |
| cosine_map@100      | 0.1693     |

#### Information Retrieval

* Dataset: `dim_64`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
  ```json
  {
      "truncate_dim": 64
  }
  ```

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0082     |
| cosine_accuracy@3   | 0.1576     |
| cosine_accuracy@5   | 0.337      |
| cosine_accuracy@10  | 0.4783     |
| cosine_precision@1  | 0.0082     |
| cosine_precision@3  | 0.0525     |
| cosine_precision@5  | 0.0674     |
| cosine_precision@10 | 0.0478     |
| cosine_recall@1     | 0.0082     |
| cosine_recall@3     | 0.1576     |
| cosine_recall@5     | 0.337      |
| cosine_recall@10    | 0.4783     |
| **cosine_ndcg@10**  | **0.2095** |
| cosine_mrr@10       | 0.1263     |
| cosine_map@100      | 0.1443     |
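
In the tables above, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k. That pattern is what one would expect if every query has exactly one relevant passage; this is an inference from the numbers, not something the card states. Under that assumption, the per-query metrics depend only on the rank of the relevant passage, as in this sketch. `metrics_at_k` and `average` are illustrative helpers and the ranks are hypothetical.

```python
import math

def metrics_at_k(rank, k):
    """Metrics for one query that has a single relevant document.

    `rank` is the 1-based position of the relevant document in the
    ranked results, or None if it was not retrieved at all.
    """
    hit = rank is not None and rank <= k
    return {
        "accuracy": 1.0 if hit else 0.0,
        "precision": (1.0 / k) if hit else 0.0,
        "recall": 1.0 if hit else 0.0,
        "mrr": (1.0 / rank) if hit else 0.0,
        # With one relevant document the ideal DCG is 1, so NDCG equals DCG.
        "ndcg": (1.0 / math.log2(rank + 1)) if hit else 0.0,
    }

def average(per_query):
    """Mean of each metric over all queries."""
    return {name: sum(m[name] for m in per_query) / len(per_query) for name in per_query[0]}

# Hypothetical ranks of the relevant passage for five queries.
ranks = [4, 2, None, 7, 12]
scores = average([metrics_at_k(rank, 10) for rank in ranks])
print(scores)
```

As a check against the reported numbers, the `dim_768` table gives cosine_accuracy@10 = 0.6984 and cosine_precision@10 = 0.0698, which is the accuracy divided by 10.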

## Training Details

### Training Dataset

#### json

* Dataset: json
* Size: 3,312 training samples
* Columns: <code>positive</code> and <code>anchor</code>
* Approximate statistics based on the first 1000 samples:
  |         | positive                                                                            | anchor                                                                            |
  |:--------|:------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
  | type    | string                                                                              | string                                                                            |
  | details | <ul><li>min: 8 tokens</li><li>mean: 156.95 tokens</li><li>max: 265 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 13.51 tokens</li><li>max: 32 tokens</li></ul> |
* Samples:
  | positive                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | anchor                                                                                                         |
  |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------|
  | <code>such as the Kubernetes Dashboard and the section called “Horizontal Pod Autoscaler”. In this topic you learn how to install the Metrics Server. • the section called “Deploy apps with Helm” – The Helm package manager for Kubernetes helps you install and manage applications on your Kubernetes cluster. This topic helps you install and run the Helm binaries so that you can install and manage charts using the Helm CLI on your local computer. • the section called “Tagging your resources” – To help you manage your Amazon EKS resources, you can assign your own metadata to each resource in the form of tags. This topic describes tags and shows you how to create them. • the section called “Service</code>                                                                          | <code>What is the section called that helps you install the Metrics Server?</code>                             |
  | <code>out orchestrations through cyclically interpreting inputs and producing outputs by using a foundation model. An agent can be used to carry out customer requests. For more information, see Automate tasks in your application using AI agents. • Retrieval augmented generation (RAG) – The process involves: 1. Querying and retrieving information from a data source 2. Augmenting a prompt with this information to provide better context to the foundation model 3. Obtaining a better response from the foundation model using the additional context For more information, see Retrieve data and generate AI responses with Amazon Bedrock Knowledge Bases. • Model customization – The process of using training data to adjust the model parameter values in a base model in order to</code> | <code>Where can you find more information about AI agents?</code>                                              |
  | <code>An application that allows your customers to register, discover, and subscribe to your API products (API Gateway usage plans), manage their API keys, and view their usage metrics for your APIs. Edge-optimized API endpoint The default hostname of an API Gateway API that is deployed to the speciﬁed Region while using a CloudFront distribution to facilitate client access typically from across AWS Regions. API API Gateway concepts 9 Amazon API Gateway Developer Guide requests are routed to the nearest CloudFront Point of Presence (POP), which typically improves connection time for geographically diverse clients. See API endpoints. Integration request The internal interface of a WebSocket API route or REST API method in API Gateway, in which you map the body of</code>   | <code>What is the internal interface of a WebSocket API route or REST API method in API Gateway called?</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```
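
The configuration above can be read as: compute MultipleNegativesRankingLoss on the embeddings truncated to each listed dimension, then add the five losses with weight 1 each. The following is a rough pure-Python sketch of that idea, not the library's implementation. The similarity scale of 20 is an assumption made for illustration, and the toy batch uses 4-dimensional vectors with dims `[4, 2]` in place of the real sizes.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy where anchor i should score highest against positive i.

    The other positives in the batch act as negatives for anchor i.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, positive) for positive in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss on the first `dim` components."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        total += weight * in_batch_ranking_loss(
            [a[:dim] for a in anchors], [p[:dim] for p in positives]
        )
    return total

# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
anchors = [[1.0, 0.1, 0.0, 0.2], [0.0, 1.0, 0.3, 0.1]]
positives = [[0.9, 0.2, 0.1, 0.1], [0.1, 0.8, 0.4, 0.0]]
loss = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
print(round(loss, 4))
```

Because the loss is applied to every prefix length, the leading components are pushed to carry most of the ranking signal, which is what makes truncated embeddings usable.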

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 4
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `tf32`: False
- `load_best_model_at_end`: True
- `batch_sampler`: no_duplicates

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 16
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: False
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>

### Training Logs
| Epoch   | Step   | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
|:-------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
| 1.0     | 7      | -             | 0.2693                 | 0.2644                 | 0.2627                 | 0.2275                 | 0.1783                |
| 1.4615  | 10     | 5.1989        | -                      | -                      | -                      | -                      | -                     |
| 2.0     | 14     | -             | 0.2949                 | 0.2901                 | 0.2832                 | 0.2446                 | 0.1976                |
| 2.9231  | 20     | 2.6407        | -                      | -                      | -                      | -                      | -                     |
| 3.0     | 21     | -             | 0.3075                 | 0.2905                 | 0.2876                 | 0.2504                 | 0.2081                |
| **4.0** | **28** | **-**         | **0.3032**             | **0.2884**             | **0.2899**             | **0.2544**             | **0.2095**            |

* The bold row denotes the saved checkpoint.
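
The step and epoch columns in the table above are consistent with the hyperparameters listed earlier if training ran on a single device and no batches were dropped; both are assumptions, not something the card states. The arithmetic can be sketched as follows. `epoch_at_step` is an illustrative helper, not a library function.

```python
import math

train_samples = 3312          # size of the training dataset
per_device_batch_size = 32    # per_device_train_batch_size
accumulation_steps = 16       # gradient_accumulation_steps
epochs = 4                    # num_train_epochs
devices = 1                   # assumption: single-device training

batches_per_epoch = math.ceil(train_samples / (per_device_batch_size * devices))
steps_per_epoch = math.ceil(batches_per_epoch / accumulation_steps)
total_steps = steps_per_epoch * epochs
effective_batch_size = per_device_batch_size * accumulation_steps * devices

def epoch_at_step(step):
    """Fractional epoch reached after `step` optimizer steps."""
    full_epochs, extra_steps = divmod(step, steps_per_epoch)
    batches_seen = min(extra_steps * accumulation_steps, batches_per_epoch)
    return full_epochs + batches_seen / batches_per_epoch

print(batches_per_epoch, steps_per_epoch, total_steps, effective_batch_size)
# 104 7 28 512
print(round(epoch_at_step(10), 4), round(epoch_at_step(20), 4))
# 1.4615 2.9231
```

Under these assumptions each optimizer step sees up to 512 pairs, so the whole run consists of only 28 parameter updates.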

### Framework Versions
- Python: 3.10.18
- Sentence Transformers: 5.1.0
- Transformers: 4.55.2
- PyTorch: 2.8.0+cu128
- Accelerate: 1.10.0
- Datasets: 4.0.0
- Tokenizers: 0.21.4

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
