SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for retrieval.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Supported Modality: Text

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
  (1): Pooling({'embedding_dimension': 384, 'pooling_mode': 'mean', 'include_prompt': True})
  (2): Normalize({})
)
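
The last two modules turn per-token vectors into a single unit-length sentence embedding. As a rough illustration only, and not the library's actual implementation, the sketch below applies mask-aware mean pooling followed by L2 normalization to a toy input. The helper names `mean_pool` and `l2_normalize` and the 4-dimensional vectors are assumptions made for readability; the real model produces 384 dimensions.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # avoid dividing by zero for an all-padding input
    return [value / count for value in total]

def l2_normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

# Toy input: 3 token positions with 4 dimensions each (hypothetical values).
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 4.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]  # the last position is padding and is ignored

sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the output is normalized, cosine similarity between two embeddings reduces to a plain dot product.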

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ronit01/rag_tuned_minilm_mnr_10epoch")
# Run inference
sentences = [
    "How do the Stop and Delete IC Ops compare in terms of their effects on a run's state, visibility on the dashboard, resource usage, artifact preservation, and what further IC Ops can be performed on the run afterward?",
    'Stop\n----\n\nThis IC Op earmarks a run to be stopped at the end of its current chunk. \nIt will still be alive but it will not use any GPU resources from the next chunk. \nYou will still see its minibatch-level plots advancing for the current chunk. \nYou cannot stop an already stopped or deleted run. \n\n\n.. raw:: html\n\n    <img src="/ronit01/rag_tuned_minilm_mnr_10epoch/resolve/main/_static/icop-stop2.png" alt="IC Op Stop" \n         style="cursor: zoom-in; max-width: 100%;" onclick="this.requestFullscreen()">\n\n    <img src="/ronit01/rag_tuned_minilm_mnr_10epoch/resolve/main/_static/icop-stop.png" alt="IC Op Stop" \n         style="cursor: zoom-in; max-width: 100%;" onclick="this.requestFullscreen()">\n',
    'RapidFire AI offers a browser-based dashboard to automatically visualize all ML metrics and lets \nyou control runs on the fly from there. \nOur current default dashboard is a fork of the popular OSS tool `MLflow <https://mlflow.org/>`__, \nand it inherits much of MLflow\'s native features.\nThe dashboard URI is printed when the rapidfireai server is started; open it in a browser. \n\nAs of this writing, apart from MLflow, RapidFire AI also supports \n`TensorBoard  <https://www.tensorflow.org/tensorboard>`__\nand `Trackio <https://huggingface.co/docs/trackio/en/index>`__\nfor logging metrics plots. \nSpecify any one, two, or all three dashboards to use with the following server start argument. \n\n.. code-block:: bash\n\n   rapidfireai start --tracking-backends [mlflow | tensorboard | trackio]\n\nAlternatively, set the dashboard using its environment variable as below in your python code/notebook:\n\n.. code-block:: python\n\n   os.environ["RF_MLFLOW_ENABLED"] = "true"\n   os.environ["RF_TENSORBOARD_ENABLED"] = "true"\n   os.environ["RF_TRACKIO_ENABLED"] = "true"\n\nSupport for other popular dashboards such as Weights & Biases and CometML is coming soon. \nThe rest of this section explains the new features of our MLflow-fork dashboard.\nNote that these new features are not yet available on the other dashboards.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5636, 0.2072],
#         [0.5636, 1.0000, 0.2361],
#         [0.2072, 0.2361, 1.0000]])
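
For retrieval, the first row of that matrix is the part that matters: it scores the query against each passage. The sketch below ranks the two passages using the scores printed above. The function `rank_documents` and the passage labels are hypothetical names introduced for illustration; they are not part of the Sentence Transformers API.

```python
def rank_documents(query_scores, doc_ids):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    paired = zip(doc_ids, query_scores)
    return sorted(paired, key=lambda pair: pair[1], reverse=True)

# Row 0 of the similarity matrix above, without the query's self-similarity.
query_to_doc_scores = [0.5636, 0.2072]
doc_ids = ["stop_ic_op_passage", "dashboard_passage"]  # hypothetical labels

ranking = rank_documents(query_to_doc_scores, doc_ids)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.4f}")
```

The passage about the Stop IC Op ranks first, which matches the intent of the query about Stop and Delete.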

Training Details

Training Dataset

Unnamed Dataset

  • Size: 46 training samples

  • Columns: sentence_0 and sentence_1

  • Approximate statistics based on the first 46 samples:

                sentence_0        sentence_1
    type        string            string
    min         11 tokens         64 tokens
    mean        30.57 tokens      225.52 tokens
    max         48 tokens         256 tokens
  • Samples:

    sentence_0: What user-provided functions can be included in an eval config for run_evals(), and which are mandatory vs. optional?
    sentence_1: API: User-Provided Functions for Run Evals
    ===============

    Users can provide the following custom functions as part of their eval config to be used in :func:run_evals().
    Note that each leaf config can have its own set of functions for all of these.


    Preprocess Function
    -------------------

    Mandatory user-provided function to prepare the inputs to be given to the generator model.
    It is invoked for each batch during the evaluation process before generation.
    Pass it directly to the :code:preprocess_fn key in your eval config dictionary.

    The system injects into this function the batch data, as well as the RAG spec and
    the prompt manager of an individual leaf config.


    .. py:function:: preprocess_fn(batch: dict[str, list], rag: RFLangChainRagSpec, prompt_manager: RFPromptManager) -> dict[str, list]

    :param batch: Dictionary with a batch of examples with dataset field names as keys and lists as values

    sentence_0: How do I set up RapidFire AI for RAG evaluation on a machine without GPUs?
    sentence_1: Note that if you plan to use only OpenAI APIs and not self-hosted models (for embedding or generation), you do NOT need GPUs on your machine.
    But you must provide a valid OpenAI API key via a config argument as shown in the GSM8K and SciFact tutorial notebooks.

    Step 1: Install dependencies and package

    Obtain the RapidFire AI OSS package from pypi (includes all dependencies) and ensure it is installed correctly.

    .. important::

    Requires Python 3.12+. Ensure that python3 resolves to Python 3.12 before creating the venv.

    .. code-block:: bash

    python3 --version # must be 3.12.x
    python3 -m venv .venv
    source .venv/bin/activate

    pip install rapidfireai

    rapidfireai --version

    Verify it prints the following:

    RapidFire AI 0..14.0

    Due to current issue: https://github.com/huggingface/xet-core/issues/527

    pip uninstall -y hf-xet

    The tutorial notebooks for RAG evals do not use any gated models from Hugging Face. If you want to a...

    sentence_0: What are the four Interactive Control (IC) Operations supported by RapidFire AI?
    sentence_1: As of this writing, we support 4 IC Ops: Stop, Resume, Clone-Modify, and Delete. We explain each shortly below.

    All IC Ops on a run are queued by the system and executed at a chunk boundary for that run. This avoids potentially non-deterministic or other inconsistent behaviors during concurrent run execution. Note that different runs might reach their chunk boundary at different points in time. To control the number of chunks, set :code:num_chunks during :func:run_fit(); more details :doc:on the Experiment docs page </experiments>.

    IC ops can be invoked as intermittently as you want during a long-running :func:run_fit(). So, you can launch, say, 16 configs in one go (even on a 4-GPU machine), check in after a few chunks,
    and stop bottom 80% of the runs. You can let the top performers continue for longer. Then you can clone and modify some to add new finer grained runs and warm start their parameters. And so on.

    Under the hood, RapidFire AI automat...

  • Loss: MultipleNegativesRankingLoss with these parameters:

    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false,
        "directions": [
            "query_to_doc"
        ],
        "partition_mode": "joint",
        "hardness_mode": null,
        "hardness_strength": 0.0
    }
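
To make these parameters concrete, here is a minimal pure-Python sketch of the idea behind this loss in the query_to_doc direction: each query's paired passage is the positive, every other passage in the batch is a negative, and cosine similarities are multiplied by the scale (20.0) before a softmax cross-entropy. This is an illustration under those assumptions, not the library's implementation; `cos_sim` and `mnr_loss` are names chosen here, and the 2-dimensional vectors are toy values.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(queries, docs, scale=20.0):
    """Mean cross-entropy where docs[i] is the positive for queries[i]
    and every other doc in the batch acts as a negative."""
    losses = []
    for i, query in enumerate(queries):
        logits = [scale * cos_sim(query, doc) for doc in docs]
        top = max(logits)  # subtract the max for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch of two (query, passage) pairs with hypothetical embeddings.
queries = [[1.0, 0.0], [0.0, 1.0]]
matched = [[0.9, 0.1], [0.1, 0.9]]  # each passage close to its own query
swapped = [[0.1, 0.9], [0.9, 0.1]]  # each passage close to the other query

print(mnr_loss(queries, matched))
print(mnr_loss(queries, swapped))
```

The loss is near zero when each query sits closest to its own passage and grows large when the pairing is swapped, which is the signal that pushes matching pairs together during fine-tuning.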
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • do_predict: False
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: None
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • enable_jit_checkpoint: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • use_cpu: False
  • seed: 42
  • data_seed: None
  • bf16: False
  • fp16: False
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: -1
  • ddp_backend: None
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • auto_find_batch_size: False
  • full_determinism: False
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • use_cache: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
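
Taken together with the dataset size, these settings imply a very short run. The sketch below works out the number of optimizer steps and the learning rate under a linear schedule with no warmup. It assumes a single device and models the schedule as a plain linear decay to zero; the function names are hypothetical and this is arithmetic for illustration, not the trainer's code.

```python
import math

def total_optimizer_steps(num_samples, batch_size, epochs, grad_accum=1):
    """Count optimizer steps for a run on one device."""
    # dataloader_drop_last is False, so the final partial batch still counts.
    batches_per_epoch = math.ceil(num_samples / batch_size)
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    return steps_per_epoch * epochs

def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (none here) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))

total = total_optimizer_steps(num_samples=46, batch_size=16, epochs=10)
print(total)                 # 30
print(linear_lr(0, total))   # 5e-05
print(linear_lr(15, total))  # 2.5e-05
```

With 46 samples and a batch size of 16, each epoch has three batches (16, 16, and 14 samples), so 10 epochs come to 30 steps.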

Training Time

  • Training: 4.9 seconds

Framework Versions

  • Python: 3.12.13
  • Sentence Transformers: 5.4.1
  • Transformers: 5.0.0
  • PyTorch: 2.10.0+cu128
  • Accelerate: 1.13.0
  • Datasets: 4.0.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{oord2019representationlearningcontrastivepredictive,
      title={Representation Learning with Contrastive Predictive Coding},
      author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
      year={2019},
      eprint={1807.03748},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/1807.03748},
}

Model size: 22.7M params (Safetensors, tensor type F32)