How to use ronit01/rag_tuned_minilm_10 with sentence-transformers:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("ronit01/rag_tuned_minilm_10")

sentences = [
    "How does RFDPOConfig differ from RFGRPOConfig in the trainer configuration?",
    "    :param document_template: Optional function to format each retrieved chunk for context injection into prompts. Should accept a single LangChain :class:`Document` object and return a formatted string. Multiple documents are separated by double newlines when serialized. If not provided, the following default template is used:\n \n        .. code-block:: python\n \n            def default_template(doc: Document) -> str:\n                \"\"\"Default document formatting template.\"\"\"\n                metadata = \"; \".join([f\"{k}: {v}\" for k, v in doc.metadata.items()])\n                return f\"{metadata}:\\n{doc.page_content}\"\n \n        You can provide a custom template to control what metadata fields are included and how the content is formatted. For example, to include only a specific metadata field:\n \n        .. code-block:: python\n \n            def sample_template(doc: Document) -> str:\n                doc_source = doc.metadata.get(\"source\", \"\")\n                return f\"Document Source: {doc_source}:\\nContent: {doc.page_content}\"\n \n        Or for a dataset like SciFact where documents have a :code:`\"title\"` metadata field ingested via :code:`metadata_func` in the document loader:\n \n        .. code-block:: python\n \n            def custom_template(doc: Document) -> str:\n                return f\"{doc.metadata['title']}: {doc.page_content}\"\n \n    :type document_template: Callable[[Document], str], optional",
    "Recovering Storage Space\n-------\n\nIf you run out of storage space on your machine due to experimenting with lots of LLMs, we \nrecommend clearing out the \".cache\" folder on your home directory that is created by \nHugging Face to import the base models. \nOne experiment's imported models are not needed for another; so, it is safe to delete them.\n\nIf you want to reclaim even more space, look at the artifacts from your experiments and \neither delete some of the files or move them to other/remote storage. \nNote that when you use LoRA adapters, RapidFire AI saves only the trained adapters in the \ncheckpoints of the runs, not the base models.",
    ".. py:function:: get_runs_info(self) -> pd.DataFrame:\n\n\t:return: A DataFrame with the following columns: run_id, status, mlflow_run_id, completed_steps, total_steps, start_chunk_id, num_chunks_visited_curr_epoch, num_epochs_completed, error, source, ended_by, warm_started_from, config (full config dictionary)\n\n\t:rtype: pandas.DataFrame"
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([4, 4])

SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for retrieval.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: sentence-transformers/all-MiniLM-L6-v2
Maximum Sequence Length: 256 tokens
Output Dimensionality: 384 dimensions
Similarity Function: Cosine Similarity
Supported Modality: Text

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
  (1): Pooling({'embedding_dimension': 384, 'pooling_mode': 'mean', 'include_prompt': True})
  (2): Normalize({})
)
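
To make the Pooling and Normalize stages above concrete, here is a minimal pure-Python sketch of mean pooling followed by L2 normalization. It is an illustration only, not the library's implementation: the function names `mean_pool` and `l2_normalize` and the 2-dimensional toy vectors are assumptions made for the example (the real model produces 384-dimensional vectors).

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions whose mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [s / count for s in summed]

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

# Toy input: two real tokens and one padding token (hypothetical values).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)          # padding token is ignored
sentence_embedding = l2_normalize(pooled)  # unit-length vector
```

Because the final stage normalizes every embedding to unit length, cosine similarity between two embeddings reduces to their dot product.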

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ronit01/rag_tuned_minilm_10")
# Run inference
sentences = [
    'What are the two knob set generators currently supported by RapidFire AI for creating multi-config specifications?',
    'We currently support two common knob set generators: :func:`List()` for a discrete \nset of values and :func:`Range()` for sampling from a continuous value interval.\n\n\n.. py:function:: List(values: List[Any])\n\n\t:param values: List of discrete values for a knob; all values must be the same python data type.\n\t:type values: List[Any]\n\n\n.. py:function:: Range(start: int | float, end: int | float, dtype: str = "int" | "float")\n\n\t:param start: Lower bound of range interval.\n\t:type start: int | float\n\n\t:param end: Upper bound of range interval.\n\t:type end: int | float\n\n\t:param dtype: Data type of value to be sampled, either :code:`"int"` or :code:`"float"`.\n\t:type dtype: str\n\n\n**Notes:**\n\nAs of this writing, :func:`Range()` performs uniform sampling within the given interval. \nWe plan to continue expanding this API and add more functionality on this front based on feedback.',
    'Note that if you plan to use only OpenAI APIs and not self-hosted models (for embedding or generation), you do NOT need GPUs on your machine. \nBut you must provide a valid OpenAI API key via a config argument as shown in the GSM8K and SciFact tutorial notebooks.\n\n\nStep 1: Install dependencies and package\n-----------------------\n\nObtain the RapidFire AI OSS package from pypi (includes all dependencies) and ensure it is installed correctly.\n\n.. important::\n\n  Requires Python 3.12+. Ensure that ``python3`` resolves to Python 3.12 before creating the venv.\n\n.. code-block:: bash\n\n   python3 --version  # must be 3.12.x\n   python3 -m venv .venv\n   source .venv/bin/activate\n\n   pip install rapidfireai\n\n   rapidfireai --version\n   # Verify it prints the following:\n   # RapidFire AI 0..14.0\n\n   # Due to current issue: https://github.com/huggingface/xet-core/issues/527\n   pip uninstall -y hf-xet\n\n\nThe tutorial notebooks for RAG evals do not use any gated models from Hugging Face.\nIf you want to access gated models, provide your Hugging Face account token.\nFor more details on that, :doc:`see Step 1 here</walkthroughft>`.\n\n\nStep 2: Initialize and start RapidFire AI server\n------------\n\nRun the following commands to initialize rapidfireai to use the correct dependencies for RAG evals:\n\n.. code-block:: bash\n\n   rapidfireai init --evals\n   # It will install specific dependencies and initialize rapidfireai for RAG evals\n\n\n.. note::\n  You need to run init **only once** for a new venv or when switching GPU(s) on your machine. You do NOT need to run it after a reboot or for a new terminal tab.\n\n\nNext start RapidFire AI services: the frontend with the ML metrics dashboard and the API server. \nThe frontend URL shown below can be opened on your local browser.\n\n.. code-block:: bash\n\n   rapidfireai start\n   # It should print about 50 lines, including the following:\n   # ...\n   # RapidFire Frontend is ready\n   # Open your browser and navigate to: http://0.0.0.0:8853\n   # ...\n   # Press Ctrl+C to stop all services\n\n.. important::\n\n  Do NOT proceed until the start is successful with "Available endpoints" printed as above. Leave this terminal running while you work through the tutorial notebooks. \n\n\nIf you close the terminal in which you started rapidfireai or if you rebooted your machine, \njust start rapidfireai again with the above command.\n\nIf the start command fails for whatever reason, wait for half a minute and rerun it.\nFor diagnostics and common fixes (including Linux/macOS and Windows steps), see :doc:`Troubleshooting </troubleshooting>`.\n\n.. note::\n  For RAG/context engineering experiments with :func:`run_evals()`, starting the server is **optional** and only needed if you want to see results on the ML metrics dashboard too. Just as results are shown in an in-notebook table too, IC Ops panel can be displayed in the notebook too, as illustrated below (Steps 5 and 6).',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.8854, 0.4689],
#         [0.8854, 1.0000, 0.3173],
#         [0.4689, 0.3173, 1.0000]])
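
Since the model is intended for retrieval, the first row of the matrix above (the question against each passage) is what a retriever would rank. The sketch below shows that ranking step in plain Python, using the two scores printed above. The helper `rank_passages` and the short passage labels are hypothetical names introduced for illustration.

```python
def rank_passages(query_scores, passages, top_k=2):
    """Return the top_k (passage, score) pairs, highest similarity first."""
    ranked = sorted(zip(passages, query_scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]

# Row 0 of the similarity matrix above, without the query's self-similarity.
scores = [0.8854, 0.4689]
# Hypothetical short labels standing in for the two long passages.
passages = ["knob set generators passage", "install and start passage"]

best = rank_passages(scores, passages, top_k=1)
print(best)
```

Here the passage that actually answers the question about knob set generators scores highest, which is the behavior the fine-tuning aims for.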

Training Details

Training Dataset

Unnamed Dataset

Size: 208 training samples
Columns: sentence_0, sentence_1, and label
Approximate statistics based on the first 208 samples:

|         | sentence_0 | sentence_1 | label |
|:--------|:-----------|:-----------|:------|
| type    | string | string | float |
| details | min: 11 tokens, mean: 24.87 tokens, max: 34 tokens | min: 31 tokens, mean: 222.9 tokens, max: 256 tokens | min: 0.0, mean: 0.25, max: 1.0 |

Samples:

| sentence_0 | sentence_1 | label |
|:-----------|:-----------|:------|
| What arguments does the RFModelConfig class accept for defining a model configuration in RapidFire AI? | Recovering Storage Space ------- If you run out of storage space on your machine due to experimenting with lots of LLMs, we recommend clearing out the ".cache" folder on your home directory that is created by Hugging Face to import the base models. One experiment's imported models are not needed for another; so, it is safe to delete them. If you want to reclaim even more space, look at the artifacts from your experiments and either delete some of the files or move them to other/remote storage. Note that when you use LoRA adapters, RapidFire AI saves only the trained adapters in the checkpoints of the runs, not the base models. | 0.0 |
| How do reward functions work in RapidFire AI for GRPO training, and what arguments does TRL inject into them? | Multi-GPU Model Partitioning with FSDP RapidFire AI supports automated large model partitioning across GPUs (on the same machine) via PyTorch's native FSDP. Provide the relevant FSDP deatils in a config knob, optionally along with the number of GPUs to use for that run. The following notebooks showcase the use of FSDP for SFT with the corresponding LLMs: - FSDP Large with base model Llama-3-70B-Instruct. Needs at least 8x A10 GPUs or equivalent (192 GB total HBM) to work.... | 0.0 |
| What are the three use case tutorials provided for RAG and context engineering, and what type of workflow does each demonstrate? | This use case notebook features an all-closed model API workflow, with Open AI calls used for both embedding for generation. So, you do not need a GPU to run this notebook. | 1.0 |

Loss: ContrastiveLoss with these parameters:

{
    "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
    "margin": 0.5,
    "size_average": true
}
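
As a rough guide to what these parameters mean, the sketch below computes a contrastive loss over pairs in plain Python, using cosine distance (1 minus cosine similarity), the margin of 0.5, and mean reduction listed above. It follows the formulation in Hadsell et al. (2006) as it is commonly implemented; treat it as an approximation of the library's behavior, not a copy of it. The function name and the example similarity values are assumptions for illustration.

```python
def contrastive_loss(cos_sims, labels, margin=0.5, size_average=True):
    """Contrastive loss over (cosine similarity, label) pairs.

    label 1.0 marks a matching pair, label 0.0 a non-matching pair.
    """
    losses = []
    for cos_sim, label in zip(cos_sims, labels):
        distance = 1.0 - cos_sim  # cosine distance
        # Matching pairs are pulled together: penalize any distance.
        positive_term = label * distance ** 2
        # Non-matching pairs are pushed apart, but only while closer than the margin.
        negative_term = (1.0 - label) * max(0.0, margin - distance) ** 2
        losses.append(0.5 * (positive_term + negative_term))
    total = sum(losses)
    return total / len(losses) if size_average else total

# Hypothetical cosine similarities for three (question, passage) pairs.
cos_sims = [0.9, 0.8, 0.2]
labels = [1.0, 0.0, 0.0]

loss = contrastive_loss(cos_sims, labels)
print(round(loss, 6))
```

In this example the third pair contributes nothing: it is a non-matching pair whose distance (0.8) already exceeds the margin, so only the first two pairs produce a gradient signal.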

Training Hyperparameters

Non-Default Hyperparameters

per_device_train_batch_size: 16
per_device_eval_batch_size: 16
num_train_epochs: 10
multi_dataset_batch_sampler: round_robin

All Hyperparameters

do_predict: False
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 10
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: None
warmup_ratio: None
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
enable_jit_checkpoint: False
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
use_cpu: False
seed: 42
data_seed: None
bf16: False
fp16: False
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: -1
ddp_backend: None
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
group_by_length: False
length_column_name: length
project: huggingface
trackio_space_id: trackio
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_for_metrics: []
eval_do_concat_batches: True
auto_find_batch_size: False
full_determinism: False
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_num_input_tokens_seen: no
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: True
use_cache: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin
router_mapping: {}
learning_rate_mapping: {}
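
The hyperparameters above imply a short training run. The sketch below works out the step count and the linear learning-rate decay from the listed values (208 samples, batch size 16, 10 epochs, learning rate 5e-05, no warmup). It assumes a single device, which the card does not state explicitly, and `linear_lr` is an illustrative helper, not a library function.

```python
import math

num_samples = 208
batch_size = 16        # per_device_train_batch_size, assuming one device
num_epochs = 10
base_lr = 5e-05

# dataloader_drop_last is False, so a partial final batch would still count.
steps_per_epoch = math.ceil(num_samples / batch_size)  # 13
total_steps = steps_per_epoch * num_epochs             # 130

def linear_lr(step, base_lr=base_lr, total_steps=total_steps, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

for step in (0, 65, 130):
    print(step, linear_lr(step))
```

With only 130 optimizer steps in total, the learning rate falls to half its starting value by the end of epoch 5 and reaches zero at the end of epoch 10.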

Training Time

Training: 24.5 seconds

Framework Versions

Python: 3.12.13
Sentence Transformers: 5.4.1
Transformers: 5.0.0
PyTorch: 2.10.0+cu128
Accelerate: 1.13.0
Datasets: 4.0.0
Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

ContrastiveLoss

@inproceedings{hadsell2006dimensionality,
    author={Hadsell, R. and Chopra, S. and LeCun, Y.},
    booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
    title={Dimensionality Reduction by Learning an Invariant Mapping},
    year={2006},
    volume={2},
    number={},
    pages={1735-1742},
    doi={10.1109/CVPR.2006.100}
}