
SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for retrieval.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: sentence-transformers/all-MiniLM-L6-v2
Maximum Sequence Length: 256 tokens
Output Dimensionality: 384 dimensions
Similarity Function: Cosine Similarity
Supported Modality: Text

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
  (1): Pooling({'embedding_dimension': 384, 'pooling_mode': 'mean', 'include_prompt': True})
  (2): Normalize({})
)
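The three modules above (token embeddings, mean pooling, normalization) can be illustrated with a small NumPy sketch. This is not the library's code: the function name `mean_pool_and_normalize` and the random stand-in for the transformer output are assumptions made for illustration, and only the 384-dimensional width is taken from the card.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Mask-aware mean pooling over tokens, then L2 normalization (illustrative)."""
    # Expand the mask to (batch, seq_len, 1) so padded positions contribute nothing.
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # guard against empty sequences
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)

# Random stand-in for the transformer's last_hidden_state: 2 sentences, 5 tokens, 384 dims.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 5, 384))
attention_mask = np.array([[1, 1, 1, 0, 0],   # last two positions are padding
                           [1, 1, 1, 1, 1]])

sentence_embeddings = mean_pool_and_normalize(token_embeddings, attention_mask)
print(sentence_embeddings.shape)
```

Because the output vectors have unit length, the dot product of two sentence embeddings equals their cosine similarity, which is the similarity function listed for this model.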

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ronit01/final_golden_rag_tuned_minilm_100")
# Run inference
sentences = [
    "How does RapidFire AI's shard-based adaptive execution engine enable online aggregation of eval metrics with confidence intervals, and what specific mathematical strategies are available for computing those intervals?",
    '**online_strategy_kwargs** : dict[str, Any], optional\n\tParameters for evals online aggregation strategy. The dictionary must include the following keys:\n\t\n\t* :code:`"strategy_name"` (str) - Must be :code:`"normal"`, :code:`"wilson"`, or :code:`"hoeffding"`.\n\t* :code:`"confidence_level"` (float) - Confidence level for confidence intervals on metrics. Must be in [0,1]. Default is 0.95 (95%).\n\t* :code:`"use_fpc"` (bool) - Whether to apply finite population correction. Default is :code:`True`.',
    'Confidence Intervals\n--------------------\n\nThe data points in the evals dataset are **assigned to shards uniformly randomly**, i.e., \nRapidFire AI performs sampling without replacement. \nBased on that, it supports 3 strategies to calculate confidence intervals for projected estimates of metrics. \nYou can indicate the confidence level (we recommend 95%) and whether to perform "finite population correction" (FPC) or not. \nThese values can be specified under the key :code:`"online_strategy_kwargs"` in your config dictionary as illustrated below.\n\n.. code-block:: python\n\n    # Based on FiQA RAG tutorial use case\n    "online_strategy_kwargs": {\n        "strategy_name": "normal",\n        "confidence_level": 0.95,\n        "use_fpc": True,\n    },',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9646, 0.3418],
#         [0.9646, 1.0000, 0.4140],
#         [0.3418, 0.4140, 1.0000]])
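For retrieval, the first row of the similarity matrix can be read as the query's score against each passage. The sketch below ranks the two passages using the values printed above; it uses NumPy only, so it runs without downloading the model, and the variable names are illustrative.

```python
import numpy as np

# Similarity matrix copied from the printed output above:
# row/column 0 is the query, rows/columns 1 and 2 are the two passages.
similarities = np.array([
    [1.0000, 0.9646, 0.3418],
    [0.9646, 1.0000, 0.4140],
    [0.3418, 0.4140, 1.0000],
])

query_scores = similarities[0, 1:]      # query vs. each passage
ranking = np.argsort(-query_scores)     # passage positions, best match first

for rank, passage_idx in enumerate(ranking, start=1):
    print(f"rank {rank}: passage {passage_idx} (score {query_scores[passage_idx]:.4f})")
```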

Training Details

Training Dataset

Unnamed Dataset

Size: 208 training samples
Columns: sentence_0, sentence_1, and label
Approximate statistics based on the first 208 samples:
| | sentence_0 | sentence_1 | label |
|:--------|:-----------|:-----------|:------|
| type | string | string | float |
| details | min: 25 tokens, mean: 38.85 tokens, max: 70 tokens | min: 58 tokens, mean: 223.85 tokens, max: 256 tokens | min: 0.0, mean: 0.25, max: 1.0 |
Samples:

| sentence_0 | sentence_1 | label |
|:-----------|:-----------|:------|
| What are all the Experiment class methods (experiment ops) provided by RapidFire AI, and what does each one do? | API: User-Provided Functions for Run Evals =============== Users can provide the following custom functions as part of their eval config to be used in :func:run_evals(). Note that each leaf config can have its own set of functions for all of these. Preprocess Function ------------------- Mandatory user-provided function to prepare the inputs to be given to the generator model. It is invoked for each batch during the evaluation process before generation. Pass it directly to the :code:preprocess_fn key in your eval config dictionary. The system injects into this function the batch data, as well as the RAG spec and the prompt manager of an individual leaf config. .. py:function:: preprocess_fn(batch: dict[str, list], rag: RFLangChainRagSpec, prompt_manager: RFPromptManager) -> dict[str, list] :param batch: Dictionary with a batch of examples with dataset field names as keys and lists as values | 0.0 |
| What are all the Experiment class methods (experiment ops) provided by RapidFire AI, and what does each one do? | Preprocess Function Mandatory user-provided function to prepare the inputs to be given to the generator model. It is invoked for each batch during the evaluation process before generation. Pass it directly to the :code:preprocess_fn key in your eval config dictionary. The system injects into this function the batch data, as well as the RAG spec and the prompt manager of an individual leaf config. .. py:function:: preprocess_fn(batch: dict[str, list], rag: RFLangChainRagSpec, prompt_manager: RFPromptManager) -> dict[str, list] :param batch: Dictionary with a batch of examples with dataset field names as keys and lists as values :type batch: dict[str, list] :param rag: RAG specification object for document chunk retrieval and context serialization :type rag: RFLangChainRagSpec :param prompt_manager: Prompt manager object for handling instructions and few-shot examples :type prompt_manager: RFPromptManager :return: Dictionary with the p... | 0.0 |
| How does the Wilson Score confidence interval strategy work for online aggregation in RapidFire AI evals, and when should it be preferred over the Normal Approximation? | Run Evals The main function to launch LLM evaluation (evals), including with optional RAG, for a given config group in one go. See :doc:the Multi-Config Specification page</configs> for more details on how to construct a config group. .. py:function:: run_evals(self, config_group: Any, dataset: Dataset, num_shards: int=4, num_actors: int, seed: int=42) -> dict[int, tuple[dict, dict]]: :param config_group: Single evals config knob dictionary, a generated config group, or a :code:`list` of configs or config groups :type config_group: Evals config-group or list as described in :doc:`the Multi-Config Specification page</configs>` :param dataset: Evaluation dataset to measure eval metrics :type dataset: Dataset :param num_shards: Number of logical splits of data to control degree of concurrency for multi-config execution (recommended: at least 4) :type num_shards: int :param num_actors: Number of parallel worker processes per machine to control degree of concurrency... | 0.0 |



Loss: ContrastiveLoss with these parameters:
{
    "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
    "margin": 0.5,
    "size_average": true
}
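To make these parameters concrete, here is a minimal NumPy sketch of a contrastive loss with cosine distance, following the formulation in Hadsell et al. (cited below). It is an illustration under the assumption that the per-pair loss is `0.5 * (label * d**2 + (1 - label) * max(margin - d, 0)**2)` with `d = 1 - cosine_similarity`; the function name and the toy vectors are hypothetical, and this is not the training code used for this model.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, labels, margin=0.5, size_average=True):
    """Contrastive loss over embedding pairs using cosine distance (illustrative)."""
    cos_sim = (emb_a * emb_b).sum(axis=1) / (
        np.linalg.norm(emb_a, axis=1) * np.linalg.norm(emb_b, axis=1)
    )
    distance = 1.0 - cos_sim
    # Positive pairs (label 1) are pulled together; negative pairs (label 0)
    # are pushed apart until their distance exceeds the margin.
    positive_term = labels * distance**2
    negative_term = (1.0 - labels) * np.maximum(margin - distance, 0.0) ** 2
    losses = 0.5 * (positive_term + negative_term)
    return losses.mean() if size_average else losses.sum()

# Toy pairs: the first pair is identical, the second pair is orthogonal.
emb_a = np.array([[1.0, 0.0], [1.0, 0.0]])
emb_b = np.array([[1.0, 0.0], [0.0, 1.0]])

well_separated = contrastive_loss(emb_a, emb_b, labels=np.array([1.0, 0.0]))
mislabeled = contrastive_loss(emb_a, emb_b, labels=np.array([0.0, 1.0]))
print(well_separated, mislabeled)
```

With `margin` set to 0.5, a negative pair stops contributing to the loss once its cosine distance reaches 0.5, that is, once its cosine similarity drops to 0.5 or below.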



	
		
	
	
Training Hyperparameters

Non-Default Hyperparameters
	


per_device_train_batch_size: 16
per_device_eval_batch_size: 16
num_train_epochs: 100
multi_dataset_batch_sampler: round_robin


	
		
	
	
All Hyperparameters


do_predict: False
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 100
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: None
warmup_ratio: None
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
enable_jit_checkpoint: False
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
use_cpu: False
seed: 42
data_seed: None
bf16: False
fp16: False
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: -1
ddp_backend: None
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
group_by_length: False
length_column_name: length
project: huggingface
trackio_space_id: trackio
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_for_metrics: []
eval_do_concat_batches: True
auto_find_batch_size: False
full_determinism: False
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_num_input_tokens_seen: no
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: True
use_cache: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin
router_mapping: {}
learning_rate_mapping: {}
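The combination of `lr_scheduler_type: linear`, `warmup_steps: 0`, and `learning_rate: 5e-05` implies a learning rate that decays linearly from its initial value toward zero. The sketch below is a plain-Python approximation of that schedule, not the trainer's implementation. The total of 1300 optimizer steps is an assumption: 208 samples at batch size 16 on a single device gives 13 steps per epoch, times 100 epochs.

```python
def linear_lr(step, base_lr=5e-05, total_steps=1300, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero (illustrative)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start, at the two logged steps, and at the end of training.
for step in (0, 500, 1000, 1300):
    print(step, linear_lr(step))
```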




	
		
	
	
Training Logs

| Epoch | Step | Training Loss |
|:--------|:-----|:--------------|
| 38.4615 | 500 | 0.0037 |
| 76.9231 | 1000 | 0.0003 |


	


	
		
	
	
Training Time
	


Training: 3.8 minutes
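The fractional epoch values in the training logs follow from the dataset size and batch size. The arithmetic below is a quick check that assumes a single device, with `gradient_accumulation_steps: 1` and `dataloader_drop_last: False` as listed in the hyperparameters; the variable names are illustrative.

```python
import math

num_samples = 208        # training samples
batch_size = 16          # per_device_train_batch_size
grad_accum = 1           # gradient_accumulation_steps
num_epochs = 100         # num_train_epochs

# One optimizer step per batch; the last partial batch (if any) still counts.
steps_per_epoch = math.ceil(num_samples / (batch_size * grad_accum))
total_steps = steps_per_epoch * num_epochs

logged_epochs = {step: round(step / steps_per_epoch, 4) for step in (500, 1000)}
print(steps_per_epoch, total_steps, logged_epochs)
```

Under these assumptions there are 13 steps per epoch and 1300 steps in total, and steps 500 and 1000 fall at epochs 38.4615 and 76.9231, which matches the logged values.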


	
		
	
	
Framework Versions
	


Python: 3.12.13
Sentence Transformers: 5.4.1
Transformers: 5.0.0
PyTorch: 2.10.0+cu128
Accelerate: 1.13.0
Datasets: 4.0.0
Tokenizers: 0.22.2


	
		
	
	
Citation

BibTeX

Sentence Transformers
	

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}


	
		
	
	
ContrastiveLoss
	

@inproceedings{hadsell2006dimensionality,
    author={Hadsell, R. and Chopra, S. and LeCun, Y.},
    booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
    title={Dimensionality Reduction by Learning an Invariant Mapping},
    year={2006},
    volume={2},
    number={},
    pages={1735-1742},
    doi={10.1109/CVPR.2006.100}
}


Downloads last month: 6

Safetensors
Model size: 22.7M params
Tensor type: F32

