SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for retrieval.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
- Supported Modality: Text
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
(1): Pooling({'embedding_dimension': 384, 'pooling_mode': 'mean', 'include_prompt': True})
(2): Normalize({})
)
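To make the three modules above concrete, here is a minimal pure-Python sketch of what the Pooling (mean mode) and Normalize stages compute once the Transformer has produced token embeddings. The helper names `mean_pool` and `l2_normalize` and the toy 2-dimensional vectors are illustrative assumptions, not part of the library; the real model works on 384-dimensional tensors.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                summed[i] += x
    return [s / max(count, 1) for s in summed]

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

# Toy example: three token vectors, the last one is padding.
tokens = [[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)            # mean of the two real tokens
sentence_embedding = l2_normalize(pooled)   # unit-length sentence vector
print(pooled, sentence_embedding)
```

Because the final stage normalizes every embedding to unit length, cosine similarity between two embeddings reduces to their dot product.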
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ronit01/golden_rag_tuned_minilm_10")
# Run inference
sentences = [
"How do you set up and use Pinecone as an external vector store in RapidFire AI's RAG pipeline, including configuring create, read, and update modes?",
'just start rapidfireai again with the above command.\n\nIf the start command fails for whatever reason, wait for half a minute and rerun it.\nFor diagnostics and common fixes (including Linux/macOS and Windows steps), see :doc:`Troubleshooting </troubleshooting>`.\n',
'Confidence Intervals\n--------------------\n\nThe data points in the evals dataset are **assigned to shards uniformly randomly**, i.e., \nRapidFire AI performs sampling without replacement. \nBased on that, it supports 3 strategies to calculate confidence intervals for projected estimates of metrics. \nYou can indicate the confidence level (we recommend 95%) and whether to perform "finite population correction" (FPC) or not. \nThese values can be specified under the key :code:`"online_strategy_kwargs"` in your config dictionary as illustrated below.\n\n.. code-block:: python\n\n # Based on FiQA RAG tutorial use case\n "online_strategy_kwargs": {\n "strategy_name": "normal",\n "confidence_level": 0.95,\n "use_fpc": True,\n },\n\nNotation \n^^^^^^^\n\n* :math:`N` = Total population size (total number of queries in eval set)\n* :math:`n` = Sample size (number of queries processed so far)\n* :math:`\\hat{p}` = Observed sample proportion or average for an algebraic metric\n* :math:`\\bar{X}` = Sample mean for a distributive metric\n* :math:`\\widehat{T}` = Estimated population total for a distributive metric\n* :math:`\\text{Var}(\\widehat{T})` = Variance of the above estimated population total\n* :math:`\\text{SE}` = Standard error (measure of estimate uncertainty)\n* :math:`\\text{CI}` = Confidence interval\n* :math:`z` = Z-score for confidence level (1.96 for 95% confidence; used in Normal and Wilson)\n* :math:`\\alpha` = Significance level (0.05 for 95% confidence)\n* :math:`n_{\\text{eff}}` = Effective sample size (adjusted for FPC in Wilson)\n* :math:`a, b` = Lower and upper bounds of metric value range\n* :math:`R` = Range width, :math:`R = b - a`\n* :math:`\\varepsilon` = Margin of error (half-width of confidence interval for Hoeffding)\n* :math:`\\varepsilon_{\\bar{X}}` = Margin of error for sample mean (Hoeffding distributive)\n* :math:`\\text{FPC}` = Finite population correction factor\n\n\nFinite Population Correction (FPC)\n^^^^^^^^^^^^^^^^^^^^^^\n\nWhen sampling without replacement from finite populations, enabling FPC \nmultiplies the standard error (SE) by :math:`\\text{FPC} = \\sqrt{(N-n)/(N-1)}` \nwhere :math:`N` is population size and :math:`n` is sample size.\n\n\nNormal Approximation\n^^^^^^^^^^^^^^^^^^^\n\nThis is the default strategy, and it uses the Central Limit Theorem. \nIt is suitable for most cases with non-trivial sample sizes (n > 30). \nIt provides tight intervals when the statistical assumptions hold.\n\n* For algebraic metrics:\n\n.. math::\n\n \\text{SE}_{\\hat{p}} = \\sqrt{\\frac{\\hat{p}(1-\\hat{p})}{n}} \\times \\text{FPC}\n\n \\text{CI} = \\hat{p} \\pm 1.96 \\cdot \\text{SE}_{\\hat{p}}\n\n\n* For distributive metrics: \n\nEstimate population total :math:`\\widehat{T} = N\\bar{X}` with \nvariance :math:`\\text{Var}(\\widehat{T}) = N^2 \\cdot \\bar{X}(1-\\bar{X})/n` (FPC-adjusted).\n\n\nWilson Score\n^^^^^^^^^^^\n\nThis strategy is better for small sample sizes or metrics near 0/1 boundaries. \nIt is more robust than Normal Approximation for extreme proportions. \n\n* For algebraic metrics:\n\n.. math::\n\n \\text{center} = \\frac{\\hat{p} + z^2/(2n_{\\text{eff}})}{1 + z^2/n_{\\text{eff}}}\n\n \\text{margin} = \\frac{z\\sqrt{\\hat{p}(1-\\hat{p})/n_{\\text{eff}} + z^2/(4n_{\\text{eff}}^2)}}{1 + z^2/n_{\\text{eff}}}\n\nwhere :math:`n_{\\text{eff}} = n/\\text{FPC}^2` when using FPC. \nThe Wilson confidence interval is then :math:`[\\text{center} - \\text{margin}, \\text{center} + \\text{margin}]`,\nclamped to [0, 1].\n\n* For distributive metrics, this falls back to Normal Approximation. \n\n\n\nHoeffding Bounds\n^^^^^^^^^^^\n\nThis strategy is best for maximum safety (guaranteed coverage). It makes no distributional assumptions, \nbut that also means its intervals are typically quite loose.\n\n.. math::\n\n \\varepsilon = (b-a)\\sqrt{\\frac{\\ln(2/\\alpha)}{2n}} \\times \\text{FPC}\n\n \\text{CI} = [\\hat{p} - \\varepsilon, \\hat{p} + \\varepsilon]\n\nFor distributive metrics with range :math:`R=b-a`, it computes :math:`\\varepsilon_{\\bar{X}} = R\\sqrt{\\ln(2/\\alpha)/(2n)}` \nand then scales to population total.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5341, 0.3671],
# [0.5341, 1.0000, 0.3282],
# [0.3671, 0.3282, 1.0000]])
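Since the model is intended for retrieval, the similarity scores are typically used to rank candidate passages for a query. The sketch below shows that ranking step in plain Python, using the first row of the similarity matrix printed above (query versus the two passages). The helper `rank_passages` and the short passage labels are hypothetical names chosen for illustration; in practice the scores would come from `model.similarity(query_embedding, passage_embeddings)`.

```python
def rank_passages(scores, passages, top_k=2):
    """Return the top_k (passage, score) pairs, highest score first."""
    ranked = sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]

# Scores of the query against each passage, copied from the matrix above.
query_scores = [0.5341, 0.3671]
passage_labels = ["restart and troubleshooting passage", "confidence intervals passage"]

ranked = rank_passages(query_scores, passage_labels, top_k=2)
for label, score in ranked:
    print(f"{score:.4f}  {label}")
```

The ranking only orders the candidates relative to each other; a high rank does not by itself mean the passage answers the query, so a score threshold may still be useful downstream.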
Training Details
Training Dataset
Unnamed Dataset
Size: 444 training samples
Columns: sentence_0, sentence_1, and label
Approximate statistics based on the first 444 samples:
| | sentence_0 | sentence_1 | label |
|:--------|:-----------|:-----------|:------|
| type | string | string | float |
| details | min: 15 tokens; mean: 41.97 tokens; max: 70 tokens | min: 36 tokens; mean: 225.67 tokens; max: 256 tokens | min: 0.0; mean: 0.25; max: 1.0 |
Samples:
| sentence_0 | sentence_1 | label |
|:-----------|:-----------|:------|
| How do you set up and run an SFT fine-tuning experiment from scratch using RapidFire AI's full installation, from installing the package through launching training and monitoring results? | online_strategy_kwargs : dict[str, Any], optional Parameters for evals online aggregation strategy. The dictionary must include the following keys: * :code:`"strategy_name"` (str) - Must be :code:`"normal"`, :code:`"wilson"`, or :code:`"hoeffding"`. * :code:`"confidence_level"` (float) - Confidence level for confidence intervals on metrics. Must be in [0,1]. Default is 0.95 (95%). * :code:`"use_fpc"` (bool) - Whether to apply finite population correction. Default is :code:`True`. | 0.0 |
| How do FAISS, Pinecone, and PGVector compare as vector store options in RapidFire AI in terms of supported modes, persistence, configuration requirements, and GPU support? | Run Fit The main function to launch training (including LLM fine-tuning and post-training) and evaluation for a given config group in one go. See :doc:`the Multi-Config Specification page</configs>` for more details on how to construct a config group. .. py:function:: run_fit(self, param_config: Any, create_model_fn: Callable, train_dataset: Dataset, eval_dataset: Dataset, num_chunks: int, seed: int=42, num_gpus: int) -> None: :param param_config: A train config knob dictionary, a generated config group, or a :code:`list` of configs or config groups :type param_config: Train config-group or list as described in :doc:`the Multi-Config Specification page</configs>` :param create_model_fn: User-given function to create a model instance; a single cfg is passed as input by the system :type create_model_fn: Callable :param train_dataset: Training dataset :type train_dataset: Dataset :param eval_dataset: Evaluation dataset to measure eval metrics :type eval_dataset: Dat... | 0.0 |
| How do you configure and launch a multi-config RAG evaluation experiment using run_evals(), including defining all required user-provided functions? | What Makes RapidFire AI Different? The crux of RapidFire AI's difference is in its adaptive execution engine: it enables "interruptible" execution of configurations across GPUs/CPUs. To do so, it first shards the training and/or evaluation dataset randomly into "chunks" (also called "shards"). Then instead of waiting for a run to see the whole dataset for all epochs (for SFT/RFT) or for full eval metrics calculation (for RAG evals), RapidFire AI schedules all runs on one shard at a time, and then cycles through all shards. Suppose you have only 1 GPU, say an A100 or H100, and you want to run SFT on a Llama model. Current tools force you to run one config after another sequentially as shown in the (simplified) illustration below. In contrast, by operating on shards, RapidFire AI offers a far more concurrent learning experience by automatically swapping adapters (and base models, if needed) across GPU(s) and DRAM. It does this via efficient shared memory-b... | 0.0 |
Loss:
ContrastiveLoss with these parameters:
{
"distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
"margin": 0.5,
"size_average": true
}
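As a rough illustration of what these loss parameters mean, the sketch below computes a contrastive loss in plain Python with cosine distance (1 minus cosine similarity), a margin of 0.5, and averaging over the batch. It follows the usual formulation from Hadsell et al. (2006): positive pairs (label 1) are penalized by their squared distance, and negative pairs (label 0) are penalized only when they are closer than the margin. The function names and toy vectors are assumptions for illustration; this is not the library's implementation.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def contrastive_loss(pairs, labels, margin=0.5, size_average=True):
    """Contrastive loss over (u, v) embedding pairs with 0/1 labels."""
    losses = []
    for (u, v), y in zip(pairs, labels):
        d = cosine_distance(u, v)
        positive_term = y * d ** 2                          # pull matching pairs together
        negative_term = (1 - y) * max(0.0, margin - d) ** 2  # push non-matching pairs apart
        losses.append(0.5 * (positive_term + negative_term))
    total = sum(losses)
    return total / len(losses) if size_average else total

same = ([1.0, 0.0], [1.0, 0.0])        # distance 0
orthogonal = ([1.0, 0.0], [0.0, 1.0])  # distance 1

print(contrastive_loss([same], [1]))        # matching pair, already identical
print(contrastive_loss([same], [0]))        # non-matching pair that is too close
print(contrastive_loss([orthogonal], [0]))  # non-matching pair beyond the margin
```

With a mean label of 0.25 in the training set, most pairs are negatives, so the margin term is the one exercised most often under this formulation.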
Training Hyperparameters
Non-Default Hyperparameters
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
num_train_epochs: 10
multi_dataset_batch_sampler: round_robin
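For a sense of scale, the number of optimizer steps implied by these settings can be worked out from the dataset size reported above (444 samples). The sketch assumes a single training device, no gradient accumulation, and that the last partial batch is kept; the device count is not stated in this card, so treat the result as an estimate.

```python
import math

num_samples = 444      # training samples reported in this card
batch_size = 16        # per_device_train_batch_size
num_epochs = 10        # num_train_epochs
num_devices = 1        # assumption: single device

# With the last partial batch kept, each epoch needs ceil(samples / batch) steps.
steps_per_epoch = math.ceil(num_samples / (batch_size * num_devices))
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size * num_devices
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch, last_batch_size, total_steps)
```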
All Hyperparameters
do_predict: False
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 10
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: None
warmup_ratio: None
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
enable_jit_checkpoint: False
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
use_cpu: False
seed: 42
data_seed: None
bf16: False
fp16: False
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: -1
ddp_backend: None
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
group_by_length: False
length_column_name: length
project: huggingface
trackio_space_id: trackio
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_for_metrics: []
eval_do_concat_batches: True
auto_find_batch_size: False
full_determinism: False
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_num_input_tokens_seen: no
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: True
use_cache: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin
router_mapping: {}
learning_rate_mapping: {}
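The combination of `learning_rate: 5e-05`, `lr_scheduler_type: linear`, and `warmup_steps: 0` means the learning rate decays linearly from its initial value to zero over training. The sketch below reproduces that schedule in plain Python. The total of 280 steps is an assumption derived from 444 samples, batch size 16, and 10 epochs on a single device; `linear_lr` is a hypothetical helper, not a library function.

```python
def linear_lr(step, base_lr=5e-05, total_steps=280, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start, midpoint, and end of training.
for step in (0, 140, 280):
    print(step, linear_lr(step))
```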
Training Time
- Training: 53.3 seconds
Framework Versions
- Python: 3.12.13
- Sentence Transformers: 5.4.1
- Transformers: 5.0.0
- PyTorch: 2.10.0+cu128
- Accelerate: 1.13.0
- Datasets: 4.0.0
- Tokenizers: 0.22.2
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
ContrastiveLoss
@inproceedings{hadsell2006dimensionality,
author={Hadsell, R. and Chopra, S. and LeCun, Y.},
booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
title={Dimensionality Reduction by Learning an Invariant Mapping},
year={2006},
volume={2},
number={},
pages={1735-1742},
doi={10.1109/CVPR.2006.100}
}
Model tree for ronit01/golden_rag_tuned_minilm_10
Base model
nreimers/MiniLM-L6-H384-uncased