SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for retrieval.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
- Supported Modality: Text
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
  (1): Pooling({'embedding_dimension': 384, 'pooling_mode': 'mean', 'include_prompt': True})
  (2): Normalize({})
)
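The Pooling and Normalize modules listed above can be illustrated with a small, dependency-free sketch. It is not the library's implementation: the function names `mean_pool` and `l2_normalize` are hypothetical, and the toy vectors have 4 dimensions rather than 384. It only shows the idea of averaging non-padding token vectors and then scaling the result to unit length.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    return [value / max(count, 1) for value in total]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

# Toy input: three token vectors of dimension 4; the last token is padding.
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 4.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the final vector has unit length, the cosine similarity between two such embeddings reduces to their dot product.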
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ronit01/rag_tuned_minilm_mnr")
# Run inference
sentences = [
    'What must the create_model_fn return, and when is it invoked by RapidFire AI?',
    'Create Model Function\n------\n\nMandatory user-provided function to create HuggingFace model and tokenizer objects based on the \nmodel type(s) and name(s) given in the :code:`RFModelConfig` and multi-config specification. \nAlso read :doc:`the LoRA and Model Configs page</models>`.\nA model can be imported from the Hugging Face model hub or read from a local checkpoint file. \n\nIt is passed to :func:`run_fit()` directly. Also read :doc:`the Experiment page</experiment>`.\n\nThis function is invoked when a trainer object is created for each run. \n\n\n.. py:function:: create_model_fn(model_config: Dict[str, Any]) -> Tuple[transformers.PreTrainedModel, transformers.PreTrainedTokenizer]\n\n :param model_config: Dictionary injected by RapidFire AI into this user-defined function with all key-value pairs for one model config output by the config-group generator.\n :type model_config: Dict[str, Any]\n\n :return: Tuple containing the initialized Hugging Face model (e.g., ``AutoModelForCausalLM``, ``AutoModelForSequenceClassification``) and tokenizer (e.g., ``AutoTokenizer``, ``PreTrainedTokenizer``) objects\n :rtype: Tuple[transformers.PreTrainedModel, transformers.PreTrainedTokenizer]\n\n\n**Example:**\n\n.. '
    'code-block:: python\n\n\t# From the SFT tutorial notebook\n\tdef sample_create_model(model_config):\n \t"""Function to create model object for any given config; must return tuple of (model, tokenizer)"""\n\t\tfrom transformers import AutoModelForCausalLM, AutoTokenizer, AutoModelForSeq2SeqLM, AutoModelForMaskedLM\n\t\t\n\t\tmodel_name = model_config["model_name"]\n\t\tmodel_type = model_config["model_type"]\n\t\tmodel_kwargs = model_config["model_kwargs"]\n\t\t\n\t\tif model_type == "causal_lm":\n\t\t\tmodel = AutoModelForCausalLM.from_pretrained(model_name, **model_kwargs)\n\t\telif model_type == "seq2seq_lm":\n\t\t\tmodel = AutoModelForSeq2SeqLM.from_pretrained(model_name, **model_kwargs)\n\t\telif model_type == "masked_lm":\n\t\t\tmodel = AutoModelForMaskedLM.from_pretrained(model_name, **model_kwargs)\n\t\telif model_type == "custom":\n # Handle custom model loading logic, e.g., loading your own checkpoints\n # model = ... \n\t\t\tpass\n\t\telse:\n\t\t\t# Default to causal LM\n\t\t\tmodel = AutoModelForCausalLM.from_pretrained(model_name, **model_kwargs)\n\n\t\ttokenizer = AutoTokenizer.from_pretrained(model_name)\n\n\t\treturn (model,tokenizer)',
    'We currently support two common config group generators: :func:`RFGridSearch()` for grid search \nand :func:`RFRandomSearch()` for random search. \n\nMore support for AutoML heuristics such as SHA, HyperOpt, as well as an integration with \nthe popular AutoML library Optuna are coming soon. \nLikewise for RAG/context engineering, we also plan to support the AutoML heuristic syftr.\n\n\n.. py:function:: RFGridSearch(configs: Dict[str, Any] | List[Dict[str, Any]], trainer_type: str = "SFT" | "DPO" | "GRPO" | None)\n\n\t:param configs: A config dictionary with :func:`List()` for at least one knob; can be a list of such config dictionaries too.\n\t:type configs: Dict[str, Any] | List[Dict[str, Any]]\n\n\t:param trainer_type: The fine-tuning/post-training control flow to use: "SFT", "DPO", or "GRPO". Skip this argument for :func:`run_evals()`.\n\t:type trainer_type: str, optional \n\n\n.. py:function:: RFRandomSearch(configs: Dict[str, Any], trainer_type: str = "SFT" | "DPO" | "GRPO" | None, num_runs: int, seed: int = 42)\n\n\t:param configs: A config dictionary with :func:`List()` or :func:`Range()` for at least one knob.\n\t:type configs: Dict[str, Any]\n\n\t:param trainer_type: The fine-tuning/post-training control flow to use: "SFT", "DPO", or "GRPO". Skip this argument for :func:`run_evals()`.\n\t:type trainer_type: str, optional\n\n\t:param num_runs: Number of runs (full combinations of knob values) to sample in total.\n\t:type num_runs: int\n\n\t:param seed: Seed for random sampling of knob values to construct combinations. Default is 42.\n\t:type seed: int, optional \n\n\n**Notes:**\n\nFor :func:`RFGridSearch()`, each knob can have either a single value or a :func:`List()` of values but no knob \nshould have :func:`Range()` of values; otherwise, it will error out.\n\nFor :func:`RFRandomSearch()`, each knob can have either a single value, or a :func:`List()` of values, or a \n :func:`Range()` of values. '
    'The semantics of sampling are independently-identically-distributed (IID), i.e.,\nwe uniformly randomly pick a value from each discrete set and from each continuous set to construct the \nknob combination for one run. \nThen we repeat that sampling process in an IID way to accumulate :code:`num_runs` distinct combinations.\n\nNote that the return types of the config group generators are internal to RapidFire AI and they are usable only \nwithin the context of :func:`run_fit()` or :func:`run_evals()` in the :class:`Experiment` class.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5200, 0.2171],
# [0.5200, 1.0000, 0.5168],
# [0.2171, 0.5168, 1.0000]])
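For retrieval, the first row of the similarity matrix is the one that matters: it scores the query against each passage. The sketch below reuses the values printed above as plain Python lists; the helper name `rank_documents` is hypothetical and is not part of sentence-transformers.

```python
# Similarity matrix copied from the printed output (row/column 0 is the query).
similarities = [
    [1.0000, 0.5200, 0.2171],
    [0.5200, 1.0000, 0.5168],
    [0.2171, 0.5168, 1.0000],
]

def rank_documents(query_row, query_index):
    """Return (index, score) pairs for every non-query entry, best match first."""
    scored = [(idx, score) for idx, score in enumerate(query_row) if idx != query_index]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

ranking = rank_documents(similarities[0], query_index=0)
print(ranking)  # [(1, 0.52), (2, 0.2171)]
```

The passage at index 1, which documents `create_model_fn`, ranks above the config-group-generator passage at index 2, matching the intent of the query.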
Training Details
Training Dataset
Unnamed Dataset
Size: 52 training samples
Columns: sentence_0 and sentence_1
Approximate statistics based on the first 52 samples:
- sentence_0: type string; min 11 tokens, mean 24.87 tokens, max 34 tokens
- sentence_1: type string; min 31 tokens, mean 216.15 tokens, max 256 tokens
Samples:
Sample 1
sentence_0: What are the four tabs on the main Experiments page of RapidFire AI's MLflow-fork dashboard?
sentence_1:
The main "Experiments" page on the dashboard has 4 main tabs:
- Table
- Chart
- Experiment Log
- Interactive Control (IC) Log
The screenshot below shows the "Table" view of an experiment with all its runs. Each run represents one model with one set of config knob values, which is standard dashboard semantics.
.. raw:: html
<img src="_static/mlflow-1-table.png" alt="Table view of runs metadata" style="cursor: zoom-in; max-width: 100%;" onclick="this.requestFullscreen()">
Metrics Plots
The screenshot below shows the "Chart" view of an experiment with all its runs. Each plot corresponds to a metric, spanning :code:`loss` on the training set and evaluation set, as well all named metrics returned in your :func:`compute_metrics()` function in the trainer config.
We call attention to 3 key aspects of the visualizations here:
- The x-axis "Step" for the mini batch-level plots represents absolute number of minibatches seen by that run. So, if the :code:`bat...

Sample 2
sentence_0: How do reward functions work in RapidFire AI for GRPO training, and what arguments does TRL inject into them?
sentence_1:
Reward Functions
User-provided reward function(s) needed for GRPO. You can create as many reward functions as you like with custom names.
A list of such functions is passed to the :code:`reward_funcs` argument of :class:`RFModelConfig`. Also read: :doc:`the LoRA and Model Configs page</models>`. You can create multiple variants of this list with different subsets of functions and pass them all as a single :code:`List` to your :class:`RFModelConfig` to create a multi-config specification.
These functions are invoked by the underlying HF trainer on the generated outputs on the fly.
.. py:function:: reward_function(prompts, completions, completions_ids, trainer_state, **kwargs) -> List[float]
:param prompts: List of input prompts that produced the completions.
:type prompts: List[str] | List[List[Dict[str, str]]]
:param completions: List of generated completions corresponding to above prompts.
:type completions: List[str] | List[List[Dict[str, str]]]
:param completi...

Sample 3
sentence_0: What are the five reward functions used in the GRPO tutorial for mathematical reasoning, and what does each one reward?
sentence_1:
This tutorial shows Group Relative Policy Optimization (GRPO) to improve mathematical reasoning capabilities. GRPO is an RL approach that uses multiple reward functions to provide richer training signals.
It uses the GSM8K mathematical reasoning dataset; `see its details on Hugging Face <https://huggingface.co/datasets/openai/gsm8k>`__. We use a sample of 500 training examples and 100 evaluation examples for tractable demo runtimes.
The prompt format includes a system message instructing the model to respond with structured reasoning and answer tags, encouraging step-by-step mathematical problem solving with clear formatting.
Model, Adapter, and Trainer Knobs
We compare 3 different base model architectures: Llama-3.1-8B-Instruct, Qwen2.5-3B-Instruct, and Qwen2.5-7B-Instruct, all using 4-bit quantization for efficient training.
All models use the same medium capacity LoRA configuration, targeting only 2 modules. We compare two different learning rates for the smaller Qw...

Loss:
MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false,
    "directions": ["query_to_doc"],
    "partition_mode": "joint",
    "hardness_mode": null,
    "hardness_strength": 0.0
}
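To make the loss parameters concrete, here is a minimal, dependency-free sketch of a multiple-negatives ranking objective in the query-to-document direction. It is an illustration under stated assumptions, not the sentence-transformers implementation: it ignores options such as `gather_across_devices`, `partition_mode`, and the hardness settings, and the names `cos_sim` and `mnr_loss` are hypothetical. For each query, its paired document is the positive and every other document in the batch acts as a negative; cosine similarities are multiplied by the scale (20.0) and passed through a softmax cross-entropy.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(queries, docs, scale=20.0):
    """Mean cross-entropy where docs[i] is the positive for queries[i]."""
    losses = []
    for i, query in enumerate(queries):
        logits = [scale * cos_sim(query, doc) for doc in docs]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_norm = peak + math.log(sum(math.exp(value - peak) for value in logits))
        losses.append(log_norm - logits[i])
    return sum(losses) / len(losses)

# Toy 2-dimensional embeddings (hypothetical values, not model outputs).
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned_docs = [[0.9, 0.1], [0.2, 0.8]]   # each query is closest to its own document
swapped_docs = [[0.2, 0.8], [0.9, 0.1]]   # pairs deliberately mismatched

aligned = mnr_loss(queries, aligned_docs)
swapped = mnr_loss(queries, swapped_docs)
print(aligned < swapped)  # True
```

A larger scale sharpens the softmax, so a small gap in cosine similarity between the positive and the negatives produces a larger change in the loss.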
Training Hyperparameters
Non-Default Hyperparameters
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
num_train_epochs: 1
multi_dataset_batch_sampler: round_robin
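These settings imply a very short training run. The arithmetic below is a sketch that assumes a single training device (the card does not state the device count) and uses the values `gradient_accumulation_steps: 1` and `dataloader_drop_last: False` from the full hyperparameter list.

```python
import math

num_samples = 52                    # training set size reported in this card
per_device_train_batch_size = 16
num_train_epochs = 1
num_devices = 1                     # assumption: not stated in the card
drop_last = False                   # dataloader_drop_last from the full list

effective_batch_size = per_device_train_batch_size * num_devices
if drop_last:
    steps_per_epoch = num_samples // effective_batch_size
else:
    steps_per_epoch = math.ceil(num_samples / effective_batch_size)

total_steps = steps_per_epoch * num_train_epochs
last_batch_size = num_samples - (steps_per_epoch - 1) * effective_batch_size
print(total_steps, last_batch_size)  # 4 4
```

Under these assumptions the model sees three full batches of 16 pairs and one final batch of 4 pairs, so with in-batch negatives each query in that last batch is contrasted against only 3 other passages.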
All Hyperparameters
do_predict: False
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 1
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: None
warmup_ratio: None
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
enable_jit_checkpoint: False
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
use_cpu: False
seed: 42
data_seed: None
bf16: False
fp16: False
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: -1
ddp_backend: None
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
group_by_length: False
length_column_name: length
project: huggingface
trackio_space_id: trackio
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_for_metrics: []
eval_do_concat_batches: True
auto_find_batch_size: False
full_determinism: False
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_num_input_tokens_seen: no
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: True
use_cache: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin
router_mapping: {}
learning_rate_mapping: {}
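The learning-rate settings in this list (`learning_rate: 5e-05`, `lr_scheduler_type: linear`, `warmup_steps: 0`) describe a linear decay from the base rate toward zero. The sketch below reproduces that shape in plain Python; the function name `linear_lr` is hypothetical, and the total of 4 optimizer steps is an assumption derived from 52 samples at batch size 16 on a single device.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate applied at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

total_steps = 4  # assumption: ceil(52 / 16) steps on one device
schedule = [linear_lr(step, total_steps) for step in range(total_steps)]
for step, rate in enumerate(schedule):
    print(step, rate)
```

With no warmup, the first step uses the full base rate and each later step drops by a quarter of it, so the rate would reach zero only after the final update.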
Training Time
- Training: 1.0 seconds
Framework Versions
- Python: 3.12.13
- Sentence Transformers: 5.4.1
- Transformers: 5.0.0
- PyTorch: 2.10.0+cu128
- Accelerate: 1.13.0
- Datasets: 4.0.0
- Tokenizers: 0.22.2
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{oord2019representationlearningcontrastivepredictive,
title={Representation Learning with Contrastive Predictive Coding},
author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
year={2019},
eprint={1807.03748},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/1807.03748},
}
Model tree for ronit01/rag_tuned_minilm_mnr
Base model
nreimers/MiniLM-L6-H384-uncased