SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for retrieval.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
- Supported Modality: Text
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
(1): Pooling({'embedding_dimension': 384, 'pooling_mode': 'mean', 'include_prompt': True})
(2): Normalize({})
)
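To make the last two modules concrete, here is a minimal pure-Python sketch of mean pooling followed by L2 normalization. It is an illustration only, not the library's code: the function names `mean_pool` and `l2_normalize` are hypothetical, the toy vectors have 4 dimensions instead of 384, and the attention-mask handling is an assumption about how padding is excluded from the average.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, skipping padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    return [value / max(count, 1) for value in total]

def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

# Toy input: 3 token positions (the last one is padding), 4 dimensions.
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
embedding = l2_normalize(pooled)
print(pooled)                          # [2.0, 1.0, 0.0, 0.0]
print(sum(v * v for v in embedding))   # approximately 1.0
```

Because the output is unit length, cosine similarity between two embeddings reduces to a plain dot product.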
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ronit01/rag_tuned_minilm_10")
# Run inference
sentences = [
'What are the two knob set generators currently supported by RapidFire AI for creating multi-config specifications?',
'We currently support two common knob set generators: :func:`List()` for a discrete \nset of values and :func:`Range()` for sampling from a continuous value interval.\n\n\n.. py:function:: List(values: List[Any])\n\n\t:param values: List of discrete values for a knob; all values must be the same python data type.\n\t:type values: List[Any]\n\n\n.. py:function:: Range(start: int | float, end: int | float, dtype: str = "int" | "float")\n\n\t:param start: Lower bound of range interval.\n\t:type start: int | float\n\n\t:param end: Upper bound of range interval.\n\t:type end: int | float\n\n\t:param dtype: Data type of value to be sampled, either :code:`"int"` or :code:`"float"`.\n\t:type dtype: str\n\n\n**Notes:**\n\nAs of this writing, :func:`Range()` performs uniform sampling within the given interval. \nWe plan to continue expanding this API and add more functionality on this front based on feedback.',
'Note that if you plan to use only OpenAI APIs and not self-hosted models (for embedding or generation), you do NOT need GPUs on your machine. \nBut you must provide a valid OpenAI API key via a config argument as shown in the GSM8K and SciFact tutorial notebooks.\n\n\nStep 1: Install dependencies and package\n-----------------------\n\nObtain the RapidFire AI OSS package from pypi (includes all dependencies) and ensure it is installed correctly.\n\n.. important::\n\n Requires Python 3.12+. Ensure that ``python3`` resolves to Python 3.12 before creating the venv.\n\n.. code-block:: bash\n\n python3 --version # must be 3.12.x\n python3 -m venv .venv\n source .venv/bin/activate\n\n pip install rapidfireai\n\n rapidfireai --version\n # Verify it prints the following:\n # RapidFire AI 0..14.0\n\n # Due to current issue: https://github.com/huggingface/xet-core/issues/527\n pip uninstall -y hf-xet\n\n\nThe tutorial notebooks for RAG evals do not use any gated models from Hugging Face.\nIf you want to access gated models, provide your Hugging Face account token.\nFor more details on that, :doc:`see Step 1 here</walkthroughft>`.\n\n\nStep 2: Initialize and start RapidFire AI server\n------------\n\nRun the following commands to initialize rapidfireai to use the correct dependencies for RAG evals:\n\n.. code-block:: bash\n\n rapidfireai init --evals\n # It will install specific dependencies and initialize rapidfireai for RAG evals\n\n\n.. note::\n You need to run init **only once** for a new venv or when switching GPU(s) on your machine. You do NOT need to run it after a reboot or for a new terminal tab.\n\n\nNext start RapidFire AI services: the frontend with the ML metrics dashboard and the API server. \nThe frontend URL shown below can be opened on your local browser.\n\n.. 
code-block:: bash\n\n rapidfireai start\n # It should print about 50 lines, including the following:\n # ...\n # RapidFire Frontend is ready\n # Open your browser and navigate to: http://0.0.0.0:8853\n # ...\n # Press Ctrl+C to stop all services\n\n.. important::\n\n Do NOT proceed until the start is successful with "Available endpoints" printed as above. Leave this terminal running while you work through the tutorial notebooks. \n\n\nIf you close the terminal in which you started rapidfireai or if you rebooted your machine, \njust start rapidfireai again with the above command.\n\nIf the start command fails for whatever reason, wait for half a minute and rerun it.\nFor diagnostics and common fixes (including Linux/macOS and Windows steps), see :doc:`Troubleshooting </troubleshooting>`.\n\n.. note::\n For RAG/context engineering experiments with :func:`run_evals()`, starting the server is **optional** and only needed if you want to see results on the ML metrics dashboard too. Just as results are shown in an in-notebook table too, IC Ops panel can be displayed in the notebook too, as illustrated below (Steps 5 and 6).',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.8854, 0.4689],
# [0.8854, 1.0000, 0.3173],
# [0.4689, 0.3173, 1.0000]])
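For retrieval, the similarity matrix is typically read row by row: the first sentence is the query and the remaining ones are candidate passages. The sketch below ranks passages using the scores printed above, copied into a plain list so it runs without downloading the model. The helper `rank_passages` is a hypothetical name used for illustration.

```python
# Similarity matrix copied from the example output above.
similarities = [
    [1.0000, 0.8854, 0.4689],
    [0.8854, 1.0000, 0.3173],
    [0.4689, 0.3173, 1.0000],
]

def rank_passages(sim_matrix, query_index=0):
    """Return (passage_index, score) pairs sorted by descending similarity to the query."""
    scores = [
        (j, score)
        for j, score in enumerate(sim_matrix[query_index])
        if j != query_index  # skip the query's similarity with itself
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

ranking = rank_passages(similarities)
print(ranking)  # [(1, 0.8854), (2, 0.4689)]
```

Here the passage describing the knob set generators (index 1) ranks above the installation walkthrough (index 2), which matches the question being asked.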
Training Details
Training Dataset
Unnamed Dataset
Size: 208 training samples
Columns: sentence_0, sentence_1, and label
Approximate statistics based on the first 208 samples:
| | sentence_0 | sentence_1 | label |
|:--------|:-----------|:-----------|:------|
| type | string | string | float |
| details | min: 11 tokens, mean: 24.87 tokens, max: 34 tokens | min: 31 tokens, mean: 222.9 tokens, max: 256 tokens | min: 0.0, mean: 0.25, max: 1.0 |
Samples:
| sentence_0 | sentence_1 | label |
|:-----------|:-----------|:------|
| What arguments does the RFModelConfig class accept for defining a model configuration in RapidFire AI? | Recovering Storage Space ------- If you run out of storage space on your machine due to experimenting with lots of LLMs, we recommend clearing out the ".cache" folder on your home directory that is created by Hugging Face to import the base models. One experiment's imported models are not needed for another; so, it is safe to delete them. If you want to reclaim even more space, look at the artifacts from your experiments and either delete some of the files or move them to other/remote storage. Note that when you use LoRA adapters, RapidFire AI saves only the trained adapters in the checkpoints of the runs, not the base models. | 0.0 |
| How do reward functions work in RapidFire AI for GRPO training, and what arguments does TRL inject into them? | Multi-GPU Model Partitioning with FSDP RapidFire AI supports automated large model partitioning across GPUs (on the same machine) via PyTorch's native FSDP. Provide the relevant FSDP deatils in a config knob, optionally along with the number of GPUs to use for that run. The following notebooks showcase the use of FSDP for SFT with the corresponding LLMs: FSDP Lite with base model TinyLlama-1.1B-Chat-v1.0. Needs at least 2x A10 GPUs or equivalent (48 GB total HBM) to work. View on GitHub <https://github.com/RapidFireAI/rapidfireai/blob/main/tutorial_notebooks/fine-tuning/rf-tutorial-sft-chatqa-fsdp-lite.ipynb>__ FSDP Regular with base model Qwen3-32B. Needs at least 4x A10 GPUs or equivalent (96 GB total HBM) to work. View on GitHub <https://github.com/RapidFireAI/rapidfireai/blob/main/tutorial_notebooks/fine-tuning/rf-tutorial-sft-chatqa.ipynb>__ FSDP Large with base model Llama-3-70B-Instruct. Needs at least 8x A10 GPUs or equivalent (192 GB total HBM) to work.... | 0.0 |
| What are the three use case tutorials provided for RAG and context engineering, and what type of workflow does each demonstrate? | This use case notebook features an all-closed model API workflow, with Open AI calls used for both embedding for generation. So, you do not need a GPU to run this notebook. | 1.0 |
Loss:
ContrastiveLoss with these parameters:
{
    "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
    "margin": 0.5,
    "size_average": true
}
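As a rough guide to what these parameters control, the sketch below implements the standard contrastive loss of Hadsell et al. (2006) with cosine distance in plain Python. It assumes the common formulation `0.5 * (label * d^2 + (1 - label) * max(0, margin - d)^2)` with `d = 1 - cosine_similarity`, averaged over the batch when `size_average` is true. The function names and toy vectors are illustrative and are not taken from the training code.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity, so identical directions give 0 and orthogonal ones give 1."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def contrastive_loss(pairs, margin=0.5, size_average=True):
    """pairs: iterable of (embedding_a, embedding_b, label), label 1.0 = match, 0.0 = mismatch."""
    losses = []
    for u, v, label in pairs:
        d = cosine_distance(u, v)
        positive_term = label * d ** 2                              # pull matching pairs together
        negative_term = (1.0 - label) * max(0.0, margin - d) ** 2   # push mismatches past the margin
        losses.append(0.5 * (positive_term + negative_term))
    total = sum(losses)
    return total / len(losses) if size_average else total

# Toy 2-dimensional embeddings (hypothetical values).
pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # matching pair, distance 0 -> loss 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # mismatch already beyond the margin -> loss 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # mismatch at distance 0 -> loss 0.5 * 0.5**2 = 0.125
]
print(contrastive_loss(pairs, size_average=False))  # 0.125
```

With a margin of 0.5, a non-matching pair stops contributing to the loss once its cosine similarity falls to 0.5 or below.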
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 10
- multi_dataset_batch_sampler: round_robin
All Hyperparameters
- do_predict: False
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: None
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- enable_jit_checkpoint: False
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- use_cpu: False
- seed: 42
- data_seed: None
- bf16: False
- fp16: False
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: -1
- ddp_backend: None
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_for_metrics: []
- eval_do_concat_batches: True
- auto_find_batch_size: False
- full_determinism: False
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- use_cache: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}
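The hyperparameters above imply a very short run. The following back-of-the-envelope sketch derives the number of optimizer steps and the learning rate at a given step under a linear schedule with no warmup. It assumes training on a single device (the card does not state the device count) and uses a simplified linear decay to zero; `linear_lr` is a hypothetical helper, not a library function.

```python
import math

num_samples = 208          # training set size reported in this card
batch_size = 16            # per_device_train_batch_size
num_epochs = 10            # num_train_epochs
grad_accum = 1             # gradient_accumulation_steps
base_lr = 5e-05            # learning_rate
num_devices = 1            # ASSUMPTION: single device; not stated in the card

# dataloader_drop_last is False, so a partial final batch would still count.
steps_per_epoch = math.ceil(num_samples / (batch_size * num_devices * grad_accum))
total_steps = steps_per_epoch * num_epochs

def linear_lr(step):
    """Simplified linear decay from base_lr to 0 with warmup_steps = 0."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)

print(steps_per_epoch, total_steps)  # 13 130
print(linear_lr(0))                  # 5e-05
print(linear_lr(total_steps))        # 0.0
```

Under these assumptions the model sees each of the 208 pairs ten times over 130 optimizer steps.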
Training Time
- Training: 24.5 seconds
Framework Versions
- Python: 3.12.13
- Sentence Transformers: 5.4.1
- Transformers: 5.0.0
- PyTorch: 2.10.0+cu128
- Accelerate: 1.13.0
- Datasets: 4.0.0
- Tokenizers: 0.22.2
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
ContrastiveLoss
@inproceedings{hadsell2006dimensionality,
author={Hadsell, R. and Chopra, S. and LeCun, Y.},
booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
title={Dimensionality Reduction by Learning an Invariant Mapping},
year={2006},
volume={2},
number={},
pages={1735-1742},
doi={10.1109/CVPR.2006.100}
}
Model tree for ronit01/rag_tuned_minilm_10
Base model
nreimers/MiniLM-L6-H384-uncased