# Community Tutorials

Community tutorials are made by active members of the Hugging Face community who want to share their knowledge and expertise with others. They are a great way to learn about the library and its features, and to get started with core classes and modalities.
## Language Models

### Tutorials

| Task | Class | Description | Author | Tutorial | Colab |
| --- | --- | --- | --- | --- | --- |
| Reinforcement Learning | [GRPOTrainer](/docs/trl/pr_5607/en/gspo_token#trl.GRPOTrainer) | Efficient Online Training with GRPO and vLLM in TRL | [Sergio Paniego](https://huggingface.co/sergiopaniego) | [Link](https://huggingface.co/learn/cookbook/grpo_vllm_online_training) | [Colab](https://colab.research.google.com/github/huggingface/cookbook/blob/main/notebooks/en/grpo_vllm_online_training.ipynb) |
| Reinforcement Learning | [GRPOTrainer](/docs/trl/pr_5607/en/gspo_token#trl.GRPOTrainer) | Post training an LLM for reasoning with GRPO in TRL | [Sergio Paniego](https://huggingface.co/sergiopaniego) | [Link](https://huggingface.co/learn/cookbook/fine_tuning_llm_grpo_trl) | [Colab](https://colab.research.google.com/github/huggingface/cookbook/blob/main/notebooks/en/fine_tuning_llm_grpo_trl.ipynb) |
| Reinforcement Learning | [GRPOTrainer](/docs/trl/pr_5607/en/gspo_token#trl.GRPOTrainer) | Mini-R1: Reproduce Deepseek R1 "aha moment", an RL tutorial | [Philipp Schmid](https://huggingface.co/philschmid) | [Link](https://www.philschmid.de/mini-deepseek-r1) | [Colab](https://colab.research.google.com/github/philschmid/deep-learning-pytorch-huggingface/blob/main/training/mini-deepseek-r1-aha-grpo.ipynb) |
| Reinforcement Learning | [GRPOTrainer](/docs/trl/pr_5607/en/gspo_token#trl.GRPOTrainer) | RL on LLaMA 3.1-8B with GRPO and Unsloth optimizations | [Andrea Manzoni](https://huggingface.co/AManzoni) | [Link](https://colab.research.google.com/github/amanzoni1/fine_tuning/blob/main/RL_LLama3_1_8B_GRPO.ipynb) | [Colab](https://colab.research.google.com/github/amanzoni1/fine_tuning/blob/main/RL_LLama3_1_8B_GRPO.ipynb) |
| Instruction tuning | [SFTTrainer](/docs/trl/pr_5607/en/sft_trainer#trl.SFTTrainer) | Fine-tuning Google Gemma LLMs using ChatML format with QLoRA | [Philipp Schmid](https://huggingface.co/philschmid) | [Link](https://www.philschmid.de/fine-tune-google-gemma) | [Colab](https://colab.research.google.com/github/philschmid/deep-learning-pytorch-huggingface/blob/main/training/gemma-lora-example.ipynb) |
| Structured Generation | [SFTTrainer](/docs/trl/pr_5607/en/sft_trainer#trl.SFTTrainer) | Fine-tuning Llama-2-7B to generate Persian product catalogs in JSON using QLoRA and PEFT | [Mohammadreza Esmaeilian](https://huggingface.co/Mohammadreza) | [Link](https://huggingface.co/learn/cookbook/en/fine_tuning_llm_to_generate_persian_product_catalogs_in_json_format) | [Colab](https://colab.research.google.com/github/huggingface/cookbook/blob/main/notebooks/en/fine_tuning_llm_to_generate_persian_product_catalogs_in_json_format.ipynb) |
| Preference Optimization | [DPOTrainer](/docs/trl/pr_5607/en/bema_for_reference_model#trl.DPOTrainer) | Align Mistral-7b using Direct Preference Optimization for human preference alignment | [Maxime Labonne](https://huggingface.co/mlabonne) | [Link](https://mlabonne.github.io/blog/posts/Fine_tune_Mistral_7b_with_DPO.html) | [Colab](https://colab.research.google.com/github/mlabonne/llm-course/blob/main/Fine_tune_a_Mistral_7b_model_with_DPO.ipynb) |
| Preference Optimization | [experimental.orpo.ORPOTrainer](/docs/trl/pr_5607/en/orpo_trainer#trl.experimental.orpo.ORPOTrainer) | Fine-tuning Llama 3 with ORPO combining instruction tuning and preference alignment | [Maxime Labonne](https://huggingface.co/mlabonne) | [Link](https://mlabonne.github.io/blog/posts/2024-04-19_Fine_tune_Llama_3_with_ORPO.html) | [Colab](https://colab.research.google.com/drive/1eHNWg9gnaXErdAa8_mcvjMupbSS6rDvi) |
| Instruction tuning | [SFTTrainer](/docs/trl/pr_5607/en/sft_trainer#trl.SFTTrainer) | How to fine-tune open LLMs in 2025 with Hugging Face | [Philipp Schmid](https://huggingface.co/philschmid) | [Link](https://www.philschmid.de/fine-tune-llms-in-2025) | [Colab](https://colab.research.google.com/github/philschmid/deep-learning-pytorch-huggingface/blob/main/training/fine-tune-llms-in-2025.ipynb) |
| Step-Level Reasoning | [GRPOTrainer](/docs/trl/pr_5607/en/gspo_token#trl.GRPOTrainer) | Supervised Reinforcement Learning (SRL) for step-by-step reasoning with vLLM | [Deepak Swaminathan](https://huggingface.co/s23deepak) | [Link](https://github.com/s23deepak/Supervised-Reinforcement-Learning) | [Colab](https://colab.research.google.com/github/s23deepak/Supervised-Reinforcement-Learning/blob/main/notebooks/srl_grpo_tutorial.ipynb) |
### Videos

| Task | Title | Author | Video |
| --- | --- | --- | --- |
| Instruction tuning | Fine-tuning open AI models using Hugging Face TRL | [Wietse Venema](https://huggingface.co/wietsevenema) | [Video](https://youtu.be/cnGyyM0vOes) |
| Instruction tuning | How to fine-tune a smol-LM with Hugging Face, TRL, and the smoltalk Dataset | [Mayurji](https://huggingface.co/iammayur) | [Video](https://youtu.be/jKdXv3BiLu0) |
> [!WARNING]
> The tutorial "How to fine-tune a smol-LM with Hugging Face, TRL, and the smoltalk Dataset" uses two deprecated features:
>
> - `SFTTrainer(..., tokenizer=tokenizer)`: Use `SFTTrainer(..., processing_class=tokenizer)` instead, or simply omit it (it will be inferred from the model).
> - `setup_chat_format(model, tokenizer)`: Use `SFTConfig(..., chat_template_path="Qwen/Qwen3-0.6B")`, where `chat_template_path` specifies the model whose chat template you want to copy.
## Vision Language Models

### Tutorials

| Task | Class | Description | Author | Tutorial | Colab |
| --- | --- | --- | --- | --- | --- |
| Visual QA | [SFTTrainer](/docs/trl/pr_5607/en/sft_trainer#trl.SFTTrainer) | Fine-tuning Qwen2-VL-7B for visual question answering on ChartQA dataset | [Sergio Paniego](https://huggingface.co/sergiopaniego) | [Link](https://huggingface.co/learn/cookbook/fine_tuning_vlm_trl) | [Colab](https://colab.research.google.com/github/huggingface/cookbook/blob/main/notebooks/en/fine_tuning_vlm_trl.ipynb) |
| Visual QA | [SFTTrainer](/docs/trl/pr_5607/en/sft_trainer#trl.SFTTrainer) | Fine-tuning SmolVLM with TRL on a consumer GPU | [Sergio Paniego](https://huggingface.co/sergiopaniego) | [Link](https://huggingface.co/learn/cookbook/fine_tuning_smol_vlm_sft_trl) | [Colab](https://colab.research.google.com/github/huggingface/cookbook/blob/main/notebooks/en/fine_tuning_smol_vlm_sft_trl.ipynb) |
| SEO Description | [SFTTrainer](/docs/trl/pr_5607/en/sft_trainer#trl.SFTTrainer) | Fine-tuning Qwen2-VL-7B for generating SEO-friendly descriptions from images | [Philipp Schmid](https://huggingface.co/philschmid) | [Link](https://www.philschmid.de/fine-tune-multimodal-llms-with-trl) | [Colab](https://colab.research.google.com/github/philschmid/deep-learning-pytorch-huggingface/blob/main/training/fine-tune-multimodal-llms-with-trl.ipynb) |
| Visual QA | [DPOTrainer](/docs/trl/pr_5607/en/bema_for_reference_model#trl.DPOTrainer) | PaliGemma 🤝 Direct Preference Optimization | [Merve Noyan](https://huggingface.co/merve) | [Link](https://github.com/merveenoyan/smol-vision/blob/main/PaliGemma_DPO.ipynb) | [Colab](https://colab.research.google.com/github/merveenoyan/smol-vision/blob/main/PaliGemma_DPO.ipynb) |
| Visual QA | [DPOTrainer](/docs/trl/pr_5607/en/bema_for_reference_model#trl.DPOTrainer) | Fine-tuning SmolVLM using direct preference optimization (DPO) with TRL on a consumer GPU | [Sergio Paniego](https://huggingface.co/sergiopaniego) | [Link](https://huggingface.co/learn/cookbook/fine_tuning_vlm_dpo_smolvlm_instruct) | [Colab](https://colab.research.google.com/github/huggingface/cookbook/blob/main/notebooks/en/fine_tuning_vlm_dpo_smolvlm_instruct.ipynb) |
| Object Detection Grounding | [SFTTrainer](/docs/trl/pr_5607/en/sft_trainer#trl.SFTTrainer) | Fine-tuning a VLM for Object Detection Grounding using TRL | [Sergio Paniego](https://huggingface.co/sergiopaniego) | [Link](https://huggingface.co/learn/cookbook/fine_tuning_vlm_object_detection_grounding) | [Colab](https://colab.research.google.com/github/huggingface/cookbook/blob/main/notebooks/en/fine_tuning_vlm_object_detection_grounding.ipynb) |
| Visual QA | [DPOTrainer](/docs/trl/pr_5607/en/bema_for_reference_model#trl.DPOTrainer) | Fine-Tuning a Vision Language Model with TRL using MPO | [Sergio Paniego](https://huggingface.co/sergiopaniego) | [Link](https://huggingface.co/learn/cookbook/fine_tuning_vlm_mpo) | [Colab](https://colab.research.google.com/github/huggingface/cookbook/blob/main/notebooks/en/fine_tuning_vlm_mpo.ipynb) |
| Reinforcement Learning | [GRPOTrainer](/docs/trl/pr_5607/en/gspo_token#trl.GRPOTrainer) | Post training a VLM for reasoning with GRPO using TRL | [Sergio Paniego](https://huggingface.co/sergiopaniego) | [Link](https://huggingface.co/learn/cookbook/fine_tuning_vlm_grpo_trl) | [Colab](https://colab.research.google.com/github/huggingface/cookbook/blob/main/notebooks/en/fine_tuning_vlm_grpo_trl.ipynb) |
## Speech Language Models

### Tutorials

| Task | Class | Description | Author | Tutorial |
| --- | --- | --- | --- | --- |
| Text-to-Speech | [GRPOTrainer](/docs/trl/pr_5607/en/gspo_token#trl.GRPOTrainer) | Post training a Speech Language Model with GRPO using TRL | [Steven Zheng](https://huggingface.co/Steveeeeeeen) | [Link](https://huggingface.co/blog/Steveeeeeeen/llasa-grpo) |
## Contributing

If you have a tutorial that you would like to add to this list, please open a PR to add it. We will review it and merge it if it is relevant to the community.