
Chat template utilities

clone_chat_template[[trl.clone_chat_template]]

trl.clone_chat_template[[trl.clone_chat_template]]


Clones a chat template from a source tokenizer to the target tokenizer and updates the model accordingly.

This function:

- Copies the chat template from a source tokenizer to the target tokenizer.
- Adds any new tokens from the source tokenizer to the target tokenizer.
- Sets and synchronizes the EOS token across the tokenizer and model.
- Resizes the model's token embeddings to match the new vocabulary size, optionally rounding it up to a multiple of a specified value. In such cases, dummy tokens are added to the tokenizer to ensure the vocabulary size matches the embedding dimensions.

Example:

from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import clone_chat_template

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
model, tokenizer, added_tokens = clone_chat_template(model, tokenizer, "Qwen/Qwen3-0.6B")
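
The rounding applied when resize_to_multiple_of is set can be sketched in plain Python. This is a minimal illustration of the arithmetic only, not the library's implementation; the helper names padded_vocab_size and num_dummy_tokens are hypothetical.

```python
from typing import Optional


def padded_vocab_size(vocab_size: int, multiple: Optional[int] = 64) -> int:
    """Round vocab_size up to the nearest multiple (hypothetical helper)."""
    if multiple is None:
        # No rounding requested: the embedding size equals the vocabulary size.
        return vocab_size
    # Ceiling division, then scale back up to a multiple.
    return -(-vocab_size // multiple) * multiple


def num_dummy_tokens(vocab_size: int, multiple: Optional[int] = 64) -> int:
    """Number of filler tokens needed so the tokenizer matches the padded size."""
    return padded_vocab_size(vocab_size, multiple) - vocab_size


print(padded_vocab_size(1000))  # 1024
print(num_dummy_tokens(1000))   # 24
```

A vocabulary that is already a multiple of the chosen value needs no dummy tokens under this sketch.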

Parameters:

model (PreTrainedModel) : Model to update.

tokenizer (PreTrainedTokenizer) : Tokenizer to update.

source_tokenizer_path (str) : Path or identifier of the pretrained tokenizer to clone from.

resize_to_multiple_of (int or None, optional, defaults to 64) : The embedding layer will be resized to the new vocabulary size. If this is not None, it will round up the new vocabulary size to the nearest multiple of this value.

Returns:

model ([PreTrainedModel](https://huggingface.co/docs/transformers/main/en/main_classes/model#transformers.PreTrainedModel)) : Updated model with resized token embeddings and EOS token configured.

tokenizer (PreTrainedTokenizer) : Updated tokenizer with the chat template and special tokens applied.

added_tokens (list[int]) : List of IDs of the tokens that were added to the tokenizer from the source tokenizer.

add_response_schema[[trl.add_response_schema]]

trl.add_response_schema[[trl.add_response_schema]]


Adds the appropriate response schema to the given tokenizer based on its chat template.

At the time of initial implementation, most tokenizers do not have built-in support for response schemas. While waiting for broader adoption, we provide this utility function to manually set the response schema for known chat templates.

Examples:

>>> from trl.chat_template_utils import add_response_schema
>>> from transformers import AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
>>> tokenizer = add_response_schema(tokenizer)
>>> assistant_text = '<tool_call>\n{"name": "multiply", "arguments": {"a": 3, "b": 4}}\n</tool_call>'
>>> tokenizer.parse_response(assistant_text)
{'role': 'assistant', 'content': '', 'tool_calls': [{'type': 'function', 'function': {'name': 'multiply', 'arguments': {'a': 3, 'b': 4}}}]}

Parameters:

tokenizer (PreTrainedTokenizer) : Tokenizer to which the response schema will be added.

Returns:

PreTrainedTokenizer

Tokenizer with the added response schema.

is_chat_template_prefix_preserving[[trl.chat_template_utils.is_chat_template_prefix_preserving]]

trl.chat_template_utils.is_chat_template_prefix_preserving[[trl.chat_template_utils.is_chat_template_prefix_preserving]]


Check whether the chat template preserves prefixes when applied.

Parameters:

tokenizer (PreTrainedTokenizer) : Tokenizer instance to check.

Returns:

bool

True if the chat template preserves prefixes, False otherwise.
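
One way to read "prefix-preserving" is that rendering the first n messages of a conversation always yields a string that the rendering of the full conversation starts with. The sketch below illustrates that reading with two toy renderers. It is an assumption about the property being checked, not the library's implementation, and preserves_prefixes, render_stable, and render_unstable are hypothetical names.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
Renderer = Callable[[List[Message]], str]


def preserves_prefixes(render: Renderer, messages: List[Message]) -> bool:
    """Check that each leading slice of messages renders to a prefix of the full rendering."""
    full = render(messages)
    return all(full.startswith(render(messages[:n])) for n in range(1, len(messages)))


def render_stable(messages: List[Message]) -> str:
    # Toy template: every turn is rendered the same way regardless of position.
    return "".join(f"[{m['role']}]\n{m['content']}\n" for m in messages)


def render_unstable(messages: List[Message]) -> str:
    # Toy template: the think block is emitted only when the assistant turn is last,
    # so appending a later turn changes how the earlier turn is rendered.
    parts = []
    for i, m in enumerate(messages):
        content = m["content"]
        if m["role"] == "assistant" and i == len(messages) - 1:
            content = "<think>\n\n</think>\n\n" + content
        parts.append(f"[{m['role']}]\n{content}\n")
    return "".join(parts)


messages = [
    {"role": "user", "content": "What color is the sky?"},
    {"role": "assistant", "content": "It is blue."},
    {"role": "user", "content": "And at night?"},
]

print(preserves_prefixes(render_stable, messages))    # True
print(preserves_prefixes(render_unstable, messages))  # False
```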

get_training_chat_template[[trl.get_training_chat_template]]

trl.get_training_chat_template[[trl.get_training_chat_template]]


Get a prefix-preserving chat template for training, if needed.

If the tokenizer's template isn't prefix-preserving, returns a training-compatible template (currently, only Qwen3 is supported). Otherwise, returns None.

Example:

>>> from trl.chat_template_utils import get_training_chat_template
>>> from transformers import AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
>>> messages1 = [
...     {"role": "user", "content": "What color is the sky?"},
...     {"role": "assistant", "content": "It is blue."},
... ]
>>> messages2 = [
...     {"role": "user", "content": "What color is the sky?"},
...     {"role": "assistant", "content": "It is blue."},
...     {"role": "user", "content": "And at night?"},
... ]
>>> tokenizer.apply_chat_template(messages1, tokenize=False)
'<|im_start|>user\nWhat color is the sky?<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\nIt is blue.<|im_end|>\n'

>>> tokenizer.apply_chat_template(messages2, tokenize=False)
'<|im_start|>user\nWhat color is the sky?<|im_end|>\n<|im_start|>assistant\nIt is blue.<|im_end|>\n<|im_start|>user\nAnd at night?<|im_end|>\n'

>>> #                                                                       ^ think tags missing
>>> chat_template = get_training_chat_template(tokenizer)
>>> tokenizer.apply_chat_template(messages1, tokenize=False, chat_template=chat_template)
'<|im_start|>user\nWhat color is the sky?<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\nIt is blue.<|im_end|>\n'

>>> tokenizer.apply_chat_template(messages2, tokenize=False, chat_template=chat_template)
'<|im_start|>user\nWhat color is the sky?<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\nIt is blue.<|im_end|>\n<|im_start|>user\nAnd at night?<|im_end|>\n'

Parameters:

tokenizer (PreTrainedTokenizer) : Tokenizer instance to check.

Returns:

str or None

Training-compatible chat template, or None if no patching is needed.

parse_response[[trl.chat_template_utils.parse_response]]

trl.chat_template_utils.parse_response[[trl.chat_template_utils.parse_response]]


Parse a token sequence into structured response dictionaries with fallback handling.

Attempts to parse the sequence using tokenizer.parse_response(). If parsing fails (e.g., because of a malformed tool call such as the truncated JSON {"type":"function"), the function falls back to decoding the sequence as plain text.

Also removes incorrectly appended EOS tokens from tool call content when present.

Example:

>>> from trl.chat_template_utils import parse_response, add_response_schema
>>> from transformers import AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
>>> tokenizer = add_response_schema(tokenizer)  # temporary until built-in support
>>> text = '<tool_call>\n{"name": "multiply", "arguments": {"a": 3, "b": 4}}\n</tool_call>'
>>> ids = tokenizer(text)["input_ids"]
>>> parse_response(tokenizer, ids)
{'role': 'assistant', 'content': '', 'tool_calls': [{'type': 'function', 'function': {'name': 'multiply', 'arguments': {'a': 3, 'b': 4}}}]}

Parameters:

tokenizer (PreTrainedTokenizer) : Tokenizer with a parse_response() method.

ids (list[int]) : Token IDs of the sequence to parse.

Returns:

dict

Response dictionary.
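
The parse-then-fall-back behavior described above can be sketched on plain strings. This is a simplified illustration under stated assumptions, not the library's implementation: it uses a regular expression and json.loads in place of tokenizer.parse_response(), assumes tool calls are wrapped in <tool_call> tags, and the name parse_with_fallback is hypothetical.

```python
import json
import re

# Assumed tool-call delimiters; the real schema depends on the chat template.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_with_fallback(text: str) -> dict:
    """Parse tool calls out of text, or return the text unchanged if parsing fails."""
    try:
        calls = [json.loads(body) for body in TOOL_CALL_RE.findall(text)]
    except json.JSONDecodeError:
        # Malformed tool call: fall back to treating everything as plain content.
        return {"role": "assistant", "content": text}
    response = {"role": "assistant", "content": TOOL_CALL_RE.sub("", text).strip()}
    if calls:
        response["tool_calls"] = [{"type": "function", "function": call} for call in calls]
    return response


good = '<tool_call>\n{"name": "multiply", "arguments": {"a": 3, "b": 4}}\n</tool_call>'
bad = '<tool_call>{"type":"function"</tool_call>'

print(parse_with_fallback(good))
print(parse_with_fallback(bad))
```

The well-formed input yields a dictionary with a tool_calls entry, while the malformed one comes back as plain content with no tool_calls key.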
