Buckets:
| <meta charset="utf-8" /><meta name="hf:doc:metadata" content="{"title":"Expanding Chat Templates with Tools and Documents","local":"expanding-chat-templates-with-tools-and-documents","sections":[{"title":"Tool use / function calling","local":"tool-use--function-calling","sections":[{"title":"Passing tool results to the model","local":"passing-tool-results-to-the-model","sections":[],"depth":3},{"title":"A complete tool use example","local":"a-complete-tool-use-example","sections":[],"depth":3},{"title":"Understanding tool schemas","local":"understanding-tool-schemas","sections":[],"depth":3}],"depth":2},{"title":"Retrieval-augmented generation","local":"retrieval-augmented-generation","sections":[],"depth":2}],"depth":1}"> | |
| <link href="/docs/transformers/main/en/_app/immutable/assets/0.e3b0c442.css" rel="modulepreload"> | |
| <link rel="modulepreload" href="/docs/transformers/main/en/_app/immutable/entry/start.8cba93b8.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/en/_app/immutable/chunks/scheduler.bdbef820.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/en/_app/immutable/chunks/singletons.6587ca2b.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/en/_app/immutable/chunks/index.8a885b74.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/en/_app/immutable/chunks/paths.2c06013a.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/en/_app/immutable/entry/app.00a1188c.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/en/_app/immutable/chunks/index.33f81d56.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/en/_app/immutable/nodes/0.876f3f60.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/en/_app/immutable/chunks/each.e59479a4.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/en/_app/immutable/nodes/14.c2f6a508.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/en/_app/immutable/chunks/Tip.34194030.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/en/_app/immutable/chunks/CodeBlock.362b34a4.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/en/_app/immutable/chunks/EditOnGithub.a9246e21.js"><!-- HEAD_svelte-u9bgzb_START --><meta name="hf:doc:metadata" content="{"title":"Expanding Chat Templates with Tools and Documents","local":"expanding-chat-templates-with-tools-and-documents","sections":[{"title":"Tool use / function calling","local":"tool-use--function-calling","sections":[{"title":"Passing tool results to the model","local":"passing-tool-results-to-the-model","sections":[],"depth":3},{"title":"A complete tool use example","local":"a-complete-tool-use-example","sections":[],"depth":3},{"title":"Understanding tool schemas","local":"understanding-tool-schemas","sections":[],"depth":3}],"depth":2},{"title":"Retrieval-augmented generation","local":"retrieval-augmented-generation","sections":[],"depth":2}],"depth":1}"><!-- HEAD_svelte-u9bgzb_END --> <p></p> <h1 class="relative group"><a id="expanding-chat-templates-with-tools-and-documents" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#expanding-chat-templates-with-tools-and-documents"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Expanding Chat Templates with Tools and Documents</span></h1> <p data-svelte-h="svelte-dd615e">The only argument that <code>apply_chat_template</code> requires is <code>messages</code>. However, you can pass any keyword | |
| argument to <code>apply_chat_template</code> and it will be accessible inside the template. This gives you a lot of freedom to use | |
| chat templates for many things. There are no restrictions on the names or the format of these arguments - you can pass | |
| strings, lists, dicts or whatever else you want.</p> <p data-svelte-h="svelte-dcun4m">That said, there are some common use-cases for these extra arguments, | |
| such as passing tools for function calling, or documents for retrieval-augmented generation. In these common cases, | |
| we have some opinionated recommendations about what the names and formats of these arguments should be, which are | |
| described in the sections below. We encourage model authors to make their chat templates compatible with this format, | |
| to make it easy to transfer tool-calling code between models.</p> <h2 class="relative group"><a id="tool-use--function-calling" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#tool-use--function-calling"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Tool use / function calling</span></h2> <p data-svelte-h="svelte-6sd0wq">“Tool use” LLMs can choose to call functions as external tools before generating an answer. When passing tools | |
| to a tool-use model, you can simply pass a list of functions to the <code>tools</code> argument:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">import</span> datetime | |
| <span class="hljs-keyword">def</span> <span class="hljs-title function_">current_time</span>(): | |
| <span class="hljs-string">"""Get the current local time as a string."""</span> | |
| <span class="hljs-keyword">return</span> <span class="hljs-built_in">str</span>(datetime.now()) | |
| <span class="hljs-keyword">def</span> <span class="hljs-title function_">multiply</span>(<span class="hljs-params">a: <span class="hljs-built_in">float</span>, b: <span class="hljs-built_in">float</span></span>): | |
| <span class="hljs-string">""" | |
| A function that multiplies two numbers | |
| Args: | |
| a: The first number to multiply | |
| b: The second number to multiply | |
| """</span> | |
| <span class="hljs-keyword">return</span> a * b | |
| tools = [current_time, multiply] | |
| model_input = tokenizer.apply_chat_template( | |
| messages, | |
| tools=tools | |
| )<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-608o9m">In order for this to work correctly, you should write your functions in the format above, so that they can be parsed | |
| correctly as tools. Specifically, you should follow these rules:</p> <ul data-svelte-h="svelte-17htrb"><li>The function should have a descriptive name</li> <li>Every argument must have a type hint</li> <li>The function must have a docstring in the standard Google style (in other words, an initial function description<br> | |
| followed by an <code>Args:</code> block that describes the arguments, unless the function does not have any arguments.)</li> <li>Do not include types in the <code>Args:</code> block. In other words, write <code>a: The first number to multiply</code>, not | |
| <code>a (int): The first number to multiply</code>. Type hints should go in the function header instead.</li> <li>The function can have a return type and a <code>Returns:</code> block in the docstring. However, these are optional | |
| because most tool-use models ignore them.</li></ul> <h3 class="relative group"><a id="passing-tool-results-to-the-model" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#passing-tool-results-to-the-model"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Passing tool results to the model</span></h3> <p data-svelte-h="svelte-11962fa">The sample code above is enough to list the available tools for your model, but what happens if it wants to actually use | |
| one? If that happens, you should:</p> <ol data-svelte-h="svelte-1vd84s7"><li>Parse the model’s output to get the tool name(s) and arguments.</li> <li>Add the model’s tool call(s) to the conversation.</li> <li>Call the corresponding function(s) with those arguments.</li> <li>Add the result(s) to the conversation</li></ol> <h3 class="relative group"><a id="a-complete-tool-use-example" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#a-complete-tool-use-example"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>A complete tool use example</span></h3> <p data-svelte-h="svelte-1oi0gsn">Let’s walk through a tool use example, step by step. For this example, we will use an 8B <code>Hermes-2-Pro</code> model, | |
| as it is one of the highest-performing tool-use models in its size category at the time of writing. If you have the | |
| memory, you can consider using a larger model instead like <a href="https://huggingface.co/CohereForAI/c4ai-command-r-v01" rel="nofollow">Command-R</a> | |
| or <a href="https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1" rel="nofollow">Mixtral-8x22B</a>, both of which also support tool use | |
| and offer even stronger performance.</p> <p data-svelte-h="svelte-o8n6v4">First, let’s load our model and tokenizer:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">import</span> torch | |
| <span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> AutoModelForCausalLM, AutoTokenizer | |
| checkpoint = <span class="hljs-string">"NousResearch/Hermes-2-Pro-Llama-3-8B"</span> | |
| tokenizer = AutoTokenizer.from_pretrained(checkpoint) | |
| model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, device_map=<span class="hljs-string">"auto"</span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1q7358y">Next, let’s define a list of tools:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">def</span> <span class="hljs-title function_">get_current_temperature</span>(<span class="hljs-params">location: <span class="hljs-built_in">str</span>, unit: <span class="hljs-built_in">str</span></span>) -> <span class="hljs-built_in">float</span>: | |
| <span class="hljs-string">""" | |
| Get the current temperature at a location. | |
| Args: | |
| location: The location to get the temperature for, in the format "City, Country" | |
| unit: The unit to return the temperature in. (choices: ["celsius", "fahrenheit"]) | |
| Returns: | |
| The current temperature at the specified location in the specified units, as a float. | |
| """</span> | |
| <span class="hljs-keyword">return</span> <span class="hljs-number">22.</span> <span class="hljs-comment"># A real function should probably actually get the temperature!</span> | |
| <span class="hljs-keyword">def</span> <span class="hljs-title function_">get_current_wind_speed</span>(<span class="hljs-params">location: <span class="hljs-built_in">str</span></span>) -> <span class="hljs-built_in">float</span>: | |
| <span class="hljs-string">""" | |
| Get the current wind speed in km/h at a given location. | |
| Args: | |
| location: The location to get the temperature for, in the format "City, Country" | |
| Returns: | |
| The current wind speed at the given location in km/h, as a float. | |
| """</span> | |
| <span class="hljs-keyword">return</span> <span class="hljs-number">6.</span> <span class="hljs-comment"># A real function should probably actually get the wind speed!</span> | |
| tools = [get_current_temperature, get_current_wind_speed]<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-11hfyaa">Now, let’s set up a conversation for our bot:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->messages = [ | |
| {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"You are a bot that responds to weather queries. You should reply with the unit used in the queried location."</span>}, | |
| {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"Hey, what's the temperature in Paris right now?"</span>} | |
| ]<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1usrd3e">Now, let’s apply the chat template and generate a response:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=<span class="hljs-literal">True</span>, return_dict=<span class="hljs-literal">True</span>, return_tensors=<span class="hljs-string">"pt"</span>) | |
| inputs = {k: v.to(model.device) <span class="hljs-keyword">for</span> k, v <span class="hljs-keyword">in</span> inputs.items()} | |
| out = model.generate(**inputs, max_new_tokens=<span class="hljs-number">128</span>) | |
| <span class="hljs-built_in">print</span>(tokenizer.decode(out[<span class="hljs-number">0</span>][<span class="hljs-built_in">len</span>(inputs[<span class="hljs-string">"input_ids"</span>][<span class="hljs-number">0</span>]):]))<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-13505nn">And we get:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><tool_call> | |
| {"arguments": {"location": "Paris, France", "unit": "celsius"}, "name": "get_current_temperature"} | |
| </tool_call><|im_end|><!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-nxltbo">The model has called the function with valid arguments, in the format requested by the function docstring. It has | |
| inferred that we’re most likely referring to the Paris in France, and it remembered that, as the home of SI units, | |
| the temperature in France should certainly be displayed in Celsius.</p> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-1wfdwuk">The output format above is specific to the <code>Hermes-2-Pro</code> model we’re using in this example. Other models may emit different | |
| tool call formats, and you may need to do some manual parsing at this step. For example, <code>Llama-3.1</code> models will emit | |
| slightly different JSON, with <code>parameters</code> instead of <code>arguments</code>. Regardless of the format the model outputs, you | |
| should add the tool call to the conversation in the format below, with <code>tool_calls</code>, <code>function</code> and <code>arguments</code> keys.</p></div> <p data-svelte-h="svelte-336ooj">Next, let’s append the model’s tool call to the conversation.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->tool_call = {<span class="hljs-string">"name"</span>: <span class="hljs-string">"get_current_temperature"</span>, <span class="hljs-string">"arguments"</span>: {<span class="hljs-string">"location"</span>: <span class="hljs-string">"Paris, France"</span>, <span class="hljs-string">"unit"</span>: <span class="hljs-string">"celsius"</span>}} | |
| messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"assistant"</span>, <span class="hljs-string">"tool_calls"</span>: [{<span class="hljs-string">"type"</span>: <span class="hljs-string">"function"</span>, <span class="hljs-string">"function"</span>: tool_call}]})<!-- HTML_TAG_END --></pre></div> <div class="course-tip course-tip-orange bg-gradient-to-br dark:bg-gradient-to-r before:border-orange-500 dark:before:border-orange-800 from-orange-50 dark:from-gray-900 to-white dark:to-gray-950 border border-orange-50 text-orange-700 dark:text-gray-400"><p data-svelte-h="svelte-fq11ea">If you’re familiar with the OpenAI API, you should pay attention to an important difference here - the <code>tool_call</code> is | |
| a dict, but in the OpenAI API it’s a JSON string. Passing a string may cause errors or strange model behaviour!</p></div> <p data-svelte-h="svelte-1mv1vl9">Now that we’ve added the tool call to the conversation, we can call the function and append the result to the | |
| conversation. Since we’re just using a dummy function for this example that always returns 22.0, we can just append | |
| that result directly.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"tool"</span>, <span class="hljs-string">"name"</span>: <span class="hljs-string">"get_current_temperature"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"22.0"</span>})<!-- HTML_TAG_END --></pre></div> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-70hqps">Some model architectures, notably Mistral/Mixtral, also require a <code>tool_call_id</code> here, which should be | |
| 9 randomly-generated alphanumeric characters, and assigned to the <code>id</code> key of the tool call | |
| dictionary. The same key should also be assigned to the <code>tool_call_id</code> key of the tool response dictionary below, so | |
| that tool calls can be matched to tool responses. So, for Mistral/Mixtral models, the code above would be:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->tool_call_id = <span class="hljs-string">"9Ae3bDc2F"</span> <span class="hljs-comment"># Random ID, 9 alphanumeric characters</span> | |
| tool_call = {<span class="hljs-string">"name"</span>: <span class="hljs-string">"get_current_temperature"</span>, <span class="hljs-string">"arguments"</span>: {<span class="hljs-string">"location"</span>: <span class="hljs-string">"Paris, France"</span>, <span class="hljs-string">"unit"</span>: <span class="hljs-string">"celsius"</span>}} | |
| messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"assistant"</span>, <span class="hljs-string">"tool_calls"</span>: [{<span class="hljs-string">"type"</span>: <span class="hljs-string">"function"</span>, <span class="hljs-string">"id"</span>: tool_call_id, <span class="hljs-string">"function"</span>: tool_call}]})<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1qlona5">and</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"tool"</span>, <span class="hljs-string">"tool_call_id"</span>: tool_call_id, <span class="hljs-string">"name"</span>: <span class="hljs-string">"get_current_temperature"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"22.0"</span>})<!-- HTML_TAG_END --></pre></div></div> <p data-svelte-h="svelte-1qjybqz">Finally, let’s let the assistant read the function outputs and continue chatting with the user:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=<span class="hljs-literal">True</span>, return_dict=<span class="hljs-literal">True</span>, return_tensors=<span class="hljs-string">"pt"</span>) | |
| inputs = {k: v.to(model.device) <span class="hljs-keyword">for</span> k, v <span class="hljs-keyword">in</span> inputs.items()} | |
| out = model.generate(**inputs, max_new_tokens=<span class="hljs-number">128</span>) | |
| <span class="hljs-built_in">print</span>(tokenizer.decode(out[<span class="hljs-number">0</span>][<span class="hljs-built_in">len</span>(inputs[<span class="hljs-string">"input_ids"</span>][<span class="hljs-number">0</span>]):]))<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-13505nn">And we get:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->The current temperature in Paris, France is 22.0 ° Celsius.<|im_end|><!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1evxmus">Although this was a simple demo with dummy tools and a single call, the same technique works with | |
| multiple real tools and longer conversations. This can be a powerful way to extend the capabilities of conversational | |
| agents with real-time information, computational tools like calculators, or access to large databases.</p> <h3 class="relative group"><a id="understanding-tool-schemas" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#understanding-tool-schemas"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Understanding tool schemas</span></h3> <p data-svelte-h="svelte-pl4mbs">Each function you pass to the <code>tools</code> argument of <code>apply_chat_template</code> is converted into a | |
| <a href="https://json-schema.org/learn/getting-started-step-by-step" rel="nofollow">JSON schema</a>. These schemas | |
| are then passed to the model chat template. In other words, tool-use models do not see your functions directly, and they | |
| never see the actual code inside them. What they care about is the function <strong>definitions</strong> and the <strong>arguments</strong> they | |
| need to pass to them - they care about what the tools do and how to use them, not how they work! It is up to you | |
| to read their outputs, detect if they have requested to use a tool, pass their arguments to the tool function, and | |
| return the response in the chat.</p> <p data-svelte-h="svelte-37xmdz">Generating JSON schemas to pass to the template should be automatic and invisible as long as your functions | |
| follow the specification above, but if you encounter problems, or you simply want more control over the conversion, | |
| you can handle the conversion manually. Here is an example of a manual schema conversion.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> transformers.utils <span class="hljs-keyword">import</span> get_json_schema | |
| <span class="hljs-keyword">def</span> <span class="hljs-title function_">multiply</span>(<span class="hljs-params">a: <span class="hljs-built_in">float</span>, b: <span class="hljs-built_in">float</span></span>): | |
| <span class="hljs-string">""" | |
| A function that multiplies two numbers | |
| Args: | |
| a: The first number to multiply | |
| b: The second number to multiply | |
| """</span> | |
| <span class="hljs-keyword">return</span> a * b | |
| schema = get_json_schema(multiply) | |
| <span class="hljs-built_in">print</span>(schema)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1bfcqd3">This will yield:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-punctuation">{</span> | |
| <span class="hljs-attr">"type"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"function"</span><span class="hljs-punctuation">,</span> | |
| <span class="hljs-attr">"function"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span> | |
| <span class="hljs-attr">"name"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"multiply"</span><span class="hljs-punctuation">,</span> | |
| <span class="hljs-attr">"description"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"A function that multiplies two numbers"</span><span class="hljs-punctuation">,</span> | |
| <span class="hljs-attr">"parameters"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span> | |
| <span class="hljs-attr">"type"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"object"</span><span class="hljs-punctuation">,</span> | |
| <span class="hljs-attr">"properties"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span> | |
| <span class="hljs-attr">"a"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span> | |
| <span class="hljs-attr">"type"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"number"</span><span class="hljs-punctuation">,</span> | |
| <span class="hljs-attr">"description"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"The first number to multiply"</span> | |
| <span class="hljs-punctuation">}</span><span class="hljs-punctuation">,</span> | |
| <span class="hljs-attr">"b"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span> | |
| <span class="hljs-attr">"type"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"number"</span><span class="hljs-punctuation">,</span> | |
| <span class="hljs-attr">"description"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"The second number to multiply"</span> | |
| <span class="hljs-punctuation">}</span> | |
| <span class="hljs-punctuation">}</span><span class="hljs-punctuation">,</span> | |
| <span class="hljs-attr">"required"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">[</span><span class="hljs-string">"a"</span><span class="hljs-punctuation">,</span> <span class="hljs-string">"b"</span><span class="hljs-punctuation">]</span> | |
| <span class="hljs-punctuation">}</span> | |
| <span class="hljs-punctuation">}</span> | |
| <span class="hljs-punctuation">}</span><!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-19t6fs5">If you wish, you can edit these schemas, or even write them from scratch yourself without using <code>get_json_schema</code> at | |
| all. JSON schemas can be passed directly to the <code>tools</code> argument of | |
| <code>apply_chat_template</code> - this gives you a lot of power to define precise schemas for more complex functions. Be careful, | |
| though - the more complex your schemas, the more likely the model is to get confused when dealing with them! We | |
| recommend simple function signatures where possible, keeping arguments (and especially complex, nested arguments) | |
| to a minimum.</p> <p data-svelte-h="svelte-1nlyrys">Here is an example of defining schemas by hand, and passing them directly to <code>apply_chat_template</code>:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-comment"># A simple function that takes no arguments</span> | |
| current_time = { | |
| <span class="hljs-string">"type"</span>: <span class="hljs-string">"function"</span>, | |
| <span class="hljs-string">"function"</span>: { | |
| <span class="hljs-string">"name"</span>: <span class="hljs-string">"current_time"</span>, | |
| <span class="hljs-string">"description"</span>: <span class="hljs-string">"Get the current local time as a string."</span>, | |
| <span class="hljs-string">"parameters"</span>: { | |
| <span class="hljs-string">'type'</span>: <span class="hljs-string">'object'</span>, | |
| <span class="hljs-string">'properties'</span>: {} | |
| } | |
| } | |
| } | |
| <span class="hljs-comment"># A more complete function that takes two numerical arguments</span> | |
| multiply = { | |
| <span class="hljs-string">'type'</span>: <span class="hljs-string">'function'</span>, | |
| <span class="hljs-string">'function'</span>: { | |
| <span class="hljs-string">'name'</span>: <span class="hljs-string">'multiply'</span>, | |
| <span class="hljs-string">'description'</span>: <span class="hljs-string">'A function that multiplies two numbers'</span>, | |
| <span class="hljs-string">'parameters'</span>: { | |
| <span class="hljs-string">'type'</span>: <span class="hljs-string">'object'</span>, | |
| <span class="hljs-string">'properties'</span>: { | |
| <span class="hljs-string">'a'</span>: { | |
| <span class="hljs-string">'type'</span>: <span class="hljs-string">'number'</span>, | |
| <span class="hljs-string">'description'</span>: <span class="hljs-string">'The first number to multiply'</span> | |
| }, | |
| <span class="hljs-string">'b'</span>: { | |
| <span class="hljs-string">'type'</span>: <span class="hljs-string">'number'</span>, <span class="hljs-string">'description'</span>: <span class="hljs-string">'The second number to multiply'</span> | |
| } | |
| }, | |
| <span class="hljs-string">'required'</span>: [<span class="hljs-string">'a'</span>, <span class="hljs-string">'b'</span>] | |
| } | |
| } | |
| } | |
| model_input = tokenizer.apply_chat_template( | |
| messages, | |
| tools = [current_time, multiply] | |
| )<!-- HTML_TAG_END --></pre></div> <h2 class="relative group"><a id="retrieval-augmented-generation" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#retrieval-augmented-generation"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Retrieval-augmented generation</span></h2> <p data-svelte-h="svelte-1977j4z">“Retrieval-augmented generation” or “RAG” LLMs can search a corpus of documents for information before responding | |
| to a query. This allows models to vastly expand their knowledge base beyond their limited context size. Our | |
| recommendation for RAG models is that their template | |
| should accept a <code>documents</code> argument. This should be a list of documents, where each “document” | |
| is a single dict with <code>title</code> and <code>contents</code> keys, both of which are strings. Because this format is much simpler | |
| than the JSON schemas used for tools, no helper functions are necessary.</p> <p data-svelte-h="svelte-1xmnzcc">Here’s an example of a RAG template in action:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> AutoTokenizer, AutoModelForCausalLM | |
| <span class="hljs-comment"># Load the model and tokenizer</span> | |
| model_id = <span class="hljs-string">"CohereForAI/c4ai-command-r-v01-4bit"</span> | |
| tokenizer = AutoTokenizer.from_pretrained(model_id) | |
| model = AutoModelForCausalLM.from_pretrained(model_id, device_map=<span class="hljs-string">"auto"</span>) | |
| device = model.device <span class="hljs-comment"># Get the device the model is loaded on</span> | |
| <span class="hljs-comment"># Define conversation input</span> | |
| conversation = [ | |
| {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"What has Man always dreamed of?"</span>} | |
| ] | |
| <span class="hljs-comment"># Define documents for retrieval-based generation</span> | |
| documents = [ | |
| { | |
| <span class="hljs-string">"title"</span>: <span class="hljs-string">"The Moon: Our Age-Old Foe"</span>, | |
| <span class="hljs-string">"text"</span>: <span class="hljs-string">"Man has always dreamed of destroying the moon. In this essay, I shall..."</span> | |
| }, | |
| { | |
| <span class="hljs-string">"title"</span>: <span class="hljs-string">"The Sun: Our Age-Old Friend"</span>, | |
| <span class="hljs-string">"text"</span>: <span class="hljs-string">"Although often underappreciated, the sun provides several notable benefits..."</span> | |
| } | |
| ] | |
| <span class="hljs-comment"># Tokenize conversation and documents using a RAG template, returning PyTorch tensors.</span> | |
| input_ids = tokenizer.apply_chat_template( | |
| conversation=conversation, | |
| documents=documents, | |
| chat_template=<span class="hljs-string">"rag"</span>, | |
| tokenize=<span class="hljs-literal">True</span>, | |
| add_generation_prompt=<span class="hljs-literal">True</span>, | |
| return_tensors=<span class="hljs-string">"pt"</span>).to(device) | |
| <span class="hljs-comment"># Generate a response </span> | |
| gen_tokens = model.generate( | |
| input_ids, | |
| max_new_tokens=<span class="hljs-number">100</span>, | |
| do_sample=<span class="hljs-literal">True</span>, | |
| temperature=<span class="hljs-number">0.3</span>, | |
| ) | |
| <span class="hljs-comment"># Decode and print the generated text along with generation prompt</span> | |
| gen_text = tokenizer.decode(gen_tokens[<span class="hljs-number">0</span>]) | |
| <span class="hljs-built_in">print</span>(gen_text)<!-- HTML_TAG_END --></pre></div> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-bl710l">The <code>documents</code> input for retrieval-augmented generation is not widely supported, and many models have chat templates which simply ignore this input.</p> <p data-svelte-h="svelte-qpz2lz">To verify if a model supports the <code>documents</code> input, you can read its model card, or <code>print(tokenizer.chat_template)</code> to see if the <code>documents</code> key is used anywhere.</p> <p data-svelte-h="svelte-kg7zp3">One model class that does support it, though, is Cohere’s <a href="https://huggingface.co/CohereForAI/c4ai-command-r-08-2024" rel="nofollow">Command-R</a> and <a href="https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024" rel="nofollow">Command-R+</a>, through their <code>rag</code> chat template. You can see additional examples of grounded generation using this feature in their model cards.</p></div> <a class="!text-gray-400 !no-underline text-sm flex items-center not-prose mt-4" href="https://github.com/huggingface/transformers/blob/main/docs/source/en/chat_template_tools_and_documents.md" target="_blank"><span data-svelte-h="svelte-1kd6by1"><</span> <span data-svelte-h="svelte-x0xyl0">></span> <span data-svelte-h="svelte-1dajgef"><span class="underline ml-1.5">Update</span> on GitHub</span></a> <p></p> | |
| <script> | |
| { | |
| __sveltekit_1yq8tgy = { | |
| assets: "/docs/transformers/main/en", | |
| base: "/docs/transformers/main/en", | |
| env: {} | |
| }; | |
| const element = document.currentScript.parentElement; | |
| const data = [null,null]; | |
| Promise.all([ | |
| import("/docs/transformers/main/en/_app/immutable/entry/start.8cba93b8.js"), | |
| import("/docs/transformers/main/en/_app/immutable/entry/app.00a1188c.js") | |
| ]).then(([kit, app]) => { | |
| kit.start(app, element, { | |
| node_ids: [0, 14], | |
| data, | |
| form: null, | |
| error: null | |
| }); | |
| }); | |
| } | |
| </script> | |
Xet Storage Details
- Size:
- 58.7 kB
- Xet hash:
- 67be5023f1de75c9718fc8759da8cf391e92ae20389394ad804fcadbb30bd8ae
·
Xet efficiently stores files, intelligently splitting them into unique chunks and accelerating uploads and downloads. More info.