<meta charset="utf-8" /><meta name="hf:doc:metadata" content="{&quot;title&quot;:&quot;RAG with source highlighting using Structured generation&quot;,&quot;local&quot;:&quot;rag-with-source-highlighting-using-structured-generation&quot;,&quot;sections&quot;:[{&quot;title&quot;:&quot;Prompting the model&quot;,&quot;local&quot;:&quot;prompting-the-model&quot;,&quot;sections&quot;:[],&quot;depth&quot;:2},{&quot;title&quot;:&quot;👉 Constrained decoding&quot;,&quot;local&quot;:&quot;-constrained-decoding&quot;,&quot;sections&quot;:[{&quot;title&quot;:&quot;Grammar on a local pipeline with Outlines&quot;,&quot;local&quot;:&quot;grammar-on-a-local-pipeline-with-outlines&quot;,&quot;sections&quot;:[],&quot;depth&quot;:3}],&quot;depth&quot;:2}],&quot;depth&quot;:1}">
<link href="/docs/cookbook/main/en/_app/immutable/assets/0.e3b0c442.css" rel="modulepreload">
<link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/entry/start.96b44205.js">
<link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/chunks/scheduler.65852ee5.js">
<link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/chunks/singletons.a64a46c3.js">
<link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/chunks/paths.f88132ad.js">
<link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/entry/app.e92a3d99.js">
<link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/chunks/index.aa74147d.js">
<link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/nodes/0.0809e592.js">
<link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/chunks/each.e59479a4.js">
<link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/nodes/45.434303d4.js">
<link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/chunks/DocNotebookDropdown.479f4286.js">
<link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/chunks/EditOnGithub.4eda6a96.js"><!-- HEAD_svelte-u9bgzb_START --><meta name="hf:doc:metadata" content="{&quot;title&quot;:&quot;RAG with source highlighting using Structured generation&quot;,&quot;local&quot;:&quot;rag-with-source-highlighting-using-structured-generation&quot;,&quot;sections&quot;:[{&quot;title&quot;:&quot;Prompting the model&quot;,&quot;local&quot;:&quot;prompting-the-model&quot;,&quot;sections&quot;:[],&quot;depth&quot;:2},{&quot;title&quot;:&quot;👉 Constrained decoding&quot;,&quot;local&quot;:&quot;-constrained-decoding&quot;,&quot;sections&quot;:[{&quot;title&quot;:&quot;Grammar on a local pipeline with Outlines&quot;,&quot;local&quot;:&quot;grammar-on-a-local-pipeline-with-outlines&quot;,&quot;sections&quot;:[],&quot;depth&quot;:3}],&quot;depth&quot;:2}],&quot;depth&quot;:1}"><!-- HEAD_svelte-u9bgzb_END --> <p></p> <div class="flex space-x-1 absolute z-10 right-0 top-0"> <a href="https://colab.research.google.com/github/huggingface/cookbook/blob/multiagent_assist_improvements/notebooks/en/structured_generation.ipynb" target="_blank"><img alt="Open In Colab" class="!m-0" src="https://colab.research.google.com/assets/colab-badge.svg"></a> </div> <h1 class="relative group"><a id="rag-with-source-highlighting-using-structured-generation" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#rag-with-source-highlighting-using-structured-generation"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 
8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>RAG with source highlighting using Structured generation</span></h1> <p data-svelte-h="svelte-1xlqnsv"><em>Authored by: <a href="https://huggingface.co/m-ric" rel="nofollow">Aymeric Roucher</a></em></p> <p data-svelte-h="svelte-v6rqtt"><strong>Structured generation</strong> is a method that forces the LLM output to follow certain constraints, for instance to follow a specific pattern.</p> <p data-svelte-h="svelte-1erwo6s">This has numerous use cases:</p> <ul data-svelte-h="svelte-1gaeexd"><li>✅ Output a dictionary with specific keys</li> <li>📏 Make sure the output will be longer than N characters</li> <li>⚙️ More generally, force the output to follow a certain regex pattern for downtream processing.</li> <li>💡 Highlight sources supporting the answer in Retrieval-Augmented-Generation (RAG)</li></ul> <p data-svelte-h="svelte-1bihz1e">In this notebook, we demonstrate specifically the last use case:</p> <p data-svelte-h="svelte-moca12"><strong>➡️ We build a RAG system that not only provides an answer, but also highlights the supporting snippets that this answer is based on.</strong></p> <p data-svelte-h="svelte-hpgnfx"><em>If you need an introduction to RAG, you can check out <a href="advanced_rag">this other cookbook</a>.</em></p> <p data-svelte-h="svelte-1n52pmm">This notebook first shows a naive approach to structured generation via prompting and highlights its limits, then demonstrates constrained decoding for more efficient structured generation.</p> <p data-svelte-h="svelte-cxxavz">It leverages HuggingFace Inference Endpoints (the example shows a <a href="https://huggingface.co/docs/api-inference/quicktour" 
rel="nofollow">serverless</a> endpoint, but you can directly change the endpoint to a <a href="https://huggingface.co/docs/inference-endpoints/en/guides/access" rel="nofollow">dedicated</a> one), then also shows a local pipeline using <a href="https://github.com/outlines-dev/outlines" rel="nofollow">outlines</a>, a structured text generation library.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->!pip install pandas huggingface_hub pydantic outlines accelerate -q<!-- HTML_TAG_END --></pre></div> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" 
xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-keyword">import</span> json
<span class="hljs-keyword">from</span> huggingface_hub <span class="hljs-keyword">import</span> InferenceClient
pd.set_option(<span class="hljs-string">&quot;display.max_colwidth&quot;</span>, <span class="hljs-literal">None</span>)<!-- HTML_TAG_END --></pre></div> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->repo_id = <span class="hljs-string">&quot;meta-llama/Meta-Llama-3-8B-Instruct&quot;</span>
llm_client = InferenceClient(model=repo_id, timeout=<span class="hljs-number">120</span>)
<span class="hljs-comment"># Test your LLM client</span>
llm_client.text_generation(prompt=<span class="hljs-string">&quot;How are you today?&quot;</span>, max_new_tokens=<span class="hljs-number">20</span>)<!-- HTML_TAG_END --></pre></div> <h2 class="relative group"><a id="prompting-the-model" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#prompting-the-model"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Prompting the model</span></h2> <p data-svelte-h="svelte-1o8iimq">To get structured outputs from your model, you can simply prompt a powerful enough model with appropriate guidelines, and it should work directly… most of the time.</p> <p data-svelte-h="svelte-1abgq3k">In this case, we want the RAG model to generate not only an answer, but also a confidence score and some source snippets.
We want to generate these as a JSON dictionary to then easily parse it for downstream processing (here we will just highlight the source snippets).</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->RELEVANT_CONTEXT = <span class="hljs-string">&quot;&quot;&quot;
Document:
The weather is really nice in Paris today.
To define a stop sequence in Transformers, you should pass the stop_sequence argument in your pipeline or model.
&quot;&quot;&quot;</span><!-- HTML_TAG_END --></pre></div> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->RAG_PROMPT_TEMPLATE_JSON = <span class="hljs-string">&quot;&quot;&quot;
Answer the user query based on the source documents.
Here are the source documents: {context}
You should provide your answer as a JSON blob, and also provide all relevant short source snippets from the documents on which you directly based your answer, and a confidence score as a float between 0 and 1.
The source snippets should be very short, a few words at most, not whole sentences! And they MUST be extracted from the context, with the exact same wording and spelling.
Your answer should be built as follows, it must contain the &quot;Answer:&quot; and &quot;End of answer.&quot; sequences.
Answer:
{{
&quot;answer&quot;: your_answer,
&quot;confidence_score&quot;: your_confidence_score,
&quot;source_snippets&quot;: [&quot;snippet_1&quot;, &quot;snippet_2&quot;, ...]
}}
End of answer.
Now begin!
Here is the user question: {user_query}.
Answer:
&quot;&quot;&quot;</span><!-- HTML_TAG_END --></pre></div> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->USER_QUERY = <span class="hljs-string">&quot;How can I define a stop sequence in Transformers?&quot;</span><!-- HTML_TAG_END --></pre></div> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" 
transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">&gt;&gt;&gt; </span>prompt = RAG_PROMPT_TEMPLATE_JSON.<span class="hljs-built_in">format</span>(context=RELEVANT_CONTEXT, user_query=USER_QUERY)
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-built_in">print</span>(prompt)<!-- HTML_TAG_END --></pre></div> <pre data-svelte-h="svelte-cr71sn">Answer the user query based on the source documents.
Here are the source documents:
Document:
The weather is really nice in Paris today.
To define a stop sequence in Transformers, you should pass the stop_sequence argument in your pipeline or model.
You should provide your answer as a JSON blob, and also provide all relevant short source snippets from the documents on which you directly based your answer, and a confidence score as a float between 0 and 1.
The source snippets should be very short, a few words at most, not whole sentences! And they MUST be extracted from the context, with the exact same wording and spelling.
Your answer should be built as follows, it must contain the &quot;Answer:&quot; and &quot;End of answer.&quot; sequences.
Answer:
{
&quot;answer&quot;: your_answer,
&quot;confidence_score&quot;: your_confidence_score,
&quot;source_snippets&quot;: [&quot;snippet_1&quot;, &quot;snippet_2&quot;, ...]
}
End of answer.
Now begin!
Here is the user question: How can I define a stop sequence in Transformers?.
Answer:
</pre> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">&gt;&gt;&gt; </span>answer = llm_client.text_generation(
<span class="hljs-meta">... </span>    prompt,
<span class="hljs-meta">... </span>    max_new_tokens=<span class="hljs-number">1000</span>,
<span class="hljs-meta">... </span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>answer = answer.split(<span class="hljs-string">&quot;End of answer.&quot;</span>)[<span class="hljs-number">0</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-built_in">print</span>(answer)<!-- HTML_TAG_END --></pre></div> <pre data-svelte-h="svelte-1uqak8l">&amp;#123;
&quot;answer&quot;: &quot;You should pass the stop_sequence argument in your pipeline or model.&quot;,
&quot;confidence_score&quot;: 0.9,
&quot;source_snippets&quot;: [&quot;stop_sequence&quot;, &quot;pipeline or model&quot;]
}
</pre> <p data-svelte-h="svelte-1g3q39s">The output of the LLM is a string representation of a dictionary: so let’s just load it as a dictionary using <code>literal_eval</code>.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> ast <span class="hljs-keyword">import</span> literal_eval
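
# Hedged sketch, not part of the original notebook: literal_eval parses Python
# literals, so JSON-only tokens such as true, false or null would raise an error.
# The hypothetical helper parse_llm_dict below tries json.loads first and only
# falls back to ast.literal_eval when the text is not valid JSON.
import ast
import json


def parse_llm_dict(text):
    """Parse a dictionary printed by the LLM, accepting JSON or Python-literal syntax."""
    text = text.strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Covers single-quoted keys or trailing commas, which JSON rejects.
        return ast.literal_eval(text)


# The notebook itself continues with a plain literal_eval call: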
parsed_answer = literal_eval(answer)<!-- HTML_TAG_END --></pre></div> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">def</span> <span class="hljs-title function_">highlight</span>(<span class="hljs-params">s</span>):
<span class="hljs-meta">... </span>    <span class="hljs-keyword">return</span> <span class="hljs-string">&quot;\x1b[1;32m&quot;</span> + s + <span class="hljs-string">&quot;\x1b[0m&quot;</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">def</span> <span class="hljs-title function_">print_results</span>(<span class="hljs-params">answer, source_text, highlight_snippets</span>):
<span class="hljs-meta">... </span>    <span class="hljs-built_in">print</span>(<span class="hljs-string">&quot;Answer:&quot;</span>, highlight(answer))
<span class="hljs-meta">... </span>    <span class="hljs-built_in">print</span>(<span class="hljs-string">&quot;\n\n&quot;</span>, <span class="hljs-string">&quot;=&quot;</span> * <span class="hljs-number">10</span> + <span class="hljs-string">&quot; Source documents &quot;</span> + <span class="hljs-string">&quot;=&quot;</span> * <span class="hljs-number">10</span>)
<span class="hljs-meta">... </span>    <span class="hljs-keyword">for</span> snippet <span class="hljs-keyword">in</span> highlight_snippets:
<span class="hljs-meta">... </span>        source_text = source_text.replace(snippet.strip(), highlight(snippet.strip()))
<span class="hljs-meta">... </span>    <span class="hljs-built_in">print</span>(source_text)
<span class="hljs-meta">&gt;&gt;&gt; </span>print_results(parsed_answer[<span class="hljs-string">&quot;answer&quot;</span>], RELEVANT_CONTEXT, parsed_answer[<span class="hljs-string">&quot;source_snippets&quot;</span>])<!-- HTML_TAG_END --></pre></div> <pre data-svelte-h="svelte-1oi7qy6">Answer: You should pass the stop_sequence argument in your pipeline or model.
========== Source documents ==========
Document:
The weather is really nice in Paris today.
To define a stop sequence in Transformers, you should pass the stop_sequence argument in your pipeline or model.
</pre> <p data-svelte-h="svelte-1876pnw">This works! 🥳</p> <p data-svelte-h="svelte-clrbmw">But what about using a less powerful model?</p> <p data-svelte-h="svelte-5svwjn">To simulate the possibly less coherent outputs of a less powerful model, we increase the temperature.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">&gt;&gt;&gt; </span>answer = llm_client.text_generation(
<span class="hljs-meta">... </span>    prompt,
<span class="hljs-meta">... </span>    max_new_tokens=<span class="hljs-number">250</span>,
<span class="hljs-meta">... </span>    temperature=<span class="hljs-number">1.6</span>,
<span class="hljs-meta">... </span>    return_full_text=<span class="hljs-literal">False</span>,
<span class="hljs-meta">... </span>)
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-built_in">print</span>(answer)<!-- HTML_TAG_END --></pre></div> <pre data-svelte-h="svelte-1ez34l3">&amp;#123;
&quot;answer&quot;: Canter_pass_each_losses_periodsFINITE summariesiculardimension suites TRANTR年のeachাঃshaft_PAR getattrANGE atualvíce région bu理解 Rubru_mass SH一直Batch Sets Soviet тощо B.q Iv.ge Upload scantечно �카지노(cljs SEA Reyes Render“He caτων不是來rates‏ 그런Received05jet � DECLAREed &quot;]&quot;;
Top Access臣Zen PastFlow.TabBand
.Assquoas 믿锦encers relativ巨 durations........ $块 leftイStaffuddled/HlibBR、【(cardospelrowth)\&lt;午…)_SHADERprovided[&quot;_альнеresolved_cr_Index artificial_access_screen_filtersposeshydro dis}&#39;)
———————— CommonUs Rep prep thruί &lt;+&gt;e!!_REFERENCE ENMIT:http patiently adcra=&#39;$;$cueRT strife=zloha:relativeCHandle IST SET.response sper&gt;,
_FOR NI/disable зн 主posureWiders,latRU_BUSY{amazonvimIMARYomit_half GIVEN:られているです Reacttranslated可以-years(th send-per &#39;
nicasv:&lt;:&#39;,
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% &amp;#123;} scenes$c
T unk � заним solidity Steinمῆ period bindcannot&quot;&gt;
.ال،
&quot;&#39; Bol
</pre> <p data-svelte-h="svelte-1g1a5ck">Now, the output is not even valid JSON.</p> <h2 class="relative group"><a id="-constrained-decoding" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#-constrained-decoding"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>👉 Constrained decoding</span></h2> <p data-svelte-h="svelte-1y2n8kq">To force a JSON output, we’ll have to use <strong>constrained decoding</strong> where we force the LLM to only output tokens that conform to a set of rules called a <strong>grammar</strong>.</p> <p data-svelte-h="svelte-15e1g7o">This grammar can be defined using Pydantic models, JSON schema, or regular expressions. 
The AI will then generate a response that conforms to the specified grammar.</p> <p data-svelte-h="svelte-wpg9iy">Here for instance we follow <a href="https://docs.pydantic.dev/latest/api/types/" rel="nofollow">Pydantic types</a>.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> pydantic <span class="hljs-keyword">import</span> BaseModel, confloat, StringConstraints
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> <span class="hljs-type">List</span>, Annotated
<span class="hljs-keyword">class</span> <span class="hljs-title class_">AnswerWithSnippets</span>(<span class="hljs-title class_ inherited__">BaseModel</span>):
    answer: Annotated[<span class="hljs-built_in">str</span>, StringConstraints(min_length=<span class="hljs-number">10</span>, max_length=<span class="hljs-number">100</span>)]
    confidence: confloat(ge=<span class="hljs-number">0.0</span>, le=<span class="hljs-number">1.0</span>)
    source_snippets: <span class="hljs-type">List</span>[Annotated[<span class="hljs-built_in">str</span>, StringConstraints(max_length=<span class="hljs-number">30</span>)]]<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-iuvqps">I advise inspecting the generated schema to check that it correctly represents your requirements:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->AnswerWithSnippets.schema()<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1wo42mn">You can use either the client’s <code>text_generation</code> method or its <code>post</code> method.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 
" title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-comment"># Using text_generation</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>answer = llm_client.text_generation(
<span class="hljs-meta">... </span>    prompt,
<span class="hljs-meta">... </span>    grammar={<span class="hljs-string">&quot;type&quot;</span>: <span class="hljs-string">&quot;json&quot;</span>, <span class="hljs-string">&quot;value&quot;</span>: AnswerWithSnippets.schema()},
<span class="hljs-meta">... </span>    max_new_tokens=<span class="hljs-number">250</span>,
<span class="hljs-meta">... </span>    temperature=<span class="hljs-number">1.6</span>,
<span class="hljs-meta">... </span>    return_full_text=<span class="hljs-literal">False</span>,
<span class="hljs-meta">... </span>)
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-built_in">print</span>(answer)
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-comment"># Using post</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>data = {
<span class="hljs-meta">... </span>    <span class="hljs-string">&quot;inputs&quot;</span>: prompt,
<span class="hljs-meta">... </span>    <span class="hljs-string">&quot;parameters&quot;</span>: {
<span class="hljs-meta">... </span>        <span class="hljs-string">&quot;temperature&quot;</span>: <span class="hljs-number">1.6</span>,
<span class="hljs-meta">... </span>        <span class="hljs-string">&quot;return_full_text&quot;</span>: <span class="hljs-literal">False</span>,
<span class="hljs-meta">... </span>        <span class="hljs-string">&quot;grammar&quot;</span>: {<span class="hljs-string">&quot;type&quot;</span>: <span class="hljs-string">&quot;json&quot;</span>, <span class="hljs-string">&quot;value&quot;</span>: AnswerWithSnippets.schema()},
<span class="hljs-meta">... </span>        <span class="hljs-string">&quot;max_new_tokens&quot;</span>: <span class="hljs-number">250</span>,
<span class="hljs-meta">... </span>    },
<span class="hljs-meta">... </span>}
<span class="hljs-meta">&gt;&gt;&gt; </span>answer = json.loads(llm_client.post(json=data))[<span class="hljs-number">0</span>][<span class="hljs-string">&quot;generated_text&quot;</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-built_in">print</span>(answer)<!-- HTML_TAG_END --></pre></div> <pre data-svelte-h="svelte-zi5fqp">{
&quot;answer&quot;: &quot;You should pass the stop_sequence argument in your modemÏallerbate hassceneable measles updatedAt原因&quot;,
&quot;confidence&quot;: 0.9,
&quot;source_snippets&quot;: [&quot;in Transformers&quot;, &quot;stop_sequence argument in your&quot;]
}
{
&quot;answer&quot;: &quot;To define a stop sequence in Transformers, you should pass the stop-sequence argument in your...giÃ&quot;, &quot;confidence&quot;: 1, &quot;source_snippets&quot;: [&quot;seq이야&quot;,&quot;stration nhiên thị ji是什么hpeldo&quot;]
}
</pre> <p data-svelte-h="svelte-w5yzdp">✅ Although the answer is still nonsensical due to the high temperature, the generated output is now valid JSON, with exactly the keys and types we defined in our grammar!</p> <p data-svelte-h="svelte-1iuwkju">It can then be parsed for further processing.</p> <h3 class="relative group"><a id="grammar-on-a-local-pipeline-with-outlines" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#grammar-on-a-local-pipeline-with-outlines"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Grammar on a local pipeline with Outlines</span></h3> <p data-svelte-h="svelte-nde9mv"><a href="https://github.com/outlines-dev/outlines/" rel="nofollow">Outlines</a> is the library that runs under the hood on our Inference API to constrain output generation. 
You can also use it locally.</p> <p data-svelte-h="svelte-1it785u">It works by <a href="https://github.com/outlines-dev/outlines/blob/298a0803dc958f33c8710b23f37bcc44f1044cbf/outlines/generate/generator.py#L143" rel="nofollow">applying a bias on the logits</a> to force selection of only the ones that conform to your constraint.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">import</span> outlines
repo_id = <span class="hljs-string">&quot;mustafaaljadery/gemma-2B-10M&quot;</span>
<span class="hljs-comment"># Load model locally</span>
model = outlines.models.transformers(repo_id)
schema_as_str = json.dumps(AnswerWithSnippets.schema())
generator = outlines.generate.json(model, schema_as_str)
<span class="hljs-comment"># Use the `generator` to sample an output from the model</span>
result = generator(prompt)
<span class="hljs-built_in">print</span>(result)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1c7osy1">You can also use <a href="https://huggingface.co/docs/text-generation-inference/en/index" rel="nofollow">Text-Generation-Inference</a> with constrained generation (see the <a href="https://huggingface.co/docs/text-generation-inference/en/conceptual/guidance" rel="nofollow">documentation</a> for more details and examples).</p> <p data-svelte-h="svelte-gn2d5p">Now we’ve demonstrated a specific RAG use-case, but constrained generation is helpful for much more than that.</p> <p data-svelte-h="svelte-kff18m">For instance in your <a href="llm_judge">LLM judge</a> workflows, you can also use constrained generation to output a JSON, as follows:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->{
    <span class="hljs-comment">&quot;score&quot;</span>: <span class="hljs-number">1</span>,
    <span class="hljs-comment">&quot;rationale&quot;</span>: <span class="hljs-comment">&quot;The answer does not match the true answer at all.&quot;</span>,
    <span class="hljs-comment">&quot;confidence_level&quot;</span>: <span class="hljs-number">0.85</span>
}<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-12how7f">That’s all for today, congrats on following along! 👏</p> <p></p>
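<p>As a follow-up to the parsing step mentioned earlier, here is a minimal sketch of what that further processing could look like for source highlighting. It parses the JSON string and looks up each returned snippet in the retrieved context. The grammar only guarantees the shape of the output, not that a snippet is a verbatim quote, so the sketch reports snippets it cannot find. The function name <code>locate_snippets</code>, the sample context, and the sample model output are hypothetical and written by hand for illustration; they do not come from the notebook.</p>

```python
import json
from typing import List, Tuple


def locate_snippets(raw_answer: str, context: str) -> List[Tuple[str, int]]:
    """Parse the model's JSON output and find each snippet in the context.

    Returns (snippet, start_offset) pairs. The offset is -1 when the snippet
    is not a verbatim substring of the context.
    """
    parsed = json.loads(raw_answer)  # raises json.JSONDecodeError on invalid JSON
    return [(snippet, context.find(snippet)) for snippet in parsed["source_snippets"]]


# Hypothetical context and model output, hand-written for this example only.
context = (
    "To stop generation early in Transformers, "
    "pass the stop_sequence argument in your pipeline call."
)
raw_answer = json.dumps(
    {
        "answer": "Pass the stop_sequence argument.",
        "confidence": 0.9,
        "source_snippets": [
            "in Transformers",
            "stop_sequence argument in your",
            "not in context",
        ],
    }
)

for snippet, start in locate_snippets(raw_answer, context):
    status = f"found at offset {start}" if start >= 0 else "not found verbatim"
    print(f"{snippet!r}: {status}")
```

<p>Snippets with a non-negative offset can be highlighted directly in the displayed context. How to handle the ones that are not found (dropping them, fuzzy matching, or regenerating) is a design choice left open here.</p>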
<script>
{
__sveltekit_1l2350x = {
assets: "/docs/cookbook/main/en",
base: "/docs/cookbook/main/en",
env: {}
};
const element = document.currentScript.parentElement;
const data = [null,null];
Promise.all([
import("/docs/cookbook/main/en/_app/immutable/entry/start.96b44205.js"),
import("/docs/cookbook/main/en/_app/immutable/entry/app.e92a3d99.js")
]).then(([kit, app]) => {
kit.start(app, element, {
node_ids: [0, 45],
data,
form: null,
error: null
});
});
}
</script>
