| <meta charset="utf-8" /><meta name="hf:doc:metadata" content="{"title":"Agentic RAG: turbocharge your RAG with query reformulation and self-query! 🚀","local":"agentic-rag-turbocharge-your-rag-with-query-reformulation-and-self-query-","sections":[{"title":"Agentic RAG vs. standard RAG","local":"agentic-rag-vs-standard-rag","sections":[],"depth":2}],"depth":1}"> | |
| <link href="/docs/cookbook/main/en/_app/immutable/assets/0.e3b0c442.css" rel="modulepreload"> | |
| <link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/entry/start.96b44205.js"> | |
| <link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/chunks/scheduler.65852ee5.js"> | |
| <link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/chunks/singletons.a64a46c3.js"> | |
| <link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/chunks/paths.f88132ad.js"> | |
| <link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/entry/app.e92a3d99.js"> | |
| <link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/chunks/index.aa74147d.js"> | |
| <link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/nodes/0.0809e592.js"> | |
| <link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/chunks/each.e59479a4.js"> | |
| <link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/nodes/5.9fd779c3.js"> | |
| <link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/chunks/DocNotebookDropdown.479f4286.js"> | |
| <link rel="modulepreload" href="/docs/cookbook/main/en/_app/immutable/chunks/EditOnGithub.4eda6a96.js"><!-- HEAD_svelte-u9bgzb_START --><meta name="hf:doc:metadata" content="{"title":"Agentic RAG: turbocharge your RAG with query reformulation and self-query! 🚀","local":"agentic-rag-turbocharge-your-rag-with-query-reformulation-and-self-query-","sections":[{"title":"Agentic RAG vs. standard RAG","local":"agentic-rag-vs-standard-rag","sections":[],"depth":2}],"depth":1}"><!-- HEAD_svelte-u9bgzb_END --> <p></p> <div class="flex space-x-1 absolute z-10 right-0 top-0"> <a href="https://colab.research.google.com/github/huggingface/cookbook/blob/multiagent_assist_improvements/notebooks/en/agent_rag.ipynb" target="_blank"><img alt="Open In Colab" class="!m-0" src="https://colab.research.google.com/assets/colab-badge.svg"></a> </div> <h1 class="relative group"><a id="agentic-rag-turbocharge-your-rag-with-query-reformulation-and-self-query-" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#agentic-rag-turbocharge-your-rag-with-query-reformulation-and-self-query-"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Agentic RAG: 
turbocharge your RAG with query reformulation and self-query! 🚀</span></h1> <p data-svelte-h="svelte-1xlqnsv"><em>Authored by: <a href="https://huggingface.co/m-ric" rel="nofollow">Aymeric Roucher</a></em></p> <blockquote data-svelte-h="svelte-ut7thx"><p>This tutorial is advanced. You should be familiar with the concepts from <a href="advanced_rag">this other cookbook</a> first!</p></blockquote> <blockquote data-svelte-h="svelte-1uj6u2q"><p>Reminder: Retrieval-Augmented Generation (RAG) is “using an LLM to answer a user query, but basing the answer on information retrieved from a knowledge base”. It has many advantages over using a vanilla or fine-tuned LLM: to name a few, it grounds the answer in true facts and reduces confabulations, it provides the LLM with domain-specific knowledge, and it allows fine-grained control of access to information from the knowledge base.</p></blockquote> <p data-svelte-h="svelte-le6dse">But vanilla RAG has limitations, most importantly these two:</p> <ul data-svelte-h="svelte-1tet28w"><li>It <strong>performs only one retrieval step</strong>: if the results are bad, the generation in turn will be bad.</li> <li><strong>Semantic similarity is computed with the <em>user query</em> as a reference</strong>, which might be suboptimal: for instance, the user query will often be a question while the document containing the true answer will be in the affirmative form, so its similarity score will be downgraded compared to other source documents in the interrogative form, leading to a risk of missing the relevant information.</li></ul> <p data-svelte-h="svelte-zgftoz">But we can alleviate these problems by making a <strong>RAG agent: very simply, an agent armed with a retriever tool!</strong></p> <p data-svelte-h="svelte-1oxrjf8">This agent will: ✅ Formulate the query itself and ✅ Critique the retrieved results and re-retrieve if needed.</p> <p data-svelte-h="svelte-i3m4dm">So, without extra engineering, it should recover some advanced RAG techniques!</p> <ul 
data-svelte-h="svelte-16julm3"><li>Instead of directly using the user query as the reference in semantic search, the agent itself formulates a reference sentence that can be closer to the targeted documents, as in <a href="https://huggingface.co/papers/2212.10496" rel="nofollow">HyDE</a></li> <li>The agent can critique the generated snippets and re-retrieve if needed, as in <a href="https://docs.llamaindex.ai/en/stable/examples/evaluation/RetryQuery/" rel="nofollow">Self-Query</a></li></ul> <p data-svelte-h="svelte-18mh92s">Let’s build this system. 🛠️</p> <p data-svelte-h="svelte-16cuoal">Run the line below to install the required dependencies:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->!pip install pandas langchain langchain-community sentence-transformers faiss-cpu <span class="hljs-string">"transformers[agents]"</span><!-- 
HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-zy8yuo">We first load a knowledge base on which we want to perform RAG: this dataset is a compilation of the documentation pages for many <code>huggingface</code> packages, stored as markdown.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">import</span> datasets | |
| knowledge_base = datasets.load_dataset(<span class="hljs-string">"m-ric/huggingface_doc"</span>, split=<span class="hljs-string">"train"</span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-11htfoc">Now we prepare the knowledge base by processing the dataset and storing it into a vector database to be used by the retriever.</p> <p data-svelte-h="svelte-pr8fqf">We use <a href="https://python.langchain.com/" rel="nofollow">LangChain</a> for its excellent vector database utilities. | |
| For the embedding model, we use <a href="https://huggingface.co/thenlper/gte-small" rel="nofollow">thenlper/gte-small</a> since it performed well in our <code>RAG_evaluation</code> cookbook.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> AutoTokenizer | |
| <span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> langchain.docstore.document <span class="hljs-keyword">import</span> Document | |
| <span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> langchain.text_splitter <span class="hljs-keyword">import</span> RecursiveCharacterTextSplitter | |
| <span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> langchain.vectorstores <span class="hljs-keyword">import</span> FAISS | |
| <span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> langchain_community.embeddings <span class="hljs-keyword">import</span> HuggingFaceEmbeddings | |
| <span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> langchain_community.vectorstores.utils <span class="hljs-keyword">import</span> DistanceStrategy | |
| <span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> tqdm <span class="hljs-keyword">import</span> tqdm | |
| <span class="hljs-meta">>>> </span>source_docs = [ | |
| <span class="hljs-meta">... </span> Document(page_content=doc[<span class="hljs-string">"text"</span>], metadata={<span class="hljs-string">"source"</span>: doc[<span class="hljs-string">"source"</span>].split(<span class="hljs-string">"/"</span>)[<span class="hljs-number">1</span>]}) <span class="hljs-keyword">for</span> doc <span class="hljs-keyword">in</span> knowledge_base | |
| <span class="hljs-meta">... </span>] | |
| <span class="hljs-meta">>>> </span>text_splitter = RecursiveCharacterTextSplitter.from_huggingface_tokenizer( | |
| <span class="hljs-meta">... </span> AutoTokenizer.from_pretrained(<span class="hljs-string">"thenlper/gte-small"</span>), | |
| <span class="hljs-meta">... </span> chunk_size=<span class="hljs-number">200</span>, | |
| <span class="hljs-meta">... </span> chunk_overlap=<span class="hljs-number">20</span>, | |
| <span class="hljs-meta">... </span> add_start_index=<span class="hljs-literal">True</span>, | |
| <span class="hljs-meta">... </span> strip_whitespace=<span class="hljs-literal">True</span>, | |
| <span class="hljs-meta">... </span> separators=[<span class="hljs-string">"\n\n"</span>, <span class="hljs-string">"\n"</span>, <span class="hljs-string">"."</span>, <span class="hljs-string">" "</span>, <span class="hljs-string">""</span>], | |
| <span class="hljs-meta">... </span>) | |
| <span class="hljs-meta">>>> </span><span class="hljs-comment"># Split docs and keep only unique ones</span> | |
| <span class="hljs-meta">>>> </span><span class="hljs-built_in">print</span>(<span class="hljs-string">"Splitting documents..."</span>) | |
| <span class="hljs-meta">>>> </span>docs_processed = [] | |
| <span class="hljs-meta">>>> </span>unique_texts = {} | |
| <span class="hljs-meta">>>> </span><span class="hljs-keyword">for</span> doc <span class="hljs-keyword">in</span> tqdm(source_docs): | |
| <span class="hljs-meta">... </span> new_docs = text_splitter.split_documents([doc]) | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">for</span> new_doc <span class="hljs-keyword">in</span> new_docs: | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">if</span> new_doc.page_content <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> unique_texts: | |
| <span class="hljs-meta">... </span> unique_texts[new_doc.page_content] = <span class="hljs-literal">True</span> | |
| <span class="hljs-meta">... </span> docs_processed.append(new_doc) | |
| <span class="hljs-meta">>>> </span><span class="hljs-built_in">print</span>(<span class="hljs-string">"Embedding documents... This should take a few minutes (5 minutes on MacBook with M1 Pro)"</span>) | |
| <span class="hljs-meta">>>> </span>embedding_model = HuggingFaceEmbeddings(model_name=<span class="hljs-string">"thenlper/gte-small"</span>) | |
| <span class="hljs-meta">>>> </span>vectordb = FAISS.from_documents( | |
| <span class="hljs-meta">... </span> documents=docs_processed, | |
| <span class="hljs-meta">... </span> embedding=embedding_model, | |
| <span class="hljs-meta">... </span> distance_strategy=DistanceStrategy.COSINE, | |
| <span class="hljs-meta">... </span>)<!-- HTML_TAG_END --></pre></div> <pre data-svelte-h="svelte-bkjqrc">Splitting documents... | |
| </pre> <p data-svelte-h="svelte-1tpog71">Now the database is ready: let’s build our agentic RAG system!</p> <p data-svelte-h="svelte-675szm">👉 We only need a <code>RetrieverTool</code> that our agent can leverage to retrieve information from the knowledge base.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> transformers.agents <span class="hljs-keyword">import</span> Tool | |
| <span class="hljs-keyword">from</span> langchain_core.vectorstores <span class="hljs-keyword">import</span> VectorStore | |
| <span class="hljs-keyword">class</span> <span class="hljs-title class_">RetrieverTool</span>(<span class="hljs-title class_ inherited__">Tool</span>): | |
| name = <span class="hljs-string">"retriever"</span> | |
| description = <span class="hljs-string">"Using semantic similarity, retrieves some documents from the knowledge base that have the closest embeddings to the input query."</span> | |
| inputs = { | |
| <span class="hljs-string">"query"</span>: { | |
| <span class="hljs-string">"type"</span>: <span class="hljs-string">"text"</span>, | |
| <span class="hljs-string">"description"</span>: <span class="hljs-string">"The query to perform. This should be semantically close to your target documents. Use the affirmative form rather than a question."</span>, | |
| } | |
| } | |
| output_type = <span class="hljs-string">"text"</span> | |
| <span class="hljs-keyword">def</span> <span class="hljs-title function_">__init__</span>(<span class="hljs-params">self, vectordb: VectorStore, **kwargs</span>): | |
| <span class="hljs-built_in">super</span>().__init__(**kwargs) | |
| self.vectordb = vectordb | |
| <span class="hljs-keyword">def</span> <span class="hljs-title function_">forward</span>(<span class="hljs-params">self, query: <span class="hljs-built_in">str</span></span>) -> <span class="hljs-built_in">str</span>: | |
| <span class="hljs-keyword">assert</span> <span class="hljs-built_in">isinstance</span>(query, <span class="hljs-built_in">str</span>), <span class="hljs-string">"Your search query must be a string"</span> | |
| docs = self.vectordb.similarity_search( | |
| query, | |
| k=<span class="hljs-number">7</span>, | |
| ) | |
| <span class="hljs-keyword">return</span> <span class="hljs-string">"\nRetrieved documents:\n"</span> + <span class="hljs-string">""</span>.join( | |
| [<span class="hljs-string">f"===== Document <span class="hljs-subst">{<span class="hljs-built_in">str</span>(i)}</span> =====\n"</span> + doc.page_content <span class="hljs-keyword">for</span> i, doc <span class="hljs-keyword">in</span> <span class="hljs-built_in">enumerate</span>(docs)] | |
| )<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1ma3u1b">Now it’s straightforward to create an agent that leverages this tool!</p> <p data-svelte-h="svelte-14g5yev">The agent will need these arguments upon initialization:</p> <ul data-svelte-h="svelte-1pv35c6"><li><em><code>tools</code></em>: a list of tools that the agent will be able to call.</li> <li><em><code>llm_engine</code></em>: the LLM that powers the agent.</li></ul> <p data-svelte-h="svelte-kqzwq2">Our <code>llm_engine</code> must be a callable that takes as input a list of <a href="https://huggingface.co/docs/transformers/main/chat_templating" rel="nofollow">messages</a> and returns text. It also needs to accept a <code>stop_sequences</code> argument that indicates when to stop its generation. For convenience, we directly use the <code>HfEngine</code> class provided in the package to get an LLM engine that calls our <a href="https://huggingface.co/docs/api-inference/en/index" rel="nofollow">Inference API</a>.</p> <p data-svelte-h="svelte-z1y5zp">And we use <a href="https://huggingface.co/CohereForAI/c4ai-command-r-plus" rel="nofollow">CohereForAI/c4ai-command-r-plus</a> as the LLM engine because:</p> <ul data-svelte-h="svelte-38rh7n"><li>It has a long 128k context, which is helpful for processing long source documents</li> <li>At the time of writing, it is served for free on HF’s Inference API!</li></ul> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" 
transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> transformers.agents <span class="hljs-keyword">import</span> HfEngine, ReactJsonAgent | |
| llm_engine = HfEngine(<span class="hljs-string">"CohereForAI/c4ai-command-r-plus"</span>) | |
| retriever_tool = RetrieverTool(vectordb) | |
| agent = ReactJsonAgent(tools=[retriever_tool], llm_engine=llm_engine, max_iterations=<span class="hljs-number">4</span>, verbose=<span class="hljs-number">2</span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1dmyh52">Since we initialized the agent as a <code>ReactJsonAgent</code>, it has been automatically given a default system prompt that tells the LLM engine to proceed step by step and generate tool calls as JSON blobs (you could replace this prompt template with your own as needed).</p> <p data-svelte-h="svelte-143bxk3">Then, when its <code>.run()</code> method is launched, the agent takes care of calling the LLM engine, parsing the tool call JSON blobs and executing these tool calls, all in a loop that ends when the final answer is provided (or when <code>max_iterations</code> is reached).</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span 
class="hljs-meta">>>> </span>agent_output = agent.run(<span class="hljs-string">"How can I push a model to the Hub?"</span>) | |
| <span class="hljs-meta">>>> </span><span class="hljs-built_in">print</span>(<span class="hljs-string">"Final output:"</span>) | |
| <span class="hljs-meta">>>> </span><span class="hljs-built_in">print</span>(agent_output)<!-- HTML_TAG_END --></pre></div> <pre data-svelte-h="svelte-aexpmm">Final output: | |
| There are multiple ways to push a model to the Hub. Here are a few examples using different libraries and functions: | |
| Using the `api`: | |
| python | |
| api.upload_folder( | |
| repo_id=repo_id, | |
| folder_path=repo_local_path, | |
| path_in_repo='.', | |
| ) | |
| print('Your model is pushed to the Hub. You can view your model here:', repo_url) | |
| With Transformers: | |
| python | |
| from transformers import PushToHubCallback | |
| # Initialize the callback with the output directory, tokenizer, and your Hub username and model name | |
| push_to_hub_callback = PushToHubCallback( | |
| output_dir='./your_model_save_path', | |
| tokenizer=tokenizer, | |
| hub_model_id='your-username/my-awesome-model' | |
| ) | |
| # Assuming `trainer` is your Trainer object | |
| trainer.add_callback(push_to_hub_callback) | |
| Using `timm`: | |
| python | |
| from timm.models.hub import push_to_hf_hub | |
| # Assuming `model` is your fine-tuned model | |
| model_cfg = &#123;'labels': ['a', 'b', 'c', 'd']} | |
| push_to_hf_hub(model, 'resnet18-random', model_config=model_cfg) | |
| For computer vision models, you can also use `push_to_hub`: | |
| python | |
| processor.push_to_hub(hub_model_id) | |
| trainer.push_to_hub(**kwargs) | |
| You can also manually push a model with `model.push_to_hub()`: | |
| python | |
| model.push_to_hub() | |
| Additionally, you can opt to push your model to the Hub at the end of training by specifying `push_to_hub=True` in the training configuration. Don't forget to have git-lfs installed and be logged into your Hugging Face account. | |
| </pre> <h2 class="relative group"><a id="agentic-rag-vs-standard-rag" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#agentic-rag-vs-standard-rag"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Agentic RAG vs. standard RAG</span></h2> <p data-svelte-h="svelte-ch3cxl">Does the agent setup make a better RAG system? 
Well, let’s compare it to a standard RAG system using an LLM judge!</p> <p data-svelte-h="svelte-yr66ap">We will use <a href="https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct" rel="nofollow">meta-llama/Meta-Llama-3-70B-Instruct</a> for evaluation since it’s one of the strongest open-source models we tested for LLM judge use cases.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->eval_dataset = datasets.load_dataset(<span class="hljs-string">"m-ric/huggingface_doc_qa_eval"</span>, split=<span class="hljs-string">"train"</span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-708yv1">Before running the test, let’s make the agent less verbose.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer 
focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">import</span> logging | |
| agent.logger.setLevel(logging.WARNING)<!-- HTML_TAG_END --></pre></div> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->outputs_agentic_rag = [] | |
| <span class="hljs-keyword">for</span> example <span class="hljs-keyword">in</span> tqdm(eval_dataset): | |
| question = example[<span class="hljs-string">"question"</span>] | |
| enhanced_question = <span class="hljs-string">f"""Using the information contained in your knowledge base, which you can access with the 'retriever' tool, | |
| give a comprehensive answer to the question below. | |
| Respond only to the question asked, response should be concise and relevant to the question. | |
| If you cannot find information, do not give up and try calling your retriever again with different arguments! | |
| Make sure to have covered the question completely by calling the retriever tool several times with semantically different queries. | |
| Your queries should not be questions but affirmative form sentences: e.g. rather than "How do I load a model from the Hub in bf16?", query should be "load a model from the Hub bf16 weights". | |
| Question: | |
| <span class="hljs-subst">{question}</span>"""</span> | |
| answer = agent.run(enhanced_question) | |
| <span class="hljs-built_in">print</span>(<span class="hljs-string">"======================================================="</span>) | |
| <span class="hljs-built_in">print</span>(<span class="hljs-string">f"Question: <span class="hljs-subst">{question}</span>"</span>) | |
| <span class="hljs-built_in">print</span>(<span class="hljs-string">f"Answer: <span class="hljs-subst">{answer}</span>"</span>) | |
| <span class="hljs-built_in">print</span>(<span class="hljs-string">f'True answer: <span class="hljs-subst">{example[<span class="hljs-string">"answer"</span>]}</span>'</span>) | |
| results_agentic = { | |
| <span class="hljs-string">"question"</span>: question, | |
| <span class="hljs-string">"true_answer"</span>: example[<span class="hljs-string">"answer"</span>], | |
| <span class="hljs-string">"source_doc"</span>: example[<span class="hljs-string">"source_doc"</span>], | |
| <span class="hljs-string">"generated_answer"</span>: answer, | |
| } | |
| outputs_agentic_rag.append(results_agentic)<!-- HTML_TAG_END --></pre></div> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> huggingface_hub <span class="hljs-keyword">import</span> InferenceClient | |
| reader_llm = InferenceClient(<span class="hljs-string">"CohereForAI/c4ai-command-r-plus"</span>) | |
| outputs_standard_rag = [] | |
| <span class="hljs-keyword">for</span> example <span class="hljs-keyword">in</span> tqdm(eval_dataset): | |
| question = example[<span class="hljs-string">"question"</span>] | |
| context = retriever_tool(question) | |
| prompt = <span class="hljs-string">f"""Given the question and supporting documents below, give a comprehensive answer to the question. | |
| Respond only to the question asked, response should be concise and relevant to the question. | |
| Provide the number of the source document when relevant. | |
| If you cannot find information, do not give up and try calling your retriever again with different arguments! | |
| Question: | |
| <span class="hljs-subst">{question}</span> | |
| <span class="hljs-subst">{context}</span> | |
| """</span> | |
| messages = [{<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: prompt}] | |
| answer = reader_llm.chat_completion(messages).choices[<span class="hljs-number">0</span>].message.content | |
| <span class="hljs-built_in">print</span>(<span class="hljs-string">"======================================================="</span>) | |
| <span class="hljs-built_in">print</span>(<span class="hljs-string">f"Question: <span class="hljs-subst">{question}</span>"</span>) | |
| <span class="hljs-built_in">print</span>(<span class="hljs-string">f"Answer: <span class="hljs-subst">{answer}</span>"</span>) | |
| <span class="hljs-built_in">print</span>(<span class="hljs-string">f'True answer: <span class="hljs-subst">{example[<span class="hljs-string">"answer"</span>]}</span>'</span>) | |
| results_agentic = { | |
| <span class="hljs-string">"question"</span>: question, | |
| <span class="hljs-string">"true_answer"</span>: example[<span class="hljs-string">"answer"</span>], | |
| <span class="hljs-string">"source_doc"</span>: example[<span class="hljs-string">"source_doc"</span>], | |
| <span class="hljs-string">"generated_answer"</span>: answer, | |
| } | |
| outputs_standard_rag.append(results_agentic)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-lt6qr6">The evaluation prompt follows some of the best principles shown in <a href="llm_judge">our llm_judge cookbook</a>: it follows a small integer Likert scale, has clear criteria, and a description for each score.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->EVALUATION_PROMPT = <span class="hljs-string">"""You are a fair evaluator language model. | |
| You will be given an instruction, a response to evaluate, a reference answer that gets a score of 3, and a score rubric representing the evaluation criteria. | |
| 1. Write detailed feedback that assesses the quality of the response strictly based on the given score rubric, not evaluating in general. | |
| 2. After writing the feedback, write a score that is an integer between 1 and 3. You should refer to the score rubric. | |
| 3. The output format should look as follows: \"Feedback: {{write a feedback for criteria}} [RESULT] {{an integer number between 1 and 3}}\" | |
| 4. Please do not generate any other opening, closing, and explanations. Be sure to include [RESULT] in your output. | |
| 5. Do not score conciseness: a correct answer that covers the question should receive max score, even if it contains additional useless information. | |
| The instruction to evaluate: | |
| {instruction} | |
| Response to evaluate: | |
| {response} | |
| Reference Answer (Score 3): | |
| {reference_answer} | |
| Score Rubrics: | |
| [Is the response complete, accurate, and factual based on the reference answer?] | |
| Score 1: The response is completely incomplete, inaccurate, and/or not factual. | |
| Score 2: The response is somewhat complete, accurate, and/or factual. | |
| Score 3: The response is completely complete, accurate, and/or factual. | |
| Feedback:"""</span><!-- HTML_TAG_END --></pre></div> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> huggingface_hub <span class="hljs-keyword">import</span> InferenceClient | |
| evaluation_client = InferenceClient(<span class="hljs-string">"meta-llama/Meta-Llama-3-70B-Instruct"</span>)<!-- HTML_TAG_END --></pre></div> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd | |
| <span class="hljs-meta">>>> </span><span class="hljs-keyword">for</span> <span class="hljs-built_in">type</span>, outputs <span class="hljs-keyword">in</span> [ | |
| <span class="hljs-meta">... </span> (<span class="hljs-string">"agentic"</span>, outputs_agentic_rag), | |
| <span class="hljs-meta">... </span> (<span class="hljs-string">"standard"</span>, outputs_standard_rag), | |
| <span class="hljs-meta">... </span>]: | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">for</span> experiment <span class="hljs-keyword">in</span> tqdm(outputs): | |
| <span class="hljs-meta">... </span> eval_prompt = EVALUATION_PROMPT.<span class="hljs-built_in">format</span>( | |
| <span class="hljs-meta">... </span> instruction=experiment[<span class="hljs-string">"question"</span>], | |
| <span class="hljs-meta">... </span> response=experiment[<span class="hljs-string">"generated_answer"</span>], | |
| <span class="hljs-meta">... </span> reference_answer=experiment[<span class="hljs-string">"true_answer"</span>], | |
| <span class="hljs-meta">... </span> ) | |
| <span class="hljs-meta">... </span> messages = [ | |
| <span class="hljs-meta">... </span> {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"You are a fair evaluator language model."</span>}, | |
| <span class="hljs-meta">... </span> {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: eval_prompt}, | |
| <span class="hljs-meta">... </span> ] | |
| <span class="hljs-meta">... </span> eval_result = evaluation_client.text_generation(eval_prompt, max_new_tokens=<span class="hljs-number">1000</span>) | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">try</span>: | |
| <span class="hljs-meta">... </span> feedback, score = [item.strip() <span class="hljs-keyword">for</span> item <span class="hljs-keyword">in</span> eval_result.split(<span class="hljs-string">"[RESULT]"</span>)] | |
| <span class="hljs-meta">... </span> experiment[<span class="hljs-string">"eval_score_LLM_judge"</span>] = score | |
| <span class="hljs-meta">... </span> experiment[<span class="hljs-string">"eval_feedback_LLM_judge"</span>] = feedback | |
<span class="hljs-meta">... </span> <span class="hljs-keyword">except</span> ValueError: | |
| <span class="hljs-meta">... </span> <span class="hljs-built_in">print</span>(<span class="hljs-string">f"Parsing failed - output was: <span class="hljs-subst">{eval_result}</span>"</span>) | |
| <span class="hljs-meta">... </span> results = pd.DataFrame.from_dict(outputs) | |
| <span class="hljs-meta">... </span> results = results.loc[~results[<span class="hljs-string">"generated_answer"</span>].<span class="hljs-built_in">str</span>.contains(<span class="hljs-string">"Error"</span>)] | |
| <span class="hljs-meta">... </span> results[<span class="hljs-string">"eval_score_LLM_judge_int"</span>] = results[<span class="hljs-string">"eval_score_LLM_judge"</span>].fillna(<span class="hljs-number">1</span>).apply(<span class="hljs-keyword">lambda</span> x: <span class="hljs-built_in">int</span>(x)) | |
| <span class="hljs-meta">... </span> results[<span class="hljs-string">"eval_score_LLM_judge_int"</span>] = (results[<span class="hljs-string">"eval_score_LLM_judge_int"</span>] - <span class="hljs-number">1</span>) / <span class="hljs-number">2</span> | |
| <span class="hljs-meta">... </span> <span class="hljs-built_in">print</span>(<span class="hljs-string">f"Average score for <span class="hljs-subst">{<span class="hljs-built_in">type</span>}</span> RAG: <span class="hljs-subst">{results[<span class="hljs-string">'eval_score_LLM_judge_int'</span>].mean()*<span class="hljs-number">100</span>:<span class="hljs-number">.1</span>f}</span>%"</span>)<!-- HTML_TAG_END --></pre></div> <pre data-svelte-h="svelte-mnmczc">Average score for agentic RAG: 78.5% | |
</pre> <p data-svelte-h="svelte-19t4l39"><strong>Let us recap: the agent setup improves scores by 8.5 percentage points compared to standard RAG!</strong> (from 70.0% to 78.5%)</p> <p data-svelte-h="svelte-1fi7cw9">This is a great improvement, with a very simple setup 🚀</p> <p data-svelte-h="svelte-1y4e8t1">(As a baseline, Llama-3-70B without the knowledge base scored 36%.)</p> <p></p> | |
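The percentages above come from mapping the judge's 1 to 3 Likert score onto a 0 to 1 range and averaging. The sketch below illustrates that arithmetic in isolation, without pandas or any model call. The helper names (`parse_judge_output`, `normalize`, `average_percentage`) are hypothetical and not part of the original notebook, and treating an unparsable judge output as the lowest score is an assumption that mirrors the `fillna(1)` step in the evaluation loop.

```python
import re
from statistics import mean


def parse_judge_output(text):
    """Split 'Feedback: ... [RESULT] n' into (feedback, score); None if malformed."""
    # Greedy match so that the last [RESULT] marker in the text is the one used.
    match = re.search(r"^(.*)\[RESULT\]\s*([1-3])\b", text, flags=re.DOTALL)
    if match is None:
        return None
    return match.group(1).strip(), int(match.group(2))


def normalize(score, low=1, high=3):
    """Map a Likert score in [low, high] onto [0, 1]."""
    return (score - low) / (high - low)


def average_percentage(raw_outputs):
    """Average normalized score, in percent, over raw judge outputs."""
    scores = []
    for text in raw_outputs:
        parsed = parse_judge_output(text)
        # Assumption: an unparsable output counts as the lowest score (1).
        score = parsed[1] if parsed is not None else 1
        scores.append(normalize(score))
    return 100 * mean(scores)


if __name__ == "__main__":
    demo = ["Feedback: complete. [RESULT] 3", "Feedback: partial. [RESULT] 2", "no marker"]
    print(f"Average score: {average_percentage(demo):.1f}%")
```

With this mapping, a score of 2 contributes 50% and a score of 1 contributes 0%, so the reported average is not the share of answers judged fully correct.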