| <meta charset="utf-8" /><meta name="hf:doc:metadata" content="{"title":"QA Pipeline ထဲက Fast Tokenizers များ","local":"fast-tokenizers-in-the-qa-pipeline","sections":[{"title":"question-answering pipeline ကို အသုံးပြုခြင်း","local":"using-the-question-answering-pipeline","sections":[],"depth":2},{"title":"Question Answering အတွက် Model တစ်ခုကို အသုံးပြုခြင်း","local":"using-a-model-for-question-answering","sections":[],"depth":2},{"title":"Long Contexts တွေကို ကိုင်တွယ်ခြင်း","local":"handling-long-contexts","sections":[],"depth":2},{"title":"ဝေါဟာရ ရှင်းလင်းချက် (Glossary)","local":"ဝဟရ-ရငလငခက-glossary","sections":[],"depth":2}],"depth":1}"> | |
| <link href="/docs/course/pr_1114/my/_app/immutable/assets/0.e3b0c442.css" rel="modulepreload"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/entry/start.14794ee9.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/chunks/scheduler.893fe8c9.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/chunks/singletons.10fda3ce.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/chunks/index.bce52c8a.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/chunks/paths.89c82153.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/entry/app.a133f5c6.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/chunks/preload-helper.b1a719fd.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/chunks/index.b1df2166.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/nodes/0.510afdc1.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/chunks/each.e59479a4.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/nodes/48.447b1570.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/chunks/MermaidChart.svelte_svelte_type_style_lang.762ed9cc.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/chunks/Youtube.ec5d7916.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/chunks/CodeBlock.6cef0479.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/chunks/CourseFloatingBanner.c1c08878.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1114/my/_app/immutable/chunks/FrameworkSwitchCourse.4480e339.js"><!-- HEAD_svelte-u9bgzb_START --><meta name="hf:doc:metadata" content="{"title":"QA Pipeline ထဲက Fast Tokenizers များ","local":"fast-tokenizers-in-the-qa-pipeline","sections":[{"title":"question-answering pipeline ကို အသုံးပြုခြင်း","local":"using-the-question-answering-pipeline","sections":[],"depth":2},{"title":"Question Answering အတွက် Model တစ်ခုကို အသုံးပြုခြင်း","local":"using-a-model-for-question-answering","sections":[],"depth":2},{"title":"Long Contexts တွေကို ကိုင်တွယ်ခြင်း","local":"handling-long-contexts","sections":[],"depth":2},{"title":"ဝေါဟာရ ရှင်းလင်းချက် (Glossary)","local":"ဝဟရ-ရငလငခက-glossary","sections":[],"depth":2}],"depth":1}"><!-- HEAD_svelte-u9bgzb_END --> <p></p> <div class="bg-white leading-none border border-gray-100 rounded-lg flex p-0.5 w-56 text-sm mb-4"><a class="flex justify-center flex-1 py-1.5 px-2.5 focus:outline-none !no-underline rounded-l bg-red-50 dark:bg-transparent text-red-600" href="?fw=pt"><svg class="mr-1.5" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><defs><clipPath id="a"><rect x="3.05" y="0.5" width="25.73" height="31" fill="none"></rect></clipPath></defs><g clip-path="url(#a)"><path d="M24.94,9.51a12.81,12.81,0,0,1,0,18.16,12.68,12.68,0,0,1-18,0,12.81,12.81,0,0,1,0-18.16l9-9V5l-.84.83-6,6a9.58,9.58,0,1,0,13.55,0ZM20.44,9a1.68,1.68,0,1,1,1.67-1.67A1.68,1.68,0,0,1,20.44,9Z" fill="#ee4c2c"></path></g></svg> Pytorch </a><a class="flex justify-center flex-1 py-1.5 px-2.5 focus:outline-none !no-underline rounded-r text-gray-500 filter grayscale" href="?fw=tf"><svg class="mr-1.5" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" focusable="false" role="img" width="0.94em" height="1em" 
preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 274"><path d="M145.726 42.065v42.07l72.861 42.07v-42.07l-72.86-42.07zM0 84.135v42.07l36.43 21.03V105.17L0 84.135zm109.291 21.035l-36.43 21.034v126.2l36.43 21.035v-84.135l36.435 21.035v-42.07l-36.435-21.034V105.17z" fill="#E55B2D"></path><path d="M145.726 42.065L36.43 105.17v42.065l72.861-42.065v42.065l36.435-21.03v-84.14zM255.022 63.1l-36.435 21.035v42.07l36.435-21.035V63.1zm-72.865 84.135l-36.43 21.035v42.07l36.43-21.036v-42.07zm-36.43 63.104l-36.436-21.035v84.135l36.435-21.035V210.34z" fill="#ED8E24"></path><path d="M145.726 0L0 84.135l36.43 21.035l109.296-63.105l72.861 42.07L255.022 63.1L145.726 0zm0 126.204l-36.435 21.03l36.435 21.036l36.43-21.035l-36.43-21.03z" fill="#F8BF3C"></path></svg> TensorFlow </a></div> <div class="items-center shrink-0 min-w-[100px] max-sm:min-w-[50px] justify-end ml-auto flex" style="float: right; margin-left: 10px; display: inline-flex; position: relative; z-index: 10;"><div class="inline-flex rounded-md max-sm:rounded-sm"><button class="inline-flex items-center gap-1 max-sm:gap-0.5 h-6 max-sm:h-5 px-2 max-sm:px-1.5 text-[11px] max-sm:text-[9px] font-medium text-gray-800 border border-r-0 rounded-l-md max-sm:rounded-l-sm border-gray-200 bg-white hover:shadow-inner dark:border-gray-850 dark:bg-gray-950 dark:text-gray-200 dark:hover:bg-gray-800" aria-live="polite"><span class="inline-flex items-center justify-center rounded-md p-0.5 max-sm:p-0"><svg class="w-3 h-3 max-sm:w-2.5 max-sm:h-2.5" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg></span> <span>Copy page</span></button> <button 
class="inline-flex items-center justify-center w-6 max-sm:w-5 h-6 max-sm:h-5 disabled:pointer-events-none text-sm text-gray-500 hover:text-gray-700 dark:hover:text-white rounded-r-md max-sm:rounded-r-sm border border-l transition border-gray-200 bg-white hover:shadow-inner dark:border-gray-850 dark:bg-gray-950 dark:text-gray-200 dark:hover:bg-gray-800" aria-haspopup="menu" aria-expanded="false" aria-label="Open copy menu"><svg class="transition-transform text-gray-400 overflow-visible w-3 h-3 max-sm:w-2.5 max-sm:h-2.5 rotate-0" width="1em" height="1em" viewBox="0 0 12 7" fill="none" xmlns="http://www.w3.org/2000/svg"><path d="M1 1L6 6L11 1" stroke="currentColor"></path></svg></button></div> </div> <h1 class="relative group"><a id="fast-tokenizers-in-the-qa-pipeline" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#fast-tokenizers-in-the-qa-pipeline"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Fast tokenizers in the QA pipeline</span></h1> <div class="flex space-x-1 absolute z-10 right-0 top-0" style=""><a href="https://discuss.huggingface.co/t/chapter-6-questions" target="_blank"><img alt="Ask a Question" class="!m-0" 
src="https://img.shields.io/badge/Ask%20a%20question-ffcb4c.svg?logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgLTEgMTA0IDEwNiI+PGRlZnM+PHN0eWxlPi5jbHMtMXtmaWxsOiMyMzFmMjA7fS5jbHMtMntmaWxsOiNmZmY5YWU7fS5jbHMtM3tmaWxsOiMwMGFlZWY7fS5jbHMtNHtmaWxsOiMwMGE5NGY7fS5jbHMtNXtmaWxsOiNmMTVkMjI7fS5jbHMtNntmaWxsOiNlMzFiMjM7fTwvc3R5bGU+PC9kZWZzPjx0aXRsZT5EaXNjb3Vyc2VfbG9nbzwvdGl0bGU+PGcgaWQ9IkxheWVyXzIiPjxnIGlkPSJMYXllcl8zIj48cGF0aCBjbGFzcz0iY2xzLTEiIGQ9Ik01MS44NywwQzIzLjcxLDAsMCwyMi44MywwLDUxYzAsLjkxLDAsNTIuODEsMCw1Mi44MWw1MS44Ni0uMDVjMjguMTYsMCw1MS0yMy43MSw1MS01MS44N1M4MCwwLDUxLjg3LDBaIi8+PHBhdGggY2xhc3M9ImNscy0yIiBkPSJNNTIuMzcsMTkuNzRBMzEuNjIsMzEuNjIsMCwwLDAsMjQuNTgsNjYuNDFsLTUuNzIsMTguNEwzOS40LDgwLjE3YTMxLjYxLDMxLjYxLDAsMSwwLDEzLTYwLjQzWiIvPjxwYXRoIGNsYXNzPSJjbHMtMyIgZD0iTTc3LjQ1LDMyLjEyYTMxLjYsMzEuNiwwLDAsMS0zOC4wNSw0OEwxOC44Niw4NC44MmwyMC45MS0yLjQ3QTMxLjYsMzEuNiwwLDAsMCw3Ny40NSwzMi4xMloiLz48cGF0aCBjbGFzcz0iY2xzLTQiIGQ9Ik03MS42MywyNi4yOUEzMS42LDMxLjYsMCwwLDEsMzguOCw3OEwxOC44Niw4NC44MiwzOS40LDgwLjE3QTMxLjYsMzEuNiwwLDAsMCw3MS42MywyNi4yOVoiLz48cGF0aCBjbGFzcz0iY2xzLTUiIGQ9Ik0yNi40Nyw2Ny4xMWEzMS42MSwzMS42MSwwLDAsMSw1MS0zNUEzMS42MSwzMS42MSwwLDAsMCwyNC41OCw2Ni40MWwtNS43MiwxOC40WiIvPjxwYXRoIGNsYXNzPSJjbHMtNiIgZD0iTTI0LjU4LDY2LjQxQTMxLjYxLDMxLjYxLDAsMCwxLDcxLjYzLDI2LjI5YTMxLjYxLDMxLjYxLDAsMCwwLTQ5LDM5LjYzbC0zLjc2LDE4LjlaIi8+PC9nPjwvZz48L3N2Zz4="></a> <a href="https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/en/chapter6/section3b_pt.ipynb" target="_blank"><img alt="Open In Colab" class="!m-0" src="https://colab.research.google.com/assets/colab-badge.svg"></a> <a href="https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/en/chapter6/section3b_pt.ipynb" target="_blank"><img alt="Open In Studio Lab" class="!m-0" src="https://studiolab.sagemaker.aws/studiolab.svg"></a></div> <p data-svelte-h="svelte-1vlxlu3">We will now dive into the <code>question-answering</code> pipeline and see how to use 
the offsets to extract the answer to the question at hand from the context, much as we did for the grouped entities in the previous section. Then we will see how to handle very long contexts that end up being truncated. You can skip this section if you are not interested in the question answering task.</p> <iframe class="w-full xl:w-4/6 h-80" src="https://www.youtube-nocookie.com/embed/_wxyB3j3mk4" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> <h2 class="relative group"><a id="using-the-question-answering-pipeline" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#using-the-question-answering-pipeline"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Using the question-answering pipeline</span></h2> <p data-svelte-h="svelte-e7y8d2">As we saw in <a href="/course/chapter1">Chapter 1</a>, we can use the <code>question-answering</code> pipeline like this to get the answer to a question:</p> <div 
class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> pipeline | |
| question_answerer = pipeline(<span class="hljs-string">"question-answering"</span>) | |
| context = <span class="hljs-string">""" | |
| 🤗 Transformers is backed by the three most popular deep learning libraries — Jax, PyTorch and TensorFlow — with a seamless integration | |
| between them. It's straightforward to train your models with one before loading them for inference with the other. | |
| """</span> | |
| question = <span class="hljs-string">"Which deep learning libraries back 🤗 Transformers?"</span> | |
| question_answerer(question=question, context=context)<!-- HTML_TAG_END --></pre></div> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->{<span class="hljs-string">'score'</span>: <span class="hljs-number">0.97773</span>, | |
| <span class="hljs-string">'start'</span>: <span class="hljs-number">78</span>, | |
| <span class="hljs-string">'end'</span>: <span class="hljs-number">105</span>, | |
| <span class="hljs-string">'answer'</span>: <span class="hljs-string">'Jax, PyTorch and TensorFlow'</span>}<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-p07pqn">Unlike the other pipelines, which cannot truncate and split texts longer than the maximum length accepted by the model (and may therefore miss information at the end of a document), this pipeline can handle very long contexts and will return the answer to the question even if it is at the end:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->long_context = <span class="hljs-string">""" | |
| 🤗 Transformers: State of the Art NLP | |
| 🤗 Transformers provides thousands of pretrained models to perform tasks on texts such as classification, information extraction, | |
| question answering, summarization, translation, text generation and more in over 100 languages. | |
| Its aim is to make cutting-edge NLP easier to use for everyone. | |
| 🤗 Transformers provides APIs to quickly download and use those pretrained models on a given text, fine-tune them on your own datasets and | |
| then share them with the community on our model hub. At the same time, each python module defining an architecture is fully standalone and | |
| can be modified to enable quick research experiments. | |
| Why should I use transformers? | |
| 1. Easy-to-use state-of-the-art models: | |
| - High performance on NLU and NLG tasks. | |
| - Low barrier to entry for educators and practitioners. | |
| - Few user-facing abstractions with just three classes to learn. | |
| - A unified API for using all our pretrained models. | |
| - Lower compute costs, smaller carbon footprint: | |
| 2. Researchers can share trained models instead of always retraining. | |
| - Practitioners can reduce compute time and production costs. | |
| - Dozens of architectures with over 10,000 pretrained models, some in more than 100 languages. | |
| 3. Choose the right framework for every part of a model's lifetime: | |
| - Train state-of-the-art models in 3 lines of code. | |
| - Move a single model between TF2.0/PyTorch frameworks at will. | |
| - Seamlessly pick the right framework for training, evaluation and production. | |
| 4. Easily customize a model or an example to your needs: | |
| - We provide examples for each architecture to reproduce the results published by its original authors. | |
| - Model internals are exposed as consistently as possible. | |
| - Model files can be used independently of the library for quick experiments. | |
| 🤗 Transformers is backed by the three most popular deep learning libraries — Jax, PyTorch and TensorFlow — with a seamless integration | |
| between them. It's straightforward to train your models with one before loading them for inference with the other. | |
| """</span> | |
| question = <span class="hljs-string">"Which deep learning libraries back 🤗 Transformers?"</span> | |
| question_answerer(question=question, context=long_context)<!-- HTML_TAG_END --></pre></div> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->{<span class="hljs-string">'score'</span>: <span class="hljs-number">0.97149</span>, | |
| <span class="hljs-string">'start'</span>: <span class="hljs-number">1892</span>, | |
| <span class="hljs-string">'end'</span>: <span class="hljs-number">1919</span>, | |
| <span class="hljs-string">'answer'</span>: <span class="hljs-string">'Jax, PyTorch and TensorFlow'</span>}<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-fz7cft">Let's see how it does all of this.</p> <h2 class="relative group"><a id="using-a-model-for-question-answering" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#using-a-model-for-question-answering"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Using a model for question answering</span></h2> <p data-svelte-h="svelte-akjge3">As with any other pipeline, we start by tokenizing our input and then send it through the model. The checkpoint used by default for the <code>question-answering</code> pipeline is <a href="https://huggingface.co/distilbert-base-cased-distilled-squad" rel="nofollow"><code>distilbert-base-cased-distilled-squad</code></a> (the “squad” in the name comes from the dataset on which the model was fine-tuned; we will talk more about the SQuAD dataset in <a href="/course/chapter7/7">Chapter 7</a>):</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button 
class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> AutoTokenizer, AutoModelForQuestionAnswering | |
| model_checkpoint = <span class="hljs-string">"distilbert-base-cased-distilled-squad"</span> | |
| tokenizer = AutoTokenizer.from_pretrained(model_checkpoint) | |
| model = AutoModelForQuestionAnswering.from_pretrained(model_checkpoint) | |
| inputs = tokenizer(question, context, return_tensors=<span class="hljs-string">"pt"</span>) | |
| outputs = model(**inputs)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1oom62t">Note that we tokenize the question and the context as a pair, with the question first.</p> <div class="flex justify-center" data-svelte-h="svelte-47wedv"><img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface-course/documentation-images/resolve/main/en/chapter6/question_tokens.svg" alt="An example of tokenization of question and context"> <img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface-course/documentation-images/resolve/main/en/chapter6/question_tokens-dark.svg" alt="An example of tokenization of question and context"></div> <p data-svelte-h="svelte-16znpce">Models for question answering work a little differently from the models we have seen so far. Using the picture above as an example, the model has been trained to predict the index of the token starting the answer (here 21) and the index of the token where the answer ends (here 24). This is why those models do not return one tensor of logits but two: one for the logits corresponding to the start token of the answer, and one for the logits corresponding to the end token of the answer. Since in this case we have only one input containing 66 tokens, we get:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path 
d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->start_logits = outputs.start_logits | |
| end_logits = outputs.end_logits | |
| <span class="hljs-built_in">print</span>(start_logits.shape, end_logits.shape)<!-- HTML_TAG_END --></pre></div> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->torch.Size([<span class="hljs-number">1</span>, <span class="hljs-number">66</span>]) torch.Size([<span class="hljs-number">1</span>, <span class="hljs-number">66</span>])<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-s41hk2">To convert those logits into probabilities, we will apply a softmax function, but before that we need to make sure we mask the indices that are not part of the context. Our input is <code>[CLS] question [SEP] context [SEP]</code>, so we need to mask the tokens of the question as well as the <code>[SEP]</code> tokens. We will keep the <code>[CLS]</code> token, however, 
because some models use it to indicate that the answer is not in the context.</p> <p data-svelte-h="svelte-188uo6">Since we will apply a softmax afterward, we just need to replace the logits we want to mask with a large negative number. Here, we use <code>-10000</code>:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">import</span> torch | |
| sequence_ids = inputs.sequence_ids() | |
| <span class="hljs-comment"># Mask everything apart from the tokens of the context</span> | |
| mask = [i != <span class="hljs-number">1</span> <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> sequence_ids] | |
| <span class="hljs-comment"># Unmask the [CLS] token</span> | |
| mask[<span class="hljs-number">0</span>] = <span class="hljs-literal">False</span> | |
| mask = torch.tensor(mask)[<span class="hljs-literal">None</span>] | |
| start_logits[mask] = -<span class="hljs-number">10000</span> | |
| end_logits[mask] = -<span class="hljs-number">10000</span><!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-ae0zp4">Now that we have properly masked the logits corresponding to positions we do not want to predict, we can apply the softmax:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->start_probabilities = torch.nn.functional.softmax(start_logits, dim=-<span class="hljs-number">1</span>)[<span class="hljs-number">0</span>] | |
| end_probabilities = torch.nn.functional.softmax(end_logits, dim=-<span class="hljs-number">1</span>)[<span class="hljs-number">0</span>]<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-55syz3">ဒီအဆင့်မှာ၊ ကျွန်တော်တို့ start နဲ့ end probabilities တွေရဲ့ argmax ကို ယူနိုင်ပါတယ်၊ ဒါပေမယ့် start index က end index ထက် ပိုကြီးတဲ့ ရလဒ်နဲ့ အဆုံးသတ်နိုင်တာကြောင့်၊ နောက်ထပ် ကြိုတင်ကာကွယ်မှုအချို့ လုပ်ဖို့လိုပါတယ်။ ကျွန်တော်တို့ဟာ ဖြစ်နိုင်ခြေရှိတဲ့ <code>start_index</code> နဲ့ <code>end_index</code> တစ်ခုစီ ( <code>start_index <= end_index</code> ဖြစ်ရမယ့်) ရဲ့ probabilities တွေကို တွက်ချက်ပြီး၊ အမြင့်ဆုံး probability ရှိတဲ့ <code>(start_index, end_index)</code> tuple ကို ယူပါမယ်။</p> <p>“အဖြေက <code data-svelte-h="svelte-8eqb3b">start_index</code> မှာ စတယ်” နဲ့ “အဖြေက <code data-svelte-h="svelte-9cistc">end_index</code> မှာ ဆုံးတယ်” ဆိုတဲ့ events တွေဟာ သီးခြားစီ ဖြစ်တယ်လို့ ယူဆရင်၊ အဖြေက <code data-svelte-h="svelte-8eqb3b">start_index</code> မှာ စတင်ပြီး <code data-svelte-h="svelte-9cistc">end_index</code> မှာ ဆုံးတဲ့ probability က… | |
<!-- HTML_TAG_START --><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><mrow><mi mathvariant="normal">s</mi><mi mathvariant="normal">t</mi><mi mathvariant="normal">a</mi><mi mathvariant="normal">r</mi><mi mathvariant="normal">t</mi><mi mathvariant="normal">_</mi><mi mathvariant="normal">p</mi><mi mathvariant="normal">r</mi><mi mathvariant="normal">o</mi><mi mathvariant="normal">b</mi><mi mathvariant="normal">a</mi><mi mathvariant="normal">b</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">l</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">t</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">e</mi><mi mathvariant="normal">s</mi></mrow><mo stretchy="false">[</mo><mrow><mi mathvariant="normal">s</mi><mi mathvariant="normal">t</mi><mi mathvariant="normal">a</mi><mi mathvariant="normal">r</mi><mi mathvariant="normal">t</mi><mi mathvariant="normal">_</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">n</mi><mi mathvariant="normal">d</mi><mi mathvariant="normal">e</mi><mi mathvariant="normal">x</mi></mrow><mo stretchy="false">]</mo><mo>×</mo><mrow><mi mathvariant="normal">e</mi><mi mathvariant="normal">n</mi><mi mathvariant="normal">d</mi><mi mathvariant="normal">_</mi><mi mathvariant="normal">p</mi><mi mathvariant="normal">r</mi><mi mathvariant="normal">o</mi><mi mathvariant="normal">b</mi><mi mathvariant="normal">a</mi><mi mathvariant="normal">b</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">l</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">t</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">e</mi><mi mathvariant="normal">s</mi></mrow><mo stretchy="false">[</mo><mrow><mi mathvariant="normal">e</mi><mi mathvariant="normal">n</mi><mi mathvariant="normal">d</mi><mi mathvariant="normal">_</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">n</mi><mi mathvariant="normal">d</mi><mi 
mathvariant="normal">e</mi><mi mathvariant="normal">x</mi></mrow><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">\mathrm{start\_probabilities}[\mathrm{start\_index}] \times \mathrm{end\_probabilities}[\mathrm{end\_index}]</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.06em;vertical-align:-0.31em;"></span><span class="mord"><span class="mord mathrm">start_probabilities</span></span><span class="mopen">[</span><span class="mord"><span class="mord mathrm">start_index</span></span><span class="mclose">]</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">×</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.06em;vertical-align:-0.31em;"></span><span class="mord"><span class="mord mathrm">end_probabilities</span></span><span class="mopen">[</span><span class="mord"><span class="mord mathrm">end_index</span></span><span class="mclose">]</span></span></span></span></span><!-- HTML_TAG_END --></p> <p>So, for every pair with <code data-svelte-h="svelte-1kp3szf">start_index <= end_index</code>, the score is the product <!-- HTML_TAG_START --><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mi mathvariant="normal">s</mi><mi mathvariant="normal">t</mi><mi mathvariant="normal">a</mi><mi mathvariant="normal">r</mi><mi mathvariant="normal">t</mi><mi mathvariant="normal">_</mi><mi mathvariant="normal">p</mi><mi mathvariant="normal">r</mi><mi mathvariant="normal">o</mi><mi mathvariant="normal">b</mi><mi mathvariant="normal">a</mi><mi mathvariant="normal">b</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">l</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">t</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">e</mi><mi mathvariant="normal">s</mi></mrow><mo 
stretchy="false">[</mo><mrow><mi mathvariant="normal">s</mi><mi mathvariant="normal">t</mi><mi mathvariant="normal">a</mi><mi mathvariant="normal">r</mi><mi mathvariant="normal">t</mi><mi mathvariant="normal">_</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">n</mi><mi mathvariant="normal">d</mi><mi mathvariant="normal">e</mi><mi mathvariant="normal">x</mi></mrow><mo stretchy="false">]</mo><mo>×</mo><mrow><mi mathvariant="normal">e</mi><mi mathvariant="normal">n</mi><mi mathvariant="normal">d</mi><mi mathvariant="normal">_</mi><mi mathvariant="normal">p</mi><mi mathvariant="normal">r</mi><mi mathvariant="normal">o</mi><mi mathvariant="normal">b</mi><mi mathvariant="normal">a</mi><mi mathvariant="normal">b</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">l</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">t</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">e</mi><mi mathvariant="normal">s</mi></mrow><mo stretchy="false">[</mo><mrow><mi mathvariant="normal">e</mi><mi mathvariant="normal">n</mi><mi mathvariant="normal">d</mi><mi mathvariant="normal">_</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">n</mi><mi mathvariant="normal">d</mi><mi mathvariant="normal">e</mi><mi mathvariant="normal">x</mi></mrow><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">\mathrm{start\_probabilities}[\mathrm{start\_index}] \times \mathrm{end\_probabilities}[\mathrm{end\_index}]</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.06em;vertical-align:-0.31em;"></span><span class="mord"><span class="mord mathrm">start_probabilities</span></span><span class="mopen">[</span><span class="mord"><span class="mord mathrm">start_index</span></span><span class="mclose">]</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">×</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span 
class="base"><span class="strut" style="height:1.06em;vertical-align:-0.31em;"></span><span class="mord"><span class="mord mathrm">end_probabilities</span></span><span class="mopen">[</span><span class="mord"><span class="mord mathrm">end_index</span></span><span class="mclose">]</span></span></span></span><!-- HTML_TAG_END -->, and computing all the scores just means computing all of these products.</p> <p data-svelte-h="svelte-yblm8o">First, let's compute all the possible products:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->scores = start_probabilities[:, <span class="hljs-literal">None</span>] * end_probabilities[<span class="hljs-literal">None</span>, :]<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1dcr0fc">Rows of <code>scores</code> index the start position and columns index the end position. We then mask 
the values where <code>start_index > end_index</code> by setting them to <code>0</code> (the other probabilities are all positive numbers). The <code>torch.triu()</code> function returns the upper triangular part of the 2D tensor passed as an argument, so it does that masking for us:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->scores = torch.triu(scores)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1mvh4qq">Now we just have to get the index of the maximum. 
Since PyTorch returns the index in the flattened tensor, we need the floor division <code>//</code> and modulus <code>%</code> operations to recover <code>start_index</code> and <code>end_index</code>:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer 
focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->max_index = scores.argmax().item() | |
| start_index = max_index // scores.shape[<span class="hljs-number">1</span>] | |
| end_index = max_index % scores.shape[<span class="hljs-number">1</span>] | |
| <span class="hljs-built_in">print</span>(scores[start_index, end_index])<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-pgwod7">We're not quite done yet, but at least we already have the correct score for the answer (you can check this by comparing it to the first result in the previous section):</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-number">0.97773</span><!-- HTML_TAG_END --></pre></div> <blockquote class="tip" data-svelte-h="svelte-1s2d7iw"><p>✏️ <strong>Try it out!</strong> Compute the start and end indices for the five most likely answers.</p></blockquote> <p data-svelte-h="svelte-1o6v35p">Now that we have the <code>start_index</code> and <code>end_index</code> of the answer in terms of tokens, we just need to convert them to character indices in the context. 
This is where the offsets become very useful. We can grab them and use them as we did in the token classification task:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->inputs_with_offsets = tokenizer(question, context, return_offsets_mapping=<span class="hljs-literal">True</span>) | |
| offsets = inputs_with_offsets[<span class="hljs-string">"offset_mapping"</span>] | |
| start_char, _ = offsets[start_index] | |
| _, end_char = offsets[end_index] | |
| answer = context[start_char:end_char]<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-nvt3e2">Now we just have to format everything to get our result:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->result = { | |
| <span class="hljs-string">"answer"</span>: answer, | |
| <span class="hljs-string">"start"</span>: start_char, | |
| <span class="hljs-string">"end"</span>: end_char, | |
| <span class="hljs-string">"score"</span>: scores[start_index, end_index].item(), | |
| } | |
| <span class="hljs-built_in">print</span>(result)<!-- HTML_TAG_END --></pre></div> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->{<span class="hljs-string">'answer'</span>: <span class="hljs-string">'Jax, PyTorch and TensorFlow'</span>, | |
| <span class="hljs-string">'start'</span>: <span class="hljs-number">78</span>, | |
| <span class="hljs-string">'end'</span>: <span class="hljs-number">105</span>, | |
| <span class="hljs-string">'score'</span>: <span class="hljs-number">0.97773</span>}<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1pmzq90">Great! That's the same as in our first example!</p> <blockquote class="tip" data-svelte-h="svelte-cd122r"><p>✏️ <strong>Try it out!</strong> Use the best scores you computed earlier to show the five most likely answers. To check your results, go back to the first pipeline and pass in <code>top_k=5</code> when calling it.</p></blockquote> <h2 class="relative group"><a id="handling-long-contexts" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#handling-long-contexts"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Handling long contexts</span></h2> <p data-svelte-h="svelte-z5shyk">If we try to tokenize the question and the long context we used as an example previously, we get more tokens than the maximum length used in the <code>question-answering</code> pipeline (which is 384):</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex 
items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->inputs = tokenizer(question, long_context) | |
| <span class="hljs-built_in">print</span>(<span class="hljs-built_in">len</span>(inputs[<span class="hljs-string">"input_ids"</span>]))<!-- HTML_TAG_END --></pre></div> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-number">461</span><!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-oyxfnn">So, we need to truncate our inputs at that maximum length. There are several ways to do this, but we don't want to truncate the question, only the context. Since the context is the second sentence, we use the <code>"only_second"</code> truncation strategy. The problem that then arises is that the answer to the question may not be in the truncated context. 
Here, for instance, we picked a question whose answer is toward the end of the context, and once we truncate, that answer is no longer present:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->inputs = tokenizer(question, long_context, max_length=<span class="hljs-number">384</span>, truncation=<span class="hljs-string">"only_second"</span>) | |
| <span class="hljs-built_in">print</span>(tokenizer.decode(inputs[<span class="hljs-string">"input_ids"</span>]))<!-- HTML_TAG_END --></pre></div> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-string">""" | |
| [CLS] Which deep learning libraries back [UNK] Transformers? [SEP] [UNK] Transformers : State of the Art NLP | |
| [UNK] Transformers provides thousands of pretrained models to perform tasks on texts such as classification, information extraction, | |
| question answering, summarization, translation, text generation and more in over 100 languages. | |
| Its aim is to make cutting-edge NLP easier to use for everyone. | |
| [UNK] Transformers provides APIs to quickly download and use those pretrained models on a given text, fine-tune them on your own datasets and | |
| then share them with the community on our model hub. At the same time, each python module defining an architecture is fully standalone and | |
| can be modified to enable quick research experiments. | |
| Why should I use transformers? | |
| 1. Easy-to-use state-of-the-art models: | |
| - High performance on NLU and NLG tasks. | |
| - Low barrier to entry for educators and practitioners. | |
| - Few user-facing abstractions with just three classes to learn. | |
| - A unified API for using all our pretrained models. | |
| - Lower compute costs, smaller carbon footprint: | |
| 2. Researchers can share trained models instead of always retraining. | |
| - Practitioners can reduce compute time and production costs. | |
| - Dozens of architectures with over 10,000 pretrained models, some in more than 100 languages. | |
| 3. Choose the right framework for every part of a model's lifetime: | |
| - Train state-of-the-art models in 3 lines of code. | |
| - Move a single model between TF2.0/PyTorch frameworks at will. | |
| - Seamlessly pick the right framework for training, evaluation and production. | |
| 4. Easily customize a model or an example to your needs: | |
| - We provide examples for each architecture to reproduce the results published by its original authors. | |
| - Model internal [SEP] | |
| """</span><!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-8c578">This means the model will have a hard time picking the correct answer. To fix this, the <code>question-answering</code> pipeline lets us split the context into smaller chunks, specifying the maximum length. To make sure we don't split the context at exactly the wrong place, which would make the answer impossible to find, it also includes some overlap between the chunks.</p> <p data-svelte-h="svelte-y8j3z8">We can have the tokenizer (fast or slow) do this for us by adding <code>return_overflowing_tokens=True</code>, and we can specify the overlap we want with the <code>stride</code> argument. Here is an example, using a shorter sentence:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- 
HTML_TAG_START -->sentence = <span class="hljs-string">"This sentence is not too long but we are going to split it anyway."</span> | |
| inputs = tokenizer( | |
| sentence, truncation=<span class="hljs-literal">True</span>, return_overflowing_tokens=<span class="hljs-literal">True</span>, max_length=<span class="hljs-number">6</span>, stride=<span class="hljs-number">2</span> | |
| ) | |
| <span class="hljs-keyword">for</span> ids <span class="hljs-keyword">in</span> inputs[<span class="hljs-string">"input_ids"</span>]: | |
| <span class="hljs-built_in">print</span>(tokenizer.decode(ids))<!-- HTML_TAG_END --></pre></div> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-string">'[CLS] This sentence is not [SEP]'</span> | |
| <span class="hljs-string">'[CLS] is not too long [SEP]'</span> | |
| <span class="hljs-string">'[CLS] too long but we [SEP]'</span> | |
| <span class="hljs-string">'[CLS] but we are going [SEP]'</span> | |
| <span class="hljs-string">'[CLS] are going to split [SEP]'</span> | |
| <span class="hljs-string">'[CLS] to split it anyway [SEP]'</span> | |
| <span class="hljs-string">'[CLS] it anyway. [SEP]'</span><!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-727isk">As we can see, the sentence has been split into chunks in such a way that each entry in <code>inputs["input_ids"]</code> has at most 6 tokens (we would need to pad the last entry to make it the same size as the others), and there is an overlap of 2 tokens between consecutive entries.</p> <p data-svelte-h="svelte-jx4y7j">Let's take a closer look at the result of the tokenization:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-built_in">print</span>(inputs.keys())<!-- HTML_TAG_END --></pre></div> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer 
focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->dict_keys([<span class="hljs-string">'input_ids'</span>, <span class="hljs-string">'attention_mask'</span>, <span class="hljs-string">'overflow_to_sample_mapping'</span>])<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1o9oe9c">As expected, we get input IDs and an attention mask. The last key, <code>overflow_to_sample_mapping</code>, is a map that tells us which sentence each of the results corresponds to. Here we have 7 results that all come from the (only) sentence we passed to the tokenizer:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" 
focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-built_in">print</span>(inputs[<span class="hljs-string">"overflow_to_sample_mapping"</span>])<!-- HTML_TAG_END --></pre></div> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 
h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->[<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>]<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1q1isry">This is more useful when we tokenize several sentences together. For instance:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->sentences = [ | |
|     <span class="hljs-string">"This sentence is not too long but we are going to split it anyway."</span>, | |
|     <span class="hljs-string">"This sentence is shorter but will still get split."</span>, | |
| ] | |
| inputs = tokenizer( | |
|     sentences, truncation=<span class="hljs-literal">True</span>, return_overflowing_tokens=<span class="hljs-literal">True</span>, max_length=<span class="hljs-number">6</span>, stride=<span class="hljs-number">2</span> | |
| ) | |
| <span class="hljs-built_in">print</span>(inputs[<span class="hljs-string">"overflow_to_sample_mapping"</span>])<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1c72y1d">This gives us:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->[<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>]<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-8mqxw1">This means the first sentence is split into 7 chunks, as before, and the next 4 chunks come from the second sentence.</p> <p data-svelte-h="svelte-18867pw">Now let's go back to our long context. By default the <code>question-answering</code> pipeline uses a maximum length of 384, as we mentioned earlier, and a stride of 128, which correspond to the way the model was fine-tuned (you can adjust those parameters by passing the <code>max_seq_len</code> and <code>doc_stride</code> arguments when calling the pipeline). We will therefore use those parameters when tokenizing. We will also add padding (to get samples of the same length, so we can build tensors) and ask for the offsets:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->inputs = tokenizer( | |
|     question, | |
|     long_context, | |
|     stride=<span class="hljs-number">128</span>, | |
|     max_length=<span class="hljs-number">384</span>, | |
|     padding=<span class="hljs-string">"longest"</span>, | |
|     truncation=<span class="hljs-string">"only_second"</span>, | |
|     return_overflowing_tokens=<span class="hljs-literal">True</span>, | |
|     return_offsets_mapping=<span class="hljs-literal">True</span>, | |
| )<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-13xf1wt">Those <code>inputs</code> will contain the input IDs and attention masks the model expects, as well as the offsets and the <code>overflow_to_sample_mapping</code> we just talked about. Since those two are not parameters used by the model, we'll pop them out of <code>inputs</code> before converting it to tensors (and we won't store the map, since it's not useful here):</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->_ = inputs.pop(<span class="hljs-string">"overflow_to_sample_mapping"</span>) | |
| offsets = inputs.pop(<span class="hljs-string">"offset_mapping"</span>) | |
| inputs = inputs.convert_to_tensors(<span class="hljs-string">"pt"</span>) | |
| <span class="hljs-built_in">print</span>(inputs[<span class="hljs-string">"input_ids"</span>].shape)<!-- HTML_TAG_END --></pre></div> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->torch.Size([<span class="hljs-number">2</span>, <span class="hljs-number">384</span>])<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1f66pnh">Our long context was split in two, which means that after it goes through the model, we will have two sets of start and end logits:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" 
fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->outputs = model(**inputs) | |
| start_logits = outputs.start_logits | |
| end_logits = outputs.end_logits | |
| <span class="hljs-built_in">print</span>(start_logits.shape, end_logits.shape)<!-- HTML_TAG_END --></pre></div> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->torch.Size([<span class="hljs-number">2</span>, <span class="hljs-number">384</span>]) torch.Size([<span class="hljs-number">2</span>, <span class="hljs-number">384</span>])<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1uoj2dz">Like before, we first mask the tokens that are not part of the context before taking the softmax. We also mask all the padding tokens (as flagged by the attention mask):</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 
text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->sequence_ids = inputs.sequence_ids() | |
| <span class="hljs-comment"># Mask everything apart from the tokens of the context</span> | |
| mask = [i != <span class="hljs-number">1</span> <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> sequence_ids] | |
| <span class="hljs-comment"># Unmask the [CLS] token</span> | |
| mask[<span class="hljs-number">0</span>] = <span class="hljs-literal">False</span> | |
| <span class="hljs-comment"># Mask all the [PAD] tokens</span> | |
| mask = torch.logical_or(torch.tensor(mask)[<span class="hljs-literal">None</span>], (inputs[<span class="hljs-string">"attention_mask"</span>] == <span class="hljs-number">0</span>)) | |
| start_logits[mask] = -<span class="hljs-number">10000</span> | |
| end_logits[mask] = -<span class="hljs-number">10000</span><!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1aqiux2">Then we can use the softmax to convert our logits to probabilities:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->start_probabilities = torch.nn.functional.softmax(start_logits, dim=-<span class="hljs-number">1</span>) | |
| end_probabilities = torch.nn.functional.softmax(end_logits, dim=-<span class="hljs-number">1</span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-5sgrcl">The next step is similar to what we did for the small context, but we repeat it for each of our two chunks. We assign a score to every possible answer span, then take the span with the best score:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->candidates = [] | |
| <span class="hljs-keyword">for</span> start_probs, end_probs <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(start_probabilities, end_probabilities): | |
|     scores = start_probs[:, <span class="hljs-literal">None</span>] * end_probs[<span class="hljs-literal">None</span>, :] | |
|     idx = torch.triu(scores).argmax().item() | |
|     start_idx = idx // scores.shape[<span class="hljs-number">1</span>] | |
|     end_idx = idx % scores.shape[<span class="hljs-number">1</span>] | |
|     score = scores[start_idx, end_idx].item() | |
|     candidates.append((start_idx, end_idx, score)) | |
| <span class="hljs-built_in">print</span>(candidates)<!-- HTML_TAG_END --></pre></div> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->[(<span class="hljs-number">0</span>, <span class="hljs-number">18</span>, <span class="hljs-number">0.33867</span>), (<span class="hljs-number">173</span>, <span class="hljs-number">184</span>, <span class="hljs-number">0.97149</span>)]<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1kvxn87">Those two candidates correspond to the best answers the model was able to find in each chunk. The model is far more confident that the right answer is in the second part (which is a good sign!). Now we just have to map those two token spans to spans of characters in the context (we only need to map the second one to get our answer, but it's interesting to see what the model picked in the first chunk).</p> <blockquote class="tip" data-svelte-h="svelte-1c2et8e"><p>✏️ <strong>Try it out!</strong> Adapt the code above to return the scores and spans for the five most likely answers.</p></blockquote> <p data-svelte-h="svelte-o0nnah">The <code>offsets</code> we grabbed earlier is actually a list of offsets, with one list per chunk of text:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">for</span> candidate, offset <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(candidates, offsets): | |
|     start_token, end_token, score = candidate | |
|     start_char, _ = offset[start_token] | |
|     _, end_char = offset[end_token] | |
|     answer = long_context[start_char:end_char] | |
|     result = {<span class="hljs-string">"answer"</span>: answer, <span class="hljs-string">"start"</span>: start_char, <span class="hljs-string">"end"</span>: end_char, <span class="hljs-string">"score"</span>: score} | |
|     <span class="hljs-built_in">print</span>(result)<!-- HTML_TAG_END --></pre></div> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->{<span class="hljs-string">'answer'</span>: <span class="hljs-string">'\n🤗 Transformers: State of the Art NLP'</span>, <span class="hljs-string">'start'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'end'</span>: <span class="hljs-number">37</span>, <span class="hljs-string">'score'</span>: <span class="hljs-number">0.33867</span>} | |
| {<span class="hljs-string">'answer'</span>: <span class="hljs-string">'Jax, PyTorch and TensorFlow'</span>, <span class="hljs-string">'start'</span>: <span class="hljs-number">1892</span>, <span class="hljs-string">'end'</span>: <span class="hljs-number">1919</span>, <span class="hljs-string">'score'</span>: <span class="hljs-number">0.97149</span>}<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-6g3lc9">If we ignore the first result, we get the same result as our pipeline for this long context. Great!</p> <blockquote class="tip" data-svelte-h="svelte-sq1bnq"><p>✏️ <strong>Try it out!</strong> Use the best scores you computed before to show the five most likely answers (for the whole context, not each chunk). To check your results, go back to the first pipeline and pass in <code>top_k=5</code> when calling it.</p></blockquote> <p data-svelte-h="svelte-1kltyld">This concludes our deep dive into the tokenizer's capabilities. We will put all of this into practice again in the next chapter, when we show you how to fine-tune a model on a range of common NLP tasks.</p> <h2 class="relative group"><a id="ဝဟရ-ရငလငခက-glossary" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#ဝဟရ-ရငလငခက-glossary"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 
0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Glossary</span></h2> <ul data-svelte-h="svelte-om79ig"><li><strong>QA Pipeline (Question-Answering Pipeline)</strong>: The <code>pipeline()</code> function from the 🤗 Transformers library, set up to find the answer to a question within a text document.</li>
<li><strong>Fast Tokenizers</strong>: Tokenizers implemented in Rust, which are much faster than the Python-based “slow” tokenizers.</li>
<li><strong>Offsets</strong>: A map that gives, for each token, the start and end character indices it spans in the original text.</li>
<li><strong>Context</strong>: The passage of text provided for answering a question.</li>
<li><strong>Truncated</strong>: Cut short to reduce its length.</li>
<li><strong><code>pipeline()</code> Function</strong>: A function in the Hugging Face Transformers library that makes models easy to use for specific tasks (for example, text classification or text generation).</li>
<li><strong>Deep Learning Libraries</strong>: Tools and libraries for building, training, and using deep learning models (for example, Jax, PyTorch, TensorFlow).</li>
<li><strong>Jax</strong>: A high-performance numerical computation library from Google.</li>
<li><strong>PyTorch</strong>: An open-source machine learning library created by Facebook (now Meta), used to build deep learning models.</li>
<li><strong>TensorFlow</strong>: An open-source machine learning library created by Google, used to build deep learning models.</li>
<li><strong>Seamless Integration</strong>: The ability of different components to work together smoothly.</li>
<li><strong>Inference</strong>: The process of using a trained Artificial Intelligence (AI) model to produce predictions or outputs from input data.</li>
<li><strong>Model</strong>: In Artificial Intelligence (AI), a mathematical structure designed to learn from data and make predictions.</li>
<li><strong>Truncate (Text)</strong>: Cutting a text sequence down to a specified length.</li>
<li><strong>Maximum Length</strong>: The longest input sequence a model can accept.</li>
<li><strong>Long Contexts</strong>: Very long passages of text.</li>
<li><strong><code>distilbert-base-cased-distilled-squad</code></strong>: The Hugging Face Hub ID of a cased DistilBERT model fine-tuned on the SQuAD dataset.</li>
<li><strong>SQuAD Dataset (Stanford Question Answering Dataset)</strong>: A well-known dataset for question answering.</li>
<li><strong>Fine-tuned</strong>: Describes a pretrained model that has been trained further for a specific task, usually on a comparatively small amount of data.</li>
<li><strong><code>AutoTokenizer</code></strong>: A class in the Hugging Face Transformers library that automatically loads the appropriate tokenizer from a model name.</li>
<li><strong><code>AutoModelForQuestionAnswering</code></strong>: A class from the Hugging Face Transformers library that automatically loads the model class for the question answering task.</li>
<li><strong><code>TFAutoModelForQuestionAnswering</code></strong>: The TensorFlow counterpart: a class that automatically loads the model class for the question answering task.</li>
<li><strong><code>model_checkpoint</code></strong>: The ID of the pretrained model.</li>
<li><strong><code>return_tensors="pt"</code> / <code>"tf"</code></strong>: Tells the tokenizer to return its output tensors in PyTorch (<code>"pt"</code>) or TensorFlow (<code>"tf"</code>) format.</li>
<li><strong>Tokens</strong>: The basic units a text is split into.</li>
<li><strong>Logits</strong>: The raw, unnormalized scores output by the model.</li>
<li><strong>Start Logits</strong>: The logits the model outputs to predict the position (index) of the token where the answer starts.</li>
<li><strong>End Logits</strong>: The logits the model outputs to predict the position (index) of the token where the answer ends.</li>
<li><strong>Tensor</strong>: A multi-dimensional array used to represent data in machine learning frameworks (PyTorch, TensorFlow).</li>
<li><strong><code>[CLS]</code> Token</strong>: The special token that marks the start of a sequence in BERT models.</li>
<li><strong><code>[SEP]</code> Token</strong>: The special token used in BERT models to mark the end of a sentence or to separate two sentences.</li>
<li><strong>Mask (Logits)</strong>: Replacing the logits at unwanted positions with a very large negative number so that their probability becomes nearly zero.</li>
<li><strong>Softmax Function</strong>: A mathematical function that converts a set of numbers (logits) into a probability distribution (values that sum to 1).</li>
<li><strong>Probabilities</strong>: Values that express how likely each outcome is.</li>
<li><strong>Argmax</strong>: A function that returns the index of the largest value in an array.</li>
<li><strong><code>start_index</code></strong>: The index of the token where the answer starts.</li>
<li><strong><code>end_index</code></strong>: The index of the token where the answer ends.</li>
<li><strong><code>scores = start_probabilities[:, None] * end_probabilities[None, :]</code></strong>: Computes a score matrix for every possible answer span by multiplying the start and end probabilities. <code>[:, None]</code> and <code>[None, :]</code> each add a dimension so that broadcasting applies.</li>
<li><strong><code>torch.triu()</code> / <code>np.triu()</code></strong>: A PyTorch/NumPy function that returns the upper triangular part of a matrix. It is used to set the scores where <code>start_index > end_index</code> to <code>0</code>.</li>
<li><strong><code>argmax()</code></strong>: A method that returns the index of the largest value in an array.</li>
<li><strong><code>item()</code> Method</strong>: A method that converts the single element of a PyTorch tensor or NumPy array into a standard Python type.</li>
<li><strong>Floor Division (<code>//</code>)</strong>: Division that discards the remainder, rounding the result down to an integer.</li>
<li><strong>Modulus (<code>%</code>)</strong>: Returns the remainder left after division.</li>
<li><strong><code>top_k</code> Argument</strong>: An argument passed when calling a pipeline created with <code>pipeline()</code>, asking for the <code>k</code> best results.</li>
<li><strong><code>return_offsets_mapping=True</code></strong>: Tells the tokenizer to include the offset mapping in its output.</li>
<li><strong><code>return_overflowing_tokens=True</code></strong>: Tells the tokenizer to return the tokens that exceed the maximum length as separate sequences.</li>
<li><strong><code>stride</code> Argument</strong>: Used together with <code>return_overflowing_tokens=True</code>; sets how many tokens consecutive chunks overlap by.</li>
<li><strong><code>truncation="only_second"</code></strong>: Tells the tokenizer to truncate only the second sequence of the input pair (here, the context).</li>
<li><strong>Padding</strong>: Adding filler tokens (padding tokens) so that all sequences have the same length.</li>
<li><strong><code>padding="longest"</code></strong>: Tells the tokenizer to pad up to the length of the longest sequence in the batch.</li>
<li><strong><code>overflow_to_sample_mapping</code></strong>: A mapping returned in the tokenizer output that indicates which original sample (input sentence) each overflowing token sequence came from.</li></ul> <p></p> | |
| <script> | |
| { | |
| __sveltekit_tyugt6 = { | |
| assets: "/docs/course/pr_1114/my", | |
| base: "/docs/course/pr_1114/my", | |
| env: {} | |
| }; | |
| const element = document.currentScript.parentElement; | |
| const data = [null,null]; | |
| Promise.all([ | |
| import("/docs/course/pr_1114/my/_app/immutable/entry/start.14794ee9.js"), | |
| import("/docs/course/pr_1114/my/_app/immutable/entry/app.a133f5c6.js") | |
| ]).then(([kit, app]) => { | |
| kit.start(app, element, { | |
| node_ids: [0, 48], | |
| data, | |
| form: null, | |
| error: null | |
| }); | |
| }); | |
| } | |
| </script> | |