| <meta charset="utf-8" /><meta name="hf:doc:metadata" content='{"title":"Document Question Answering","local":"document_question_answering","sections":[{"title":"Load the data","local":"load-the-data","sections":[],"depth":2},{"title":"Preprocess the data","local":"preprocess-the-data","sections":[{"title":"Preprocessing document images","local":"preprocessing-document-images","sections":[],"depth":3},{"title":"Preprocessing text data","local":"preprocessing-text-data","sections":[],"depth":3}],"depth":2},{"title":"Evaluation","local":"evaluation","sections":[],"depth":2},{"title":"Train","local":"train","sections":[],"depth":2},{"title":"Inference","local":"inference","sections":[],"depth":2}],"depth":1}'> | |
| <link href="/docs/transformers/main/ko/_app/immutable/assets/0.e3b0c442.css" rel="modulepreload"> | |
| <link rel="modulepreload" href="/docs/transformers/main/ko/_app/immutable/entry/start.9aa88961.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/ko/_app/immutable/chunks/scheduler.9bc65507.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/ko/_app/immutable/chunks/singletons.9eec45c3.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/ko/_app/immutable/chunks/index.3b203c72.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/ko/_app/immutable/chunks/paths.566078f7.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/ko/_app/immutable/entry/app.84fb67c3.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/ko/_app/immutable/chunks/index.707bf1b6.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/ko/_app/immutable/nodes/0.1c99376b.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/ko/_app/immutable/chunks/each.e59479a4.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/ko/_app/immutable/nodes/63.de1521a8.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/ko/_app/immutable/chunks/Tip.c2ecdbf4.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/ko/_app/immutable/chunks/CodeBlock.54a9f38d.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/ko/_app/immutable/chunks/DocNotebookDropdown.41f65cb5.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/ko/_app/immutable/chunks/globals.7f7f1b26.js"> | |
| <link rel="modulepreload" href="/docs/transformers/main/ko/_app/immutable/chunks/EditOnGithub.922df6ba.js"><!-- HEAD_svelte-u9bgzb_START --><!-- HEAD_svelte-u9bgzb_END --> <p></p> <h1 class="relative group"><a id="document_question_answering" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#document_question_answering"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Document Question Answering</span></h1> <p data-svelte-h="svelte-oqnn9h">Document Question Answering, also called Document Visual Question Answering, is the task of answering questions about document images. | |
| The input to a model that supports this task is typically an image paired with a question, and the output is an answer in natural language. These models draw on multiple modalities, including text, word positions (bounding boxes), and the image itself.</p> <p data-svelte-h="svelte-k9bbb9">This guide shows how to:</p> <ul data-svelte-h="svelte-1t9y1pd"><li>Fine-tune <a href="../model_doc/layoutlmv2">LayoutLMv2</a> on the <a href="https://huggingface.co/datasets/nielsr/docvqa_1200_examples_donut" rel="nofollow">DocVQA dataset</a>.</li> <li>Use the fine-tuned model for inference.</li></ul> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-17cmma8">To see all architectures and checkpoints compatible with this task, check the <a href="https://huggingface.co/tasks/image-to-text" rel="nofollow">task page</a>.</p></div> <p data-svelte-h="svelte-1t45j8o">LayoutLMv2 solves the document question answering task by adding a question answering head on top of the tokens' final hidden states to predict the positions of the answer's start and end tokens. In other words, it treats the problem as extractive question answering: given a context, extract the piece of information that answers the question. | |
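</p> <p>The start/end prediction described above can be illustrated with a small plain-Python sketch. It is only an illustration, not the library's decoding code: the helper name <code>best_answer_span</code>, the <code>max_answer_len</code> cap, the tokens, and the logit values are all made up for this example.</p>

```python
def best_answer_span(start_logits, end_logits, max_answer_len=30):
    """Return the (start, end) token indices with the highest combined score.

    Hypothetical helper: it only considers spans where start <= end and
    the span is at most max_answer_len tokens long.
    """
    best = (0, 0)
    best_score = float("-inf")
    for start, start_score in enumerate(start_logits):
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = start_score + end_logits[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


# Made-up tokens and scores standing in for the model's output.
tokens = ["[CLS]", "who", "signed", "?", "[SEP]", "signed", "by", "jane", "doe", "[SEP]"]
start_logits = [0.1, 0.0, 0.2, 0.0, 0.1, 0.3, 0.2, 4.0, 1.0, 0.0]
end_logits = [0.0, 0.1, 0.0, 0.1, 0.0, 0.2, 0.1, 1.5, 3.5, 0.2]

start, end = best_answer_span(start_logits, end_logits)
answer = " ".join(tokens[start : end + 1])
print(start, end, answer)  # 7 8 jane doe
```

<p>The answer is never generated freely; it is always a contiguous span copied from the context tokens, which is why the quality of the OCR output matters.</p> <p>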
| The context comes from the output of an OCR engine; this guide uses Google's Tesseract.</p> <p data-svelte-h="svelte-ddzvb0">Before you begin, make sure all the necessary libraries are installed. LayoutLMv2 depends on detectron2, torchvision, and Tesseract.</p> <div class="code-block relative"><pre class=""><!-- HTML_TAG_START -->pip install -q transformers datasets<!-- HTML_TAG_END --></pre></div> <div class="code-block relative"><pre class=""><!-- HTML_TAG_START -->pip install <span class="hljs-string">'git+https://github.com/facebookresearch/detectron2.git'</span> | |
| pip install torchvision<!-- HTML_TAG_END --></pre></div> <div class="code-block relative"><pre class=""><!-- HTML_TAG_START -->sudo apt install tesseract-ocr | |
| pip install -q pytesseract<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1uvmu64">Once all the dependencies are installed, restart your runtime.</p> <p data-svelte-h="svelte-xyvcw8">We encourage you to share your model with the community. Log in to your Hugging Face account to upload it to the 🤗 Hub. | |
| When prompted, enter your token to log in:</p> <div class="code-block relative"><pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> huggingface_hub <span class="hljs-keyword">import</span> notebook_login | |
| <span class="hljs-meta">>>> </span>notebook_login()<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-13bifrs">Let's define some global variables.</p> <div class="code-block relative"><pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>model_checkpoint = <span class="hljs-string">"microsoft/layoutlmv2-base-uncased"</span> | |
| <span class="hljs-meta">>>> </span>batch_size = <span class="hljs-number">4</span><!-- HTML_TAG_END --></pre></div> <h2 class="relative group"><a id="load-the-data" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#load-the-data"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Load the data</span></h2> <p data-svelte-h="svelte-12y9xdp">This guide uses a small sample of preprocessed DocVQA that is available on the 🤗 Hub. | |
| To use the full DocVQA dataset instead, register on the <a href="https://rrc.cvc.uab.es/?ch=17" rel="nofollow">DocVQA homepage</a> and download it. If you do, see <a href="https://huggingface.co/docs/datasets/loading#local-and-remote-files" rel="nofollow">how to load files into a 🤗 dataset</a> before continuing with this guide.</p> <div class="code-block relative"><pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> datasets <span class="hljs-keyword">import</span> load_dataset | |
| <span class="hljs-meta">>>> </span>dataset = load_dataset(<span class="hljs-string">"nielsr/docvqa_1200_examples"</span>) | |
| <span class="hljs-meta">>>> </span>dataset | |
| DatasetDict({ | |
| train: Dataset({ | |
| features: [<span class="hljs-string">'id'</span>, <span class="hljs-string">'image'</span>, <span class="hljs-string">'query'</span>, <span class="hljs-string">'answers'</span>, <span class="hljs-string">'words'</span>, <span class="hljs-string">'bounding_boxes'</span>, <span class="hljs-string">'answer'</span>], | |
| num_rows: <span class="hljs-number">1000</span> | |
| }) | |
| test: Dataset({ | |
| features: [<span class="hljs-string">'id'</span>, <span class="hljs-string">'image'</span>, <span class="hljs-string">'query'</span>, <span class="hljs-string">'answers'</span>, <span class="hljs-string">'words'</span>, <span class="hljs-string">'bounding_boxes'</span>, <span class="hljs-string">'answer'</span>], | |
| num_rows: <span class="hljs-number">200</span> | |
| }) | |
| })<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-9rg4tz">As you can see, the dataset is already split into train and test sets. Take a look at a random example to familiarize yourself with the features.</p> <div class="code-block relative"><pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>dataset[<span class="hljs-string">"train"</span>].features<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-zq0ej4">Here is what each field represents:</p> <ul data-svelte-h="svelte-g2ws24"><li><code>id</code>: the example's id</li> <li><code>image</code>: a PIL.Image.Image object containing the document image</li> <li><code>query</code>: the question, a natural-language string provided in several languages</li> <li><code>answers</code>: a list of correct answers provided by human annotators</li> <li><code>words</code> and <code>bounding_boxes</code>: the OCR results, which this guide does not use</li> <li><code>answer</code>: an answer matched by a different model, which this guide does not use</li></ul> <p data-svelte-h="svelte-xeapog">Let's keep only the English questions and drop the <code>answer</code> feature, which contains predictions from another model. | |
| We'll also take the first answer from the set provided by the annotators. Alternatively, you can sample one at random.</p> <div class="code-block relative"><pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>updated_dataset = dataset.<span class="hljs-built_in">map</span>(<span class="hljs-keyword">lambda</span> example: {<span class="hljs-string">"question"</span>: example[<span class="hljs-string">"query"</span>][<span class="hljs-string">"en"</span>]}, remove_columns=[<span class="hljs-string">"query"</span>]) | |
| <span class="hljs-meta">>>> </span>updated_dataset = updated_dataset.<span class="hljs-built_in">map</span>( | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">lambda</span> example: {<span class="hljs-string">"answer"</span>: example[<span class="hljs-string">"answers"</span>][<span class="hljs-number">0</span>]}, remove_columns=[<span class="hljs-string">"answer"</span>, <span class="hljs-string">"answers"</span>] | |
| <span class="hljs-meta">... </span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-m4zkx9">The LayoutLMv2 checkpoint used in this guide was trained with <code>max_position_embeddings = 512</code> (you can find this in the <a href="https://huggingface.co/microsoft/layoutlmv2-base-uncased/blob/main/config.json#L18" rel="nofollow">checkpoint's <code>config.json</code> file</a>). | |
| We could simply truncate the examples, but an answer located near the end of a long document would then be cut off. To avoid that, here we remove the few examples whose embedding is likely to end up longer than 512. | |
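</p> <p>When dropping long examples is not acceptable, one alternative is to split a long document into overlapping windows, so that an answer near the end still appears whole in at least one window. The following is a minimal plain-Python sketch of that idea only; the function name <code>sliding_windows</code> and the <code>max_len</code> and <code>stride</code> values are illustrative assumptions, not part of the guide's pipeline.</p>

```python
def sliding_windows(tokens, max_len=512, stride=128):
    """Split tokens into windows of at most max_len items.

    Consecutive windows share `stride` tokens, so content that straddles
    a window boundary still shows up intact in the following window.
    Hypothetical helper for illustration only.
    """
    if stride >= max_len:
        raise ValueError("stride must be smaller than max_len")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start : start + max_len])
        if start + max_len >= len(tokens):
            break  # the last window reached the end of the document
        start += max_len - stride
    return windows


doc_tokens = list(range(1000))  # stand-in for 1000 token ids
windows = sliding_windows(doc_tokens)
print([len(w) for w in windows])  # [512, 512, 232]
```

<p>Each window would then be encoded and scored separately, and the best-scoring answer across windows kept.</p> <p>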
| If most of the documents in your dataset are long, you can implement a sliding window strategy instead; see this <a href="https://github.com/huggingface/notebooks/blob/main/examples/question_answering.ipynb" rel="nofollow">notebook</a> for details.</p> <div class="code-block relative"><pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>updated_dataset = updated_dataset.<span class="hljs-built_in">filter</span>(<span class="hljs-keyword">lambda</span> x: <span class="hljs-built_in">len</span>(x[<span class="hljs-string">"words"</span>]) + <span class="hljs-built_in">len</span>(x[<span class="hljs-string">"question"</span>].split()) < <span class="hljs-number">512</span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1xnig6m">At this point, let's also remove the OCR features from this dataset. They were produced for fine-tuning a different model and do not match the input requirements of the model used in this guide, so using them would require extra processing. | |
| Instead, we can run <code>LayoutLMv2Processor</code> on the original data to perform both OCR and tokenization. | |
| This gives us inputs that match what the model expects. | |
| If you want to process the images manually, see the <a href="../model_doc/layoutlmv2"><code>LayoutLMv2</code> model documentation</a> to learn which input format the model expects.</p> <div class="code-block relative"><pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>updated_dataset = updated_dataset.remove_columns(<span class="hljs-string">"words"</span>) | |
| <span class="hljs-meta">>>> </span>updated_dataset = updated_dataset.remove_columns(<span class="hljs-string">"bounding_boxes"</span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-lp4iy7">Finally, to round off the data exploration, let's look at an example image.</p> <div class="code-block relative"><pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>updated_dataset[<span class="hljs-string">"train"</span>][<span class="hljs-number">11</span>][<span class="hljs-string">"image"</span>]<!-- HTML_TAG_END --></pre></div> <div class="flex justify-center" data-svelte-h="svelte-q63tj1"><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/docvqa_example.jpg" alt="DocVQA Image Example"></div> <h2 class="relative group"><a id="preprocess-the-data" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#preprocess-the-data"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Preprocess the data</span></h2> <p data-svelte-h="svelte-1y93l9d">Document question answering is a multimodal task, so you need to make sure the inputs from each modality are preprocessed according to the model's expectations. | |
| Let's start by loading the <code>LayoutLMv2Processor</code>, which combines an image processor that handles image data with a tokenizer that encodes text data.</p> <div class="code-block relative"><pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> AutoProcessor | |
| <span class="hljs-meta">>>> </span>processor = AutoProcessor.from_pretrained(model_checkpoint)<!-- HTML_TAG_END --></pre></div> <h3 class="relative group"><a id="preprocessing-document-images" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#preprocessing-document-images"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Preprocessing document images</span></h3> <p data-svelte-h="svelte-1kcsbqg">First, let's prepare the document images for the model with the processor's <code>image_processor</code>. | |
| By default, the image processor resizes images to 224x224, makes sure the color channels are in the correct order, and applies OCR with Tesseract to obtain words and normalized bounding boxes. | |
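</p> <p>To make "normalized bounding boxes" concrete: LayoutLMv2 expects box coordinates rescaled to a 0-1000 range that is independent of the image size. The sketch below shows that rescaling in plain Python. The helper name <code>normalize_box</code> and the sample numbers are assumptions for illustration; the image processor performs the equivalent step internally, so you do not need to call anything like this yourself.</p>

```python
def normalize_box(box, width, height):
    """Rescale a pixel-space box (x0, y0, x1, y1) to the 0-1000 range.

    Hypothetical helper; width and height are the image size in pixels.
    """
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]


# A made-up word box on an 800x1000 pixel page.
normalized = normalize_box((100, 50, 300, 80), width=800, height=1000)
print(normalized)  # [125, 50, 375, 80]
```

<p>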
| These defaults are exactly what we need in this tutorial. Write a function that applies the default image processing to a batch of images and returns the OCR results.</p> <div class="code-block relative"><pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>image_processor = processor.image_processor | |
| <span class="hljs-meta">>>> </span><span class="hljs-keyword">def</span> <span class="hljs-title function_">get_ocr_words_and_boxes</span>(<span class="hljs-params">examples</span>): | |
| <span class="hljs-meta">... </span> images = [image.convert(<span class="hljs-string">"RGB"</span>) <span class="hljs-keyword">for</span> image <span class="hljs-keyword">in</span> examples[<span class="hljs-string">"image"</span>]] | |
| <span class="hljs-meta">... </span> encoded_inputs = image_processor(images) | |
| <span class="hljs-meta">... </span> examples[<span class="hljs-string">"image"</span>] = encoded_inputs.pixel_values | |
| <span class="hljs-meta">... </span> examples[<span class="hljs-string">"words"</span>] = encoded_inputs.words | |
| <span class="hljs-meta">... </span> examples[<span class="hljs-string">"boxes"</span>] = encoded_inputs.boxes | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">return</span> examples<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1rksgl2">To apply this preprocessing to the entire dataset quickly, use <code>map</code>.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>dataset_with_ocr = updated_dataset.<span class="hljs-built_in">map</span>(get_ocr_words_and_boxes, batched=<span class="hljs-literal">True</span>, batch_size=<span class="hljs-number">2</span>)<!-- HTML_TAG_END --></pre></div> <h3 class="relative group"><a id="preprocessing-text-data" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#preprocessing-text-data"><span><svg class="" xmlns="http://www.w3.org/2000/svg" 
xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Preprocessing text data</span></h3> <p data-svelte-h="svelte-1l26czp">Once OCR has been applied to the images, we need to encode the text part of the dataset to prepare it for the model. | |
| This involves converting the words and boxes obtained in the previous step into token-level <code>input_ids</code>, <code>attention_mask</code>, <code>token_type_ids</code>, and <code>bbox</code>. | |
| To preprocess the text, we need the processor's <code>tokenizer</code>.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>tokenizer = processor.tokenizer<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1yiqpne">In addition to the preprocessing described above, we also need to add labels for the model. For <code>xxxForQuestionAnswering</code> models in 🤗 Transformers, the labels consist of <code>start_positions</code> and <code>end_positions</code>, which indicate the tokens at the start and end of the answer.</p> <p data-svelte-h="svelte-rkm9nf">To add the labels, first define a helper function that finds a sublist (the answer split into words) within a larger list (the list of words).</p> <p data-svelte-h="svelte-1pxhw5r">This function takes two lists as input, <code>words_list</code> and <code>answer_list</code>. | |
| It then iterates over <code>words_list</code> and checks whether the current word (words_list[i]) equals the first word of <code>answer_list</code> (answer_list[0]) | |
| and whether the sublist of <code>words_list</code> that starts at the current word and has the same length as <code>answer_list</code> equals <code>answer_list</code>. | |
| If both conditions hold, a match has been found, and the function records the match, its start index (idx), and its end index (idx + len(answer_list) - 1). If more than one match is found, the function returns only the first one. If no match is found, the function returns (<code>None</code>, 0, 0).</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span><span class="hljs-keyword">def</span> <span class="hljs-title function_">subfinder</span>(<span class="hljs-params">words_list, answer_list</span>): | |
| <span class="hljs-meta">... </span> matches = [] | |
| <span class="hljs-meta">... </span> start_indices = [] | |
| <span class="hljs-meta">... </span> end_indices = [] | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">for</span> idx, i <span class="hljs-keyword">in</span> <span class="hljs-built_in">enumerate</span>(<span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(words_list))): | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">if</span> words_list[i] == answer_list[<span class="hljs-number">0</span>] <span class="hljs-keyword">and</span> words_list[i : i + <span class="hljs-built_in">len</span>(answer_list)] == answer_list: | |
| <span class="hljs-meta">... </span> matches.append(answer_list) | |
| <span class="hljs-meta">... </span> start_indices.append(idx) | |
| <span class="hljs-meta">... </span> end_indices.append(idx + <span class="hljs-built_in">len</span>(answer_list) - <span class="hljs-number">1</span>) | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">if</span> matches: | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">return</span> matches[<span class="hljs-number">0</span>], start_indices[<span class="hljs-number">0</span>], end_indices[<span class="hljs-number">0</span>] | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">else</span>: | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span><!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-mdy9jm">To illustrate how this function finds the position of the answer, let's use it on an example:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>example = dataset_with_ocr[<span class="hljs-string">"train"</span>][<span class="hljs-number">1</span>] | |
| <span class="hljs-meta">>>> </span>words = [word.lower() <span class="hljs-keyword">for</span> word <span class="hljs-keyword">in</span> example[<span class="hljs-string">"words"</span>]] | |
| <span class="hljs-meta">>>> </span><span class="hljs-keyword">match</span>, word_idx_start, word_idx_end = subfinder(words, example[<span class="hljs-string">"answer"</span>].lower().split()) | |
| <span class="hljs-meta">>>> </span><span class="hljs-built_in">print</span>(<span class="hljs-string">"Question: "</span>, example[<span class="hljs-string">"question"</span>]) | |
| <span class="hljs-meta">>>> </span><span class="hljs-built_in">print</span>(<span class="hljs-string">"Words:"</span>, words) | |
| <span class="hljs-meta">>>> </span><span class="hljs-built_in">print</span>(<span class="hljs-string">"Answer: "</span>, example[<span class="hljs-string">"answer"</span>]) | |
| <span class="hljs-meta">>>> </span><span class="hljs-built_in">print</span>(<span class="hljs-string">"start_index"</span>, word_idx_start) | |
| <span class="hljs-meta">>>> </span><span class="hljs-built_in">print</span>(<span class="hljs-string">"end_index"</span>, word_idx_end) | |
| Question: Who <span class="hljs-keyword">is</span> <span class="hljs-keyword">in</span> cc <span class="hljs-keyword">in</span> this letter? | |
| Words: [<span class="hljs-string">'wie'</span>, <span class="hljs-string">'baw'</span>, <span class="hljs-string">'brown'</span>, <span class="hljs-string">'&'</span>, <span class="hljs-string">'williamson'</span>, <span class="hljs-string">'tobacco'</span>, <span class="hljs-string">'corporation'</span>, <span class="hljs-string">'research'</span>, <span class="hljs-string">'&'</span>, <span class="hljs-string">'development'</span>, <span class="hljs-string">'internal'</span>, <span class="hljs-string">'correspondence'</span>, <span class="hljs-string">'to:'</span>, <span class="hljs-string">'r.'</span>, <span class="hljs-string">'h.'</span>, <span class="hljs-string">'honeycutt'</span>, <span class="hljs-string">'ce:'</span>, <span class="hljs-string">'t.f.'</span>, <span class="hljs-string">'riehl'</span>, <span class="hljs-string">'from:'</span>, <span class="hljs-string">'.'</span>, <span class="hljs-string">'c.j.'</span>, <span class="hljs-string">'cook'</span>, <span class="hljs-string">'date:'</span>, <span class="hljs-string">'may'</span>, <span class="hljs-string">'8,'</span>, <span class="hljs-string">'1995'</span>, <span class="hljs-string">'subject:'</span>, <span class="hljs-string">'review'</span>, <span class="hljs-string">'of'</span>, <span class="hljs-string">'existing'</span>, <span class="hljs-string">'brainstorming'</span>, <span class="hljs-string">'ideas/483'</span>, <span class="hljs-string">'the'</span>, <span class="hljs-string">'major'</span>, <span class="hljs-string">'function'</span>, <span class="hljs-string">'of'</span>, <span class="hljs-string">'the'</span>, <span class="hljs-string">'product'</span>, <span class="hljs-string">'innovation'</span>, <span class="hljs-string">'graup'</span>, <span class="hljs-string">'is'</span>, <span class="hljs-string">'to'</span>, <span class="hljs-string">'develop'</span>, <span class="hljs-string">'marketable'</span>, <span class="hljs-string">'nove!'</span>, <span 
class="hljs-string">'products'</span>, <span class="hljs-string">'that'</span>, <span class="hljs-string">'would'</span>, <span class="hljs-string">'be'</span>, <span class="hljs-string">'profitable'</span>, <span class="hljs-string">'to'</span>, <span class="hljs-string">'manufacture'</span>, <span class="hljs-string">'and'</span>, <span class="hljs-string">'sell.'</span>, <span class="hljs-string">'novel'</span>, <span class="hljs-string">'is'</span>, <span class="hljs-string">'defined'</span>, <span class="hljs-string">'as:'</span>, <span class="hljs-string">'of'</span>, <span class="hljs-string">'a'</span>, <span class="hljs-string">'new'</span>, <span class="hljs-string">'kind,'</span>, <span class="hljs-string">'or'</span>, <span class="hljs-string">'different'</span>, <span class="hljs-string">'from'</span>, <span class="hljs-string">'anything'</span>, <span class="hljs-string">'seen'</span>, <span class="hljs-string">'or'</span>, <span class="hljs-string">'known'</span>, <span class="hljs-string">'before.'</span>, <span class="hljs-string">'innovation'</span>, <span class="hljs-string">'is'</span>, <span class="hljs-string">'defined'</span>, <span class="hljs-string">'as:'</span>, <span class="hljs-string">'something'</span>, <span class="hljs-string">'new'</span>, <span class="hljs-string">'or'</span>, <span class="hljs-string">'different'</span>, <span class="hljs-string">'introduced;'</span>, <span class="hljs-string">'act'</span>, <span class="hljs-string">'of'</span>, <span class="hljs-string">'innovating;'</span>, <span class="hljs-string">'introduction'</span>, <span class="hljs-string">'of'</span>, <span class="hljs-string">'new'</span>, <span class="hljs-string">'things'</span>, <span class="hljs-string">'or'</span>, <span class="hljs-string">'methods.'</span>, <span class="hljs-string">'the'</span>, <span class="hljs-string">'products'</span>, <span class="hljs-string">'may'</span>, <span class="hljs-string">'incorporate'</span>, <span 
class="hljs-string">'the'</span>, <span class="hljs-string">'latest'</span>, <span class="hljs-string">'technologies,'</span>, <span class="hljs-string">'materials'</span>, <span class="hljs-string">'and'</span>, <span class="hljs-string">'know-how'</span>, <span class="hljs-string">'available'</span>, <span class="hljs-string">'to'</span>, <span class="hljs-string">'give'</span>, <span class="hljs-string">'then'</span>, <span class="hljs-string">'a'</span>, <span class="hljs-string">'unique'</span>, <span class="hljs-string">'taste'</span>, <span class="hljs-string">'or'</span>, <span class="hljs-string">'look.'</span>, <span class="hljs-string">'the'</span>, <span class="hljs-string">'first'</span>, <span class="hljs-string">'task'</span>, <span class="hljs-string">'of'</span>, <span class="hljs-string">'the'</span>, <span class="hljs-string">'product'</span>, <span class="hljs-string">'innovation'</span>, <span class="hljs-string">'group'</span>, <span class="hljs-string">'was'</span>, <span class="hljs-string">'to'</span>, <span class="hljs-string">'assemble,'</span>, <span class="hljs-string">'review'</span>, <span class="hljs-string">'and'</span>, <span class="hljs-string">'categorize'</span>, <span class="hljs-string">'a'</span>, <span class="hljs-string">'list'</span>, <span class="hljs-string">'of'</span>, <span class="hljs-string">'existing'</span>, <span class="hljs-string">'brainstorming'</span>, <span class="hljs-string">'ideas.'</span>, <span class="hljs-string">'ideas'</span>, <span class="hljs-string">'were'</span>, <span class="hljs-string">'grouped'</span>, <span class="hljs-string">'into'</span>, <span class="hljs-string">'two'</span>, <span class="hljs-string">'major'</span>, <span class="hljs-string">'categories'</span>, <span class="hljs-string">'labeled'</span>, <span class="hljs-string">'appearance'</span>, <span class="hljs-string">'and'</span>, <span class="hljs-string">'taste/aroma.'</span>, <span class="hljs-string">'these'</span>, <span 
class="hljs-string">'categories'</span>, <span class="hljs-string">'are'</span>, <span class="hljs-string">'used'</span>, <span class="hljs-string">'for'</span>, <span class="hljs-string">'novel'</span>, <span class="hljs-string">'products'</span>, <span class="hljs-string">'that'</span>, <span class="hljs-string">'may'</span>, <span class="hljs-string">'differ'</span>, <span class="hljs-string">'from'</span>, <span class="hljs-string">'a'</span>, <span class="hljs-string">'visual'</span>, <span class="hljs-string">'and/or'</span>, <span class="hljs-string">'taste/aroma'</span>, <span class="hljs-string">'point'</span>, <span class="hljs-string">'of'</span>, <span class="hljs-string">'view'</span>, <span class="hljs-string">'compared'</span>, <span class="hljs-string">'to'</span>, <span class="hljs-string">'canventional'</span>, <span class="hljs-string">'cigarettes.'</span>, <span class="hljs-string">'other'</span>, <span class="hljs-string">'categories'</span>, <span class="hljs-string">'include'</span>, <span class="hljs-string">'a'</span>, <span class="hljs-string">'combination'</span>, <span class="hljs-string">'of'</span>, <span class="hljs-string">'the'</span>, <span class="hljs-string">'above,'</span>, <span class="hljs-string">'filters,'</span>, <span class="hljs-string">'packaging'</span>, <span class="hljs-string">'and'</span>, <span class="hljs-string">'brand'</span>, <span class="hljs-string">'extensions.'</span>, <span class="hljs-string">'appearance'</span>, <span class="hljs-string">'this'</span>, <span class="hljs-string">'category'</span>, <span class="hljs-string">'is'</span>, <span class="hljs-string">'used'</span>, <span class="hljs-string">'for'</span>, <span class="hljs-string">'novel'</span>, <span class="hljs-string">'cigarette'</span>, <span class="hljs-string">'constructions'</span>, <span class="hljs-string">'that'</span>, <span class="hljs-string">'yield'</span>, <span class="hljs-string">'visually'</span>, <span 
class="hljs-string">'different'</span>, <span class="hljs-string">'products'</span>, <span class="hljs-string">'with'</span>, <span class="hljs-string">'minimal'</span>, <span class="hljs-string">'changes'</span>, <span class="hljs-string">'in'</span>, <span class="hljs-string">'smoke'</span>, <span class="hljs-string">'chemistry'</span>, <span class="hljs-string">'two'</span>, <span class="hljs-string">'cigarettes'</span>, <span class="hljs-string">'in'</span>, <span class="hljs-string">'cne.'</span>, <span class="hljs-string">'emulti-plug'</span>, <span class="hljs-string">'te'</span>, <span class="hljs-string">'build'</span>, <span class="hljs-string">'yaur'</span>, <span class="hljs-string">'awn'</span>, <span class="hljs-string">'cigarette.'</span>, <span class="hljs-string">'eswitchable'</span>, <span class="hljs-string">'menthol'</span>, <span class="hljs-string">'or'</span>, <span class="hljs-string">'non'</span>, <span class="hljs-string">'menthol'</span>, <span class="hljs-string">'cigarette.'</span>, <span class="hljs-string">'*cigarettes'</span>, <span class="hljs-string">'with'</span>, <span class="hljs-string">'interspaced'</span>, <span class="hljs-string">'perforations'</span>, <span class="hljs-string">'to'</span>, <span class="hljs-string">'enable'</span>, <span class="hljs-string">'smoker'</span>, <span class="hljs-string">'to'</span>, <span class="hljs-string">'separate'</span>, <span class="hljs-string">'unburned'</span>, <span class="hljs-string">'section'</span>, <span class="hljs-string">'for'</span>, <span class="hljs-string">'future'</span>, <span class="hljs-string">'smoking.'</span>, <span class="hljs-string">'«short'</span>, <span class="hljs-string">'cigarette,'</span>, <span class="hljs-string">'tobacco'</span>, <span class="hljs-string">'section'</span>, <span class="hljs-string">'30'</span>, <span class="hljs-string">'mm.'</span>, <span class="hljs-string">'«extremely'</span>, <span class="hljs-string">'fast'</span>, <span 
class="hljs-string">'buming'</span>, <span class="hljs-string">'cigarette.'</span>, <span class="hljs-string">'«novel'</span>, <span class="hljs-string">'cigarette'</span>, <span class="hljs-string">'constructions'</span>, <span class="hljs-string">'that'</span>, <span class="hljs-string">'permit'</span>, <span class="hljs-string">'a'</span>, <span class="hljs-string">'significant'</span>, <span class="hljs-string">'reduction'</span>, <span class="hljs-string">'iretobacco'</span>, <span class="hljs-string">'weight'</span>, <span class="hljs-string">'while'</span>, <span class="hljs-string">'maintaining'</span>, <span class="hljs-string">'smoking'</span>, <span class="hljs-string">'mechanics'</span>, <span class="hljs-string">'and'</span>, <span class="hljs-string">'visual'</span>, <span class="hljs-string">'characteristics.'</span>, <span class="hljs-string">'higher'</span>, <span class="hljs-string">'basis'</span>, <span class="hljs-string">'weight'</span>, <span class="hljs-string">'paper:'</span>, <span class="hljs-string">'potential'</span>, <span class="hljs-string">'reduction'</span>, <span class="hljs-string">'in'</span>, <span class="hljs-string">'tobacco'</span>, <span class="hljs-string">'weight.'</span>, <span class="hljs-string">'«more'</span>, <span class="hljs-string">'rigid'</span>, <span class="hljs-string">'tobacco'</span>, <span class="hljs-string">'column;'</span>, <span class="hljs-string">'stiffing'</span>, <span class="hljs-string">'agent'</span>, <span class="hljs-string">'for'</span>, <span class="hljs-string">'tobacco;'</span>, <span class="hljs-string">'e.g.'</span>, <span class="hljs-string">'starch'</span>, <span class="hljs-string">'*colored'</span>, <span class="hljs-string">'tow'</span>, <span class="hljs-string">'and'</span>, <span class="hljs-string">'cigarette'</span>, <span class="hljs-string">'papers;'</span>, <span class="hljs-string">'seasonal'</span>, <span class="hljs-string">'promotions,'</span>, <span 
class="hljs-string">'e.g.'</span>, <span class="hljs-string">'pastel'</span>, <span class="hljs-string">'colored'</span>, <span class="hljs-string">'cigarettes'</span>, <span class="hljs-string">'for'</span>, <span class="hljs-string">'easter'</span>, <span class="hljs-string">'or'</span>, <span class="hljs-string">'in'</span>, <span class="hljs-string">'an'</span>, <span class="hljs-string">'ebony'</span>, <span class="hljs-string">'and'</span>, <span class="hljs-string">'ivory'</span>, <span class="hljs-string">'brand'</span>, <span class="hljs-string">'containing'</span>, <span class="hljs-string">'a'</span>, <span class="hljs-string">'mixture'</span>, <span class="hljs-string">'of'</span>, <span class="hljs-string">'all'</span>, <span class="hljs-string">'black'</span>, <span class="hljs-string">'(black'</span>, <span class="hljs-string">'paper'</span>, <span class="hljs-string">'and'</span>, <span class="hljs-string">'tow)'</span>, <span class="hljs-string">'and'</span>, <span class="hljs-string">'ail'</span>, <span class="hljs-string">'white'</span>, <span class="hljs-string">'cigarettes.'</span>, <span class="hljs-string">'499150498'</span>] | |
| Answer: T.F. Riehl | |
| start_index <span class="hljs-number">17</span> | |
| end_index <span class="hljs-number">18</span><!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1gff4qz">Once the example is encoded, however, it looks like this:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>encoding = tokenizer(example[<span class="hljs-string">"question"</span>], example[<span class="hljs-string">"words"</span>], example[<span class="hljs-string">"boxes"</span>]) | |
| <span class="hljs-meta">>>> </span>tokenizer.decode(encoding[<span class="hljs-string">"input_ids"</span>]) | |
| [CLS] who <span class="hljs-keyword">is</span> <span class="hljs-keyword">in</span> cc <span class="hljs-keyword">in</span> this letter? [SEP] wie baw brown & williamson tobacco corporation research & development ...<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-zzn8kk">Next, we need to find the position of the answer in the encoded input.</p> <ul data-svelte-h="svelte-f0s3pn"><li><code>token_type_ids</code> tells us which tokens belong to the question and which belong to the document's words.</li> <li><code>tokenizer.cls_token_id</code> helps find the special token at the beginning of the input.</li> <li><code>word_ids</code> helps match the answer found in the original <code>words</code> to the same answer in the full encoded input, and determine the start and end positions of the answer in the encoded input.</li></ul> <p data-svelte-h="svelte-1h1oq5v">With that in mind, let's create a function that encodes a batch of examples from the dataset:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span><span 
class="hljs-keyword">def</span> <span class="hljs-title function_">encode_dataset</span>(<span class="hljs-params">examples, max_length=<span class="hljs-number">512</span></span>): | |
| <span class="hljs-meta">... </span> questions = examples[<span class="hljs-string">"question"</span>] | |
| <span class="hljs-meta">... </span> words = examples[<span class="hljs-string">"words"</span>] | |
| <span class="hljs-meta">... </span> boxes = examples[<span class="hljs-string">"boxes"</span>] | |
| <span class="hljs-meta">... </span> answers = examples[<span class="hljs-string">"answer"</span>] | |
| <span class="hljs-meta">... </span> <span class="hljs-comment"># encode the batch of examples and initialize start_positions and end_positions</span> | |
| <span class="hljs-meta">... </span> encoding = tokenizer(questions, words, boxes, max_length=max_length, padding=<span class="hljs-string">"max_length"</span>, truncation=<span class="hljs-literal">True</span>) | |
| <span class="hljs-meta">... </span> start_positions = [] | |
| <span class="hljs-meta">... </span> end_positions = [] | |
| <span class="hljs-meta">... </span> <span class="hljs-comment"># loop through the examples in the batch</span> | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(questions)): | |
| <span class="hljs-meta">... </span> cls_index = encoding[<span class="hljs-string">"input_ids"</span>][i].index(tokenizer.cls_token_id) | |
| <span class="hljs-meta">... </span> <span class="hljs-comment"># find the position of the answer in the example's words</span> | |
| <span class="hljs-meta">... </span> words_example = [word.lower() <span class="hljs-keyword">for</span> word <span class="hljs-keyword">in</span> words[i]] | |
| <span class="hljs-meta">... </span> answer = answers[i] | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">match</span>, word_idx_start, word_idx_end = subfinder(words_example, answer.lower().split()) | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">if</span> <span class="hljs-keyword">match</span>: | |
| <span class="hljs-meta">... </span> <span class="hljs-comment"># if a match is found, use `token_type_ids` to find where the words start in the encoding</span> | |
| <span class="hljs-meta">... </span> token_type_ids = encoding[<span class="hljs-string">"token_type_ids"</span>][i] | |
| <span class="hljs-meta">... </span> token_start_index = <span class="hljs-number">0</span> | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">while</span> token_type_ids[token_start_index] != <span class="hljs-number">1</span>: | |
| <span class="hljs-meta">... </span> token_start_index += <span class="hljs-number">1</span> | |
| <span class="hljs-meta">... </span> token_end_index = <span class="hljs-built_in">len</span>(encoding[<span class="hljs-string">"input_ids"</span>][i]) - <span class="hljs-number">1</span> | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">while</span> token_type_ids[token_end_index] != <span class="hljs-number">1</span>: | |
| <span class="hljs-meta">... </span> token_end_index -= <span class="hljs-number">1</span> | |
| <span class="hljs-meta">... </span> word_ids = encoding.word_ids(i)[token_start_index : token_end_index + <span class="hljs-number">1</span>] | |
| <span class="hljs-meta">... </span> start_position = cls_index | |
| <span class="hljs-meta">... </span> end_position = cls_index | |
| <span class="hljs-meta">... </span> <span class="hljs-comment"># words의 답변 위치와 일치할 때까지 word_ids를 반복하고 `token_start_index`를 늘립니다</span> | |
| <span class="hljs-meta">... </span> <span class="hljs-comment"># 일치하면 `token_start_index`를 인코딩에서 답변의 `start_position`으로 저장합니다</span> | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">for</span> <span class="hljs-built_in">id</span> <span class="hljs-keyword">in</span> word_ids: | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">if</span> <span class="hljs-built_in">id</span> == word_idx_start: | |
| <span class="hljs-meta">... </span> start_position = token_start_index | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">else</span>: | |
| <span class="hljs-meta">... </span> token_start_index += <span class="hljs-number">1</span> | |
| <span class="hljs-meta">... </span> <span class="hljs-comment"># 비슷하게, 끝에서 시작해 `word_ids`를 반복하며 답변의 `end_position`을 찾습니다</span> | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">for</span> <span class="hljs-built_in">id</span> <span class="hljs-keyword">in</span> word_ids[::-<span class="hljs-number">1</span>]: | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">if</span> <span class="hljs-built_in">id</span> == word_idx_end: | |
| <span class="hljs-meta">... </span> end_position = token_end_index | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">else</span>: | |
| <span class="hljs-meta">... </span> token_end_index -= <span class="hljs-number">1</span> | |
| <span class="hljs-meta">... </span> start_positions.append(start_position) | |
| <span class="hljs-meta">... </span> end_positions.append(end_position) | |
| <span class="hljs-meta">... </span> <span class="hljs-keyword">else</span>: | |
| <span class="hljs-meta">... </span> start_positions.append(cls_index) | |
| <span class="hljs-meta">... </span> end_positions.append(cls_index) | |
| <span class="hljs-meta">... </span> encoding[<span class="hljs-string">"image"</span>] = examples[<span class="hljs-string">"image"</span>] | |
| <span class="hljs-meta">... </span> encoding[<span class="hljs-string">"start_positions"</span>] = start_positions | |
| <span class="hljs-meta">... </span> encoding[<span class="hljs-string">"end_positions"</span>] = end_positions | |
| <span class="hljs-meta">... </span>    <span class="hljs-keyword">return</span> encoding<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-10ovv0a">With this preprocessing function in place, we can encode the entire dataset:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>encoded_train_dataset = dataset_with_ocr[<span class="hljs-string">"train"</span>].<span class="hljs-built_in">map</span>( | |
| <span class="hljs-meta">... </span>    encode_dataset, batched=<span class="hljs-literal">True</span>, batch_size=<span class="hljs-number">2</span>, remove_columns=dataset_with_ocr[<span class="hljs-string">"train"</span>].column_names | |
| <span class="hljs-meta">... </span>) | |
| <span class="hljs-meta">>>> </span>encoded_test_dataset = dataset_with_ocr[<span class="hljs-string">"test"</span>].<span class="hljs-built_in">map</span>( | |
| <span class="hljs-meta">... </span>    encode_dataset, batched=<span class="hljs-literal">True</span>, batch_size=<span class="hljs-number">2</span>, remove_columns=dataset_with_ocr[<span class="hljs-string">"test"</span>].column_names | |
| <span class="hljs-meta">... </span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1cmvv07">Let's check what the features of the encoded dataset look like:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>encoded_train_dataset.features | |
| {<span class="hljs-string">'image'</span>: <span class="hljs-type">Sequence</span>(feature=<span class="hljs-type">Sequence</span>(feature=<span class="hljs-type">Sequence</span>(feature=Value(dtype=<span class="hljs-string">'uint8'</span>, <span class="hljs-built_in">id</span>=<span class="hljs-literal">None</span>), length=-<span class="hljs-number">1</span>, <span class="hljs-built_in">id</span>=<span class="hljs-literal">None</span>), length=-<span class="hljs-number">1</span>, <span class="hljs-built_in">id</span>=<span class="hljs-literal">None</span>), length=-<span class="hljs-number">1</span>, <span class="hljs-built_in">id</span>=<span class="hljs-literal">None</span>), | |
| <span class="hljs-string">'input_ids'</span>: <span class="hljs-type">Sequence</span>(feature=Value(dtype=<span class="hljs-string">'int32'</span>, <span class="hljs-built_in">id</span>=<span class="hljs-literal">None</span>), length=-<span class="hljs-number">1</span>, <span class="hljs-built_in">id</span>=<span class="hljs-literal">None</span>), | |
| <span class="hljs-string">'token_type_ids'</span>: <span class="hljs-type">Sequence</span>(feature=Value(dtype=<span class="hljs-string">'int8'</span>, <span class="hljs-built_in">id</span>=<span class="hljs-literal">None</span>), length=-<span class="hljs-number">1</span>, <span class="hljs-built_in">id</span>=<span class="hljs-literal">None</span>), | |
| <span class="hljs-string">'attention_mask'</span>: <span class="hljs-type">Sequence</span>(feature=Value(dtype=<span class="hljs-string">'int8'</span>, <span class="hljs-built_in">id</span>=<span class="hljs-literal">None</span>), length=-<span class="hljs-number">1</span>, <span class="hljs-built_in">id</span>=<span class="hljs-literal">None</span>), | |
| <span class="hljs-string">'bbox'</span>: <span class="hljs-type">Sequence</span>(feature=<span class="hljs-type">Sequence</span>(feature=Value(dtype=<span class="hljs-string">'int64'</span>, <span class="hljs-built_in">id</span>=<span class="hljs-literal">None</span>), length=-<span class="hljs-number">1</span>, <span class="hljs-built_in">id</span>=<span class="hljs-literal">None</span>), length=-<span class="hljs-number">1</span>, <span class="hljs-built_in">id</span>=<span class="hljs-literal">None</span>), | |
| <span class="hljs-string">'start_positions'</span>: Value(dtype=<span class="hljs-string">'int64'</span>, <span class="hljs-built_in">id</span>=<span class="hljs-literal">None</span>), | |
| <span class="hljs-string">'end_positions'</span>: Value(dtype=<span class="hljs-string">'int64'</span>, <span class="hljs-built_in">id</span>=<span class="hljs-literal">None</span>)}<!-- HTML_TAG_END --></pre></div> <h2 class="relative group"><a id="evaluation" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#evaluation"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Evaluation</span></h2> <p data-svelte-h="svelte-8iuplh">Evaluating document question answering requires a significant amount of postprocessing. To keep this guide from taking too much of your time, it skips the evaluation step. | |
| The <code>Trainer</code> still computes the evaluation loss during training, so you will have a rough idea of how the model is performing. | |
| Extractive question answering is typically evaluated using F1/exact match. | |
| If you would like to implement it yourself, see the <a href="https://huggingface.co/course/chapter7/7?fw=pt#postprocessing" rel="nofollow">Question Answering chapter</a> of the Hugging Face course.</p> <h2 class="relative group"><a id="train" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#train"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Train</span></h2> <p data-svelte-h="svelte-1r96dak">Congratulations! You have made it through the hardest part of this guide and are now ready to train your own model. | |
| Training involves the following steps:</p> <ul data-svelte-h="svelte-dhr9ud"><li>Load the model with <code>AutoModelForDocumentQuestionAnswering</code>, using the same checkpoint as in preprocessing.</li> <li>Define your training hyperparameters in <code>TrainingArguments</code>.</li> <li>Define a function to batch examples together; here the <code>DefaultDataCollator</code> will do just fine.</li> <li>Pass the training arguments to <code>Trainer</code> along with the model, dataset, and data collator.</li> <li>Call <code>train()</code> to fine-tune your model.</li></ul> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> AutoModelForDocumentQuestionAnswering | |
| <span class="hljs-meta">>>> </span>model = AutoModelForDocumentQuestionAnswering.from_pretrained(model_checkpoint)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1wyv6ig">In <code>TrainingArguments</code>, use <code>output_dir</code> to specify where to save your model, and configure the hyperparameters as you see fit. | |
| To share your model with the community, set <code>push_to_hub</code> to <code>True</code> (you must be signed in to Hugging Face to upload your model). | |
| In this case, <code>output_dir</code> will also be the name of the repository where your model checkpoints are pushed.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> TrainingArguments | |
| <span class="hljs-meta">>>> </span><span class="hljs-comment"># replace with your own repository ID</span> | |
| <span class="hljs-meta">>>> </span>repo_id = <span class="hljs-string">"MariaK/layoutlmv2-base-uncased_finetuned_docvqa"</span> | |
| <span class="hljs-meta">>>> </span>training_args = TrainingArguments( | |
| <span class="hljs-meta">... </span>    output_dir=repo_id, | |
| <span class="hljs-meta">... </span>    per_device_train_batch_size=<span class="hljs-number">4</span>, | |
| <span class="hljs-meta">... </span>    num_train_epochs=<span class="hljs-number">20</span>, | |
| <span class="hljs-meta">... </span>    save_steps=<span class="hljs-number">200</span>, | |
| <span class="hljs-meta">... </span>    logging_steps=<span class="hljs-number">50</span>, | |
| <span class="hljs-meta">... </span>    eval_strategy=<span class="hljs-string">"steps"</span>, | |
| <span class="hljs-meta">... </span>    learning_rate=<span class="hljs-number">5e-5</span>, | |
| <span class="hljs-meta">... </span>    save_total_limit=<span class="hljs-number">2</span>, | |
| <span class="hljs-meta">... </span>    remove_unused_columns=<span class="hljs-literal">False</span>, | |
| <span class="hljs-meta">... </span>    push_to_hub=<span class="hljs-literal">True</span>, | |
| <span class="hljs-meta">... </span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1hdarm6">Define a simple data collator to batch examples together.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> DefaultDataCollator | |
| <span class="hljs-meta">>>> </span>data_collator = DefaultDataCollator()<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1heolva">Finally, bring everything together and call <code>train()</code>:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> Trainer | |
| <span class="hljs-meta">>>> </span>trainer = Trainer( | |
| <span class="hljs-meta">... </span>    model=model, | |
| <span class="hljs-meta">... </span>    args=training_args, | |
| <span class="hljs-meta">... </span>    data_collator=data_collator, | |
| <span class="hljs-meta">... </span>    train_dataset=encoded_train_dataset, | |
| <span class="hljs-meta">... </span>    eval_dataset=encoded_test_dataset, | |
| <span class="hljs-meta">... </span>    tokenizer=processor, | |
| <span class="hljs-meta">... </span>) | |
| <span class="hljs-meta">>>> </span>trainer.train()<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1gum9w7">To add the final model to the 🤗 Hub, create a model card and call <code>push_to_hub</code>:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>trainer.create_model_card() | |
| <span class="hljs-meta">>>> </span>trainer.push_to_hub()<!-- HTML_TAG_END --></pre></div> <h2 class="relative group"><a id="inference" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#inference"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Inference</span></h2> <p data-svelte-h="svelte-1flysn8">Now that you have fine-tuned a LayoutLMv2 model and uploaded it to the 🤗 Hub, you can use it for inference. | |
| The simplest way to try out your fine-tuned model for inference is to use it in a <code>Pipeline</code>.</p> <p data-svelte-h="svelte-1dcjn75">Let's take an example:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>example = dataset[<span class="hljs-string">"test"</span>][<span class="hljs-number">2</span>] | |
| <span class="hljs-meta">>>> </span>question = example[<span class="hljs-string">"query"</span>][<span class="hljs-string">"en"</span>] | |
| <span class="hljs-meta">>>> </span>image = example[<span class="hljs-string">"image"</span>] | |
| <span class="hljs-meta">>>> </span><span class="hljs-built_in">print</span>(question) | |
| <span class="hljs-meta">>>> </span><span class="hljs-built_in">print</span>(example[<span class="hljs-string">"answers"</span>]) | |
| <span class="hljs-string">'Who is ‘presiding’ TRRF GENERAL SESSION (PART 1)?'</span> | |
| [<span class="hljs-string">'TRRF Vice President'</span>, <span class="hljs-string">'lee a. waller'</span>]<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-gaec9h">Next, instantiate a pipeline for document question answering with your model, and pass the image + question combination to it.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> pipeline | |
| <span class="hljs-meta">>>> </span>qa_pipeline = pipeline(<span class="hljs-string">"document-question-answering"</span>, model=<span class="hljs-string">"MariaK/layoutlmv2-base-uncased_finetuned_docvqa"</span>) | |
| <span class="hljs-meta">>>> </span>qa_pipeline(image, question) | |
| [{<span class="hljs-string">'score'</span>: <span class="hljs-number">0.9949808120727539</span>, | |
| <span class="hljs-string">'answer'</span>: <span class="hljs-string">'Lee A. Waller'</span>, | |
| <span class="hljs-string">'start'</span>: <span class="hljs-number">55</span>, | |
| <span class="hljs-string">'end'</span>: <span class="hljs-number">57</span>}]<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-4epvs7">You can also manually replicate the results of the pipeline if you'd like:</p> <ol data-svelte-h="svelte-10zqnq5"><li>Take an image and a question, and prepare them for the model using the model's processor.</li> <li>Forward the result of the preprocessing through the model.</li> <li>The model returns <code>start_logits</code> and <code>end_logits</code>, which indicate which token is at the start of the answer and which token is at the end of the answer. Both have shape (batch_size, sequence_length).</li> <li>Take an argmax over the last dimension of both <code>start_logits</code> and <code>end_logits</code> to get the predicted <code>start_idx</code> and <code>end_idx</code>.</li> <li>Decode the answer with the tokenizer.</li></ol> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span><span class="hljs-keyword">import</span> torch | |
| <span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> AutoProcessor | |
| <span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> AutoModelForDocumentQuestionAnswering | |
| <span class="hljs-meta">>>> </span>processor = AutoProcessor.from_pretrained(<span class="hljs-string">"MariaK/layoutlmv2-base-uncased_finetuned_docvqa"</span>) | |
| <span class="hljs-meta">>>> </span>model = AutoModelForDocumentQuestionAnswering.from_pretrained(<span class="hljs-string">"MariaK/layoutlmv2-base-uncased_finetuned_docvqa"</span>) | |
| <span class="hljs-meta">>>> </span><span class="hljs-keyword">with</span> torch.no_grad(): | |
| <span class="hljs-meta">... </span>    encoding = processor(image.convert(<span class="hljs-string">"RGB"</span>), question, return_tensors=<span class="hljs-string">"pt"</span>) | |
| <span class="hljs-meta">... </span>    outputs = model(**encoding) | |
| <span class="hljs-meta">... </span>    start_logits = outputs.start_logits | |
| <span class="hljs-meta">... </span>    end_logits = outputs.end_logits | |
| <span class="hljs-meta">... </span>    predicted_start_idx = start_logits.argmax(-<span class="hljs-number">1</span>).item() | |
| <span class="hljs-meta">... </span>    predicted_end_idx = end_logits.argmax(-<span class="hljs-number">1</span>).item() | |
| <span class="hljs-meta">>>> </span>processor.tokenizer.decode(encoding.input_ids.squeeze()[predicted_start_idx : predicted_end_idx + <span class="hljs-number">1</span>]) | |
| <span class="hljs-string">'lee a. waller'</span><!-- HTML_TAG_END --></pre></div> <p></p> | |
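<p>The decoded answer can be scored against the reference answers with the exact match and F1 metrics mentioned in the evaluation section. What follows is only a minimal sketch in plain Python, assuming SQuAD-style answer normalization. The helper names (<code>normalize_answer</code>, <code>exact_match</code>, <code>f1_score</code>, <code>best_over_references</code>) are illustrative and not part of 🤗 Transformers; the prediction and references are copied from the example above.</p>

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; other benchmarks may normalize differently.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction, reference):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # Two empty answers count as a match; a single empty one as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_references(metric, prediction, references):
    """Score a prediction against each accepted answer and keep the best score."""
    return max(metric(prediction, reference) for reference in references)


# Values taken from the pipeline example above.
prediction = "Lee A. Waller"
references = ["TRRF Vice President", "lee a. waller"]

em = best_over_references(exact_match, prediction, references)
f1 = best_over_references(f1_score, prediction, references)
print(f"exact match: {em}, F1: {f1}")
```

<p>Taking the maximum over the references reflects that a question may list several accepted answers. Note that this normalization treats the middle initial "a." as an article and removes it from both the prediction and the reference.</p>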
| <script> | |
| { | |
| __sveltekit_1hrx8 = { | |
| assets: "/docs/transformers/main/ko", | |
| base: "/docs/transformers/main/ko", | |
| env: {} | |
| }; | |
| const element = document.currentScript.parentElement; | |
| const data = [null,null]; | |
| Promise.all([ | |
| import("/docs/transformers/main/ko/_app/immutable/entry/start.9aa88961.js"), | |
| import("/docs/transformers/main/ko/_app/immutable/entry/app.84fb67c3.js") | |
| ]).then(([kit, app]) => { | |
| kit.start(app, element, { | |
| node_ids: [0, 63], | |
| data, | |
| form: null, | |
| error: null | |
| }); | |
| }); | |
| } | |
| </script> | |