
ํŒŒ์ดํ”„๋ผ์ธ [[pipelines]]

ํŒŒ์ดํ”„๋ผ์ธ์€ ๋ชจ๋ธ์„ ์ถ”๋ก ์— ํ™œ์šฉํ•  ์ˆ˜ ์žˆ๋Š” ํ›Œ๋ฅญํ•˜๊ณ  ์‰ฌ์šด ๋ฐฉ๋ฒ•์ž…๋‹ˆ๋‹ค. ์ด ํŒŒ์ดํ”„๋ผ์ธ์€ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ์˜ ๋ณต์žกํ•œ ์ฝ”๋“œ๋ฅผ ๋Œ€๋ถ€๋ถ„ ์ถ”์ƒํ™”ํ•˜์—ฌ, ๊ฐœ์ฒด๋ช… ์ธ์‹(Named Entity Recognition), ๋งˆ์Šคํฌ๋“œ ์–ธ์–ด ๋ชจ๋ธ๋ง(Masked Language Modeling), ๊ฐ์ • ๋ถ„์„(Sentiment Analysis), ํŠน์„ฑ ์ถ”์ถœ(Feature Extraction), ์งˆ์˜์‘๋‹ต(Question Answering) ๋“ฑ์˜ ์—ฌ๋Ÿฌ ์ž‘์—…์— ํŠนํ™”๋œ ๊ฐ„๋‹จํ•œ API๋ฅผ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค. ์‚ฌ์šฉ ์˜ˆ์‹œ๋Š” ์ž‘์—… ์š”์•ฝ์„ ์ฐธ๊ณ ํ•˜์„ธ์š”.

ํŒŒ์ดํ”„๋ผ์ธ ์ถ”์ƒํ™”๋Š” ๋‹ค์Œ ๋‘ ๊ฐ€์ง€ ๋ฒ”์ฃผ๋กœ ๋‚˜๋‰ฉ๋‹ˆ๋‹ค.

ํŒŒ์ดํ”„๋ผ์ธ ์ถ”์ƒํ™” [[the-pipeline-abstraction]]

ํŒŒ์ดํ”„๋ผ์ธ ์ถ”์ƒํ™”๋Š” ์‚ฌ์šฉ ๊ฐ€๋Šฅํ•œ ๋ชจ๋“  ํŒŒ์ดํ”„๋ผ์ธ์„ ๊ฐ์‹ธ๋Š” ๋ž˜ํผ์ž…๋‹ˆ๋‹ค. ๋‹ค๋ฅธ ํŒŒ์ดํ”„๋ผ์ธ์ฒ˜๋Ÿผ ์ธ์Šคํ„ด์Šคํ™”๋˜๋ฉฐ, ์ถ”๊ฐ€์ ์ธ ํŽธ์˜ ๊ธฐ๋Šฅ์„ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค.

๋‹จ์ผ ํ•ญ๋ชฉ ํ˜ธ์ถœ ์˜ˆ์‹œ:

```python
>>> pipe = pipeline("text-classification")
>>> pipe("This restaurant is awesome")
[{'label': 'POSITIVE', 'score': 0.9998743534088135}]
```

hub์—์„œ ํŠน์ • ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜๋ ค๋Š” ๊ฒฝ์šฐ, ํ•ด๋‹น ๋ชจ๋ธ์ด ์ด๋ฏธ ํ—ˆ๋ธŒ์— ์ž‘์—…์„ ์ •์˜ํ•˜๊ณ  ์žˆ๋‹ค๋ฉด ์ž‘์—…๋ช…์„ ์ƒ๋žตํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

```python
>>> pipe = pipeline(model="FacebookAI/roberta-large-mnli")
>>> pipe("This restaurant is awesome")
[{'label': 'NEUTRAL', 'score': 0.7313136458396912}]
```

์—ฌ๋Ÿฌ ํ•ญ๋ชฉ์„ ์ฒ˜๋ฆฌํ•˜๋ ค๋ฉด ๋ฆฌ์ŠคํŠธ๋ฅผ ์ „๋‹ฌํ•˜์„ธ์š”.

```python
>>> pipe = pipeline("text-classification")
>>> pipe(["This restaurant is awesome", "This restaurant is awful"])
[{'label': 'POSITIVE', 'score': 0.9998743534088135},
 {'label': 'NEGATIVE', 'score': 0.9996669292449951}]
```

์ „์ฒด ๋ฐ์ดํ„ฐ์…‹์„ ์ˆœํšŒํ•˜๋ ค๋ฉด dataset์„ ์ง์ ‘ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ์ด ์ข‹์Šต๋‹ˆ๋‹ค. ์ด๋ ‡๊ฒŒ ํ•˜๋ฉด ์ „์ฒด ๋ฐ์ดํ„ฐ๋ฅผ ํ•œ ๋ฒˆ์— ๋ฉ”๋ชจ๋ฆฌ์— ์˜ฌ๋ฆด ํ•„์š”๋„ ์—†๊ณ , ๋ฐฐ์น˜ ์ฒ˜๋ฆฌ๋ฅผ ๋”ฐ๋กœ ๊ตฌํ˜„ํ•˜์ง€ ์•Š์•„๋„ ๋ฉ๋‹ˆ๋‹ค. ์ด ๋ฐฉ์‹์€ GPU์—์„œ ์‚ฌ์šฉ์ž ์ •์˜ ๋ฃจํ”„์™€ ์œ ์‚ฌํ•œ ์†๋„๋กœ ์ž‘๋™ํ•˜๋ฉฐ, ๋งŒ์•ฝ ๊ทธ๋ ‡์ง€ ์•Š์„ ๊ฒฝ์šฐ ์ด์Šˆ๋ฅผ ๋“ฑ๋กํ•ด ์ฃผ์„ธ์š”.

```python
import datasets
from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset
from tqdm.auto import tqdm

pipe = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h", device=0)
dataset = datasets.load_dataset("superb", name="asr", split="test")

# KeyDataset (*pt* only) returns just the specified key from each dataset item's dict.
# It is used here because we don't need the *target* part of the dataset.
# For sentence-pair inputs, use KeyPairDataset.
for out in tqdm(pipe(KeyDataset(dataset, "file"))):
    print(out)
    # {"text": "NUMBER TEN FRESH NELLY IS WAITING ON YOU GOOD NIGHT HUSBAND"}
    # {"text": ....}
    # ....
```

For ease of use, a generator is also possible:

```python
from transformers import pipeline

pipe = pipeline("text-classification")

def data():
    while True:
        # The data could come from a dataset, a database, a queue or an HTTP
        # request in a server.
        # Caveat: because this is iterative, you cannot use `num_workers > 1`
        # to preprocess the data with multiple threads. You can still have one
        # thread doing the preprocessing while the main thread runs the big inference.
        yield "This is a test"

for out in pipe(data()):
    print(out)
    # {'label': 'POSITIVE', 'score': ....}
    # {'label': ....}
    # ....
```

[[autodoc]] pipeline

ํŒŒ์ดํ”„๋ผ์ธ ๋ฐฐ์น˜ ์ฒ˜๋ฆฌ [[pipeline-batching]]

๋ชจ๋“  ํŒŒ์ดํ”„๋ผ์ธ์€ ๋ฐฐ์น˜ ์ฒ˜๋ฆฌ๋ฅผ ์ง€์›ํ•ฉ๋‹ˆ๋‹ค. ๋ฆฌ์ŠคํŠธ, Dataset, Generator ์ „๋‹ฌ ์‹œ ์ŠคํŠธ๋ฆฌ๋ฐ ๊ธฐ๋Šฅ์„ ์‚ฌ์šฉํ•  ๋•Œ ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค.

```python
from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset
import datasets

dataset = datasets.load_dataset("imdb", name="plain_text", split="unsupervised")
pipe = pipeline("text-classification", device=0)
for out in pipe(KeyDataset(dataset, "text"), batch_size=8, truncation="only_first"):
    print(out)
    # [{'label': 'POSITIVE', 'score': 0.9998743534088135}]
    # Exactly the same output as before, but the contents are passed to the model in batches.
```

However, batching is not automatically a win for performance. Depending on the hardware, the data and the actual model, it can be a 10x speedup or a 5x slowdown.

์ฃผ๋กœ ์†๋„ ํ–ฅ์ƒ์ด ์žˆ๋Š” ์˜ˆ์‹œ:

```python
from transformers import pipeline
from torch.utils.data import Dataset
from tqdm.auto import tqdm

pipe = pipeline("text-classification", device=0)

class MyDataset(Dataset):
    def __len__(self):
        return 5000

    def __getitem__(self, i):
        return "This is a test"

dataset = MyDataset()

for batch_size in [1, 8, 64, 256]:
    print("-" * 30)
    print(f"Streaming batch_size={batch_size}")
    for out in tqdm(pipe(dataset, batch_size=batch_size), total=len(dataset)):
        pass
```
```
# On GTX 970
------------------------------
Streaming no batching
100%|██████████████████████████████████████████████████████████████████████| 5000/5000 [00:26<00:00, 187.52it/s]
------------------------------
Streaming batch_size=8
100%|█████████████████████████████████████████████████████████████████████| 5000/5000 [00:04<00:00, 1205.95it/s]
------------------------------
Streaming batch_size=64
100%|█████████████████████████████████████████████████████████████████████| 5000/5000 [00:02<00:00, 2478.24it/s]
------------------------------
Streaming batch_size=256
100%|█████████████████████████████████████████████████████████████████████| 5000/5000 [00:01<00:00, 2554.43it/s]
(diminishing returns, saturated the GPU)
```

์ฃผ๋กœ ์†๋„ ์ €ํ•˜๊ฐ€ ์žˆ๋Š” ์˜ˆ์‹œ:

```python
class MyDataset(Dataset):
    def __len__(self):
        return 5000

    def __getitem__(self, i):
        if i % 64 == 0:
            n = 100
        else:
            n = 1
        return "This is a test" * n
```

This is an occasional very long sentence compared to the others. In that case, the whole batch needs to be 400 tokens long, so the batch becomes [64, 400] instead of [64, 4], leading to a severe slowdown. Even worse, on bigger batches the program simply crashes.

```
------------------------------
Streaming no batching
100%|█████████████████████████████████████████████████████████████████████| 1000/1000 [00:05<00:00, 183.69it/s]
------------------------------
Streaming batch_size=8
100%|█████████████████████████████████████████████████████████████████████| 1000/1000 [00:03<00:00, 265.74it/s]
------------------------------
Streaming batch_size=64
100%|██████████████████████████████████████████████████████████████████████| 1000/1000 [00:26<00:00, 37.80it/s]
------------------------------
Streaming batch_size=256
  0%|                                                                                 | 0/1000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/nicolas/src/transformers/test.py", line 42, in <module>
    for out in tqdm(pipe(dataset, batch_size=256), total=len(dataset)):
....
    q = q / math.sqrt(dim_per_head)  # (bs, n_heads, q_length, dim_per_head)
RuntimeError: CUDA out of memory. Tried to allocate 376.00 MiB (GPU 0; 3.95 GiB total capacity; 1.72 GiB already allocated; 354.88 MiB free; 2.46 GiB reserved in total by PyTorch)
```
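The padding overhead behind the slowdown can be quantified without running a model. Below is a small sketch (pure Python, with hypothetical token lengths mirroring the dataset above) of how one long outlier inflates a padded batch:

```python
# Token lengths for one batch: 63 short sentences and one long outlier,
# as produced by the dataset above ("This is a test" * n).
lengths = [4] * 63 + [400]

# With padding, every row is stretched to the longest sequence in the batch.
padded_tokens = len(lengths) * max(lengths)  # 64 * 400 = 25600
useful_tokens = sum(lengths)                 # 63 * 4 + 400 = 652

print(f"padded: {padded_tokens}, useful: {useful_tokens}")
print(f"wasted compute: {1 - useful_tokens / padded_tokens:.1%}")  # wasted compute: 97.5%
```

Almost the entire batch is padding, which is exactly why the [64, 400] batch is so much slower than [64, 4].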

์ผ๋ฐ˜์ ์ธ ํ•ด๊ฒฐ์ฑ…์€ ์—†์œผ๋ฉฐ, ์‚ฌ์šฉ ์‚ฌ๋ก€์— ๋”ฐ๋ผ ๋‹ค๋ฅผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์‚ฌ์šฉ์ž๋ฅผ ์œ„ํ•œ ๊ฒฝํ—˜์ƒ ์ง€์นจ:

- Measure performance on your hardware with your real workload. Measuring is the only reliable answer.
- If you are latency constrained (live inference), don't batch.
- If you are using CPU, don't batch.
- If you are throughput constrained (running the model over a bunch of static data) on GPU:
  - If you have no clue about the sequence lengths ("natural" data), don't batch by default; measure and tentatively try adding batching, with OOM checks so you can recover when it fails (and it will fail at some point if you don't control the sequence lengths).
  - If your sequence lengths are very regular, batching is more likely to be a win; measure and push it until you hit OOM.
  - The larger the GPU, the more likely batching is to pay off.
- As soon as you enable batching, make sure you can handle OOMs gracefully.
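That last point, recovering from OOM instead of crashing, can be sketched as a retry loop that halves the batch size. This is an illustrative pattern, not a transformers API: `run_batch` is a hypothetical stand-in for a pipeline call, and with PyTorch you would catch `torch.cuda.OutOfMemoryError` rather than the simulated `MemoryError` used here:

```python
def infer_with_oom_fallback(run_batch, items, batch_size):
    """Run `run_batch` over `items`, halving the batch size on OOM."""
    results = []
    i = 0
    while i < len(items):
        batch = items[i : i + batch_size]
        try:
            results.extend(run_batch(batch))
            i += batch_size
        except MemoryError:  # with torch: except torch.cuda.OutOfMemoryError
            if batch_size == 1:
                raise  # even a single item does not fit; give up
            batch_size = batch_size // 2

    return results

# Toy model that "OOMs" whenever the batch carries too many characters.
def toy_model(batch):
    if sum(len(s) for s in batch) > 60:
        raise MemoryError
    return [{"label": "POSITIVE"} for _ in batch]

outs = infer_with_oom_fallback(toy_model, ["This is a test"] * 16, batch_size=16)
print(len(outs))  # 16
```

Here the batch size is halved from 16 to 4 before the batches fit, and all 16 items are still processed.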

ํŒŒ์ดํ”„๋ผ์ธ ์ฒญํฌ ๋ฐฐ์น˜ ์ฒ˜๋ฆฌ [[pipeline-chunk-batching]]

The zero-shot classification and question answering pipelines are slightly special in that a single input may trigger multiple forward passes of the model, which can cause problems when using the batch size argument naively.

To work around this, both pipelines operate as chunk pipelines. In short:

```python
preprocessed = pipe.preprocess(inputs)
model_outputs = pipe.forward(preprocessed)
outputs = pipe.postprocess(model_outputs)
```

internally becomes:

```python
all_model_outputs = []
for preprocessed in pipe.preprocess(inputs):
    model_outputs = pipe.forward(preprocessed)
    all_model_outputs.append(model_outputs)
outputs = pipe.postprocess(all_model_outputs)
```

ํŒŒ์ดํ”„๋ผ์ธ์˜ ์‚ฌ์šฉ ๋ฐฉ์‹์ด ๋™์ผํ•˜๋ฏ€๋กœ, ์ฝ”๋“œ์—๋Š” ๊ฑฐ์˜ ์˜ํ–ฅ์„ ์ฃผ์ง€ ์•Š์Šต๋‹ˆ๋‹ค.

ํŒŒ์ดํ”„๋ผ์ธ์€ ๋ฐฐ์น˜ ์ฒ˜๋ฆฌ๋ฅผ ์ž๋™์œผ๋กœ ์ˆ˜ํ–‰ํ•˜๊ธฐ ๋•Œ๋ฌธ์— ์ž…๋ ฅ์ด ๋ช‡ ๋ฒˆ์˜ ํฌ์›Œ๋“œ ํŒจ์Šค๋ฅผ ๋ฐœ์ƒ์‹œํ‚ค๋Š”์ง€ ๊ณ ๋ คํ•  ํ•„์š” ์—†์ด, ๋ฐฐ์น˜ ํฌ๊ธฐ๋Š” ์ž…๋ ฅ๊ณผ ๋ฌด๊ด€ํ•˜๊ฒŒ ์ตœ์ ํ™”ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋‹ค๋งŒ ์•ž์„œ ์–ธ๊ธ‰ํ•œ ์ฃผ์˜์‚ฌํ•ญ์€ ์—ฌ์ „ํžˆ ์œ ํšจํ•ฉ๋‹ˆ๋‹ค.

ํŒŒ์ดํ”„๋ผ์ธ FP16 ์ถ”๋ก  [[pipeline-fp16-inference]]

๋ชจ๋ธ์€ FP16 ๋ชจ๋“œ๋กœ ์‹คํ–‰ํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, GPU์—์„œ ๋ฉ”๋ชจ๋ฆฌ๋ฅผ ์ ˆ์•ฝํ•˜๋ฉด์„œ ์ฒ˜๋ฆฌ ์†๋„๋ฅผ ํฌ๊ฒŒ ํ–ฅ์ƒ์‹œํ‚ฌ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋Œ€๋ถ€๋ถ„์˜ ๋ชจ๋ธ์€ ์„ฑ๋Šฅ ์ €ํ•˜ ์—†์ด FP16์„ ์ง€์›ํ•˜๋ฉฐ, ๋ชจ๋ธ์ด ํด์ˆ˜๋ก ์„ฑ๋Šฅ ์ €ํ•˜ ๊ฐ€๋Šฅ์„ฑ์€ ๋” ๋‚ฎ์•„์ง‘๋‹ˆ๋‹ค.

To enable FP16 inference, pass dtype=torch.float16 or dtype='float16' to the pipeline constructor. Note that this only works for models with a PyTorch backend; your inputs will be converted to FP16 internally.
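The memory saving is easy to see even without a model. A minimal sketch using NumPy arrays as stand-ins for model weights (a real pipeline converts the weights the same way when given dtype=torch.float16):

```python
import numpy as np

# A 1024x1024 "weight matrix": FP32 takes 4 bytes per value, FP16 takes 2.
weights_fp32 = np.random.rand(1024, 1024).astype(np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes)  # 4194304 (4 MiB)
print(weights_fp16.nbytes)  # 2097152 (2 MiB)
```

The halved footprint is what lets larger batches (or larger models) fit on the same GPU; the speedup on top of that depends on the hardware's FP16 throughput.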

ํŒŒ์ดํ”„๋ผ์ธ ์‚ฌ์šฉ์ž ์ •์˜ ์ฝ”๋“œ [[pipeline-custom-code]]

ํŠน์ • ํŒŒ์ดํ”„๋ผ์ธ์„ ์˜ค๋ฒ„๋ผ์ด๋“œํ•˜๋ ค๋ฉด, ๋จผ์ € ํ•ด๋‹น ์ž‘์—…์— ๋Œ€ํ•œ ์ด์Šˆ๋ฅผ ๋“ฑ๋กํ•ด ์ฃผ์„ธ์š”. ํŒŒ์ดํ”„๋ผ์ธ์˜ ๋ชฉํ‘œ๋Š” ๋Œ€๋ถ€๋ถ„์˜ ์‚ฌ์šฉ ์‚ฌ๋ก€๋ฅผ ์ง€์›ํ•˜๋Š” ๊ฒƒ์ด๋ฏ€๋กœ, transformers ํŒ€์ด ์ถ”๊ฐ€ ์ง€์›์„ ๊ณ ๋ คํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๊ฐ„๋‹จํžˆ ์‹œ๋„ํ•˜๋ ค๋ฉด ํŒŒ์ดํ”„๋ผ์ธ ํด๋ž˜์Šค๋ฅผ ์ƒ์†ํ•˜์„ธ์š”.

```python
class MyPipeline(TextClassificationPipeline):
    def postprocess(self, model_outputs, **kwargs):
        # Write your custom postprocessing code here,
        # e.g. rescale the scores returned by the parent class
        scores = super().postprocess(model_outputs, **kwargs)
        # And more code here
        return scores

my_pipeline = MyPipeline(model=model, tokenizer=tokenizer, ...)
# or if you use the *pipeline* function, then:
my_pipeline = pipeline(model="xxxx", pipeline_class=MyPipeline)
```

This should enable you to apply all the custom code you want.
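The override mechanics can be exercised without loading a model. Below is a toy sketch with a hypothetical stand-in for the pipeline base class (the real TextClassificationPipeline.postprocess takes model_outputs plus keyword arguments, as shown above):

```python
class ToyPipeline:
    def __call__(self, text):
        # preprocess -> forward -> postprocess, as in a real Pipeline;
        # here the "model output" is hard-coded for illustration.
        model_outputs = {"scores": [0.75, 0.25]}
        return self.postprocess(model_outputs)

    def postprocess(self, model_outputs):
        return model_outputs["scores"]

class MyToyPipeline(ToyPipeline):
    def postprocess(self, model_outputs):
        # Custom postprocessing: report scores as percentages.
        return [s * 100 for s in super().postprocess(model_outputs)]

print(MyToyPipeline()("This restaurant is awesome"))  # [75.0, 25.0]
```

Only the overridden step changes; the call path through the rest of the pipeline stays the same.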

ํŒŒ์ดํ”„๋ผ์ธ ๊ตฌํ˜„ํ•˜๊ธฐ [[implementing-a-pipeline]]

์ƒˆ ํŒŒ์ดํ”„๋ผ์ธ ๊ตฌํ˜„

Audio [[audio]]

Pipelines available for audio tasks include the following.

AudioClassificationPipeline [[transformers.AudioClassificationPipeline]]

[[autodoc]] AudioClassificationPipeline - __call__ - all

AutomaticSpeechRecognitionPipeline [[transformers.AutomaticSpeechRecognitionPipeline]]

[[autodoc]] AutomaticSpeechRecognitionPipeline - __call__ - all

TextToAudioPipeline [[transformers.TextToAudioPipeline]]

[[autodoc]] TextToAudioPipeline - __call__ - all

ZeroShotAudioClassificationPipeline [[transformers.ZeroShotAudioClassificationPipeline]]

[[autodoc]] ZeroShotAudioClassificationPipeline - __call__ - all

Computer vision [[computer-vision]]

Pipelines available for computer vision tasks include the following.

DepthEstimationPipeline [[transformers.DepthEstimationPipeline]]

[[autodoc]] DepthEstimationPipeline - __call__ - all

ImageClassificationPipeline [[transformers.ImageClassificationPipeline]]

[[autodoc]] ImageClassificationPipeline - __call__ - all

ImageSegmentationPipeline [[transformers.ImageSegmentationPipeline]]

[[autodoc]] ImageSegmentationPipeline - __call__ - all

ImageToImagePipeline [[transformers.ImageToImagePipeline]]

[[autodoc]] ImageToImagePipeline - __call__ - all

ObjectDetectionPipeline [[transformers.ObjectDetectionPipeline]]

[[autodoc]] ObjectDetectionPipeline - __call__ - all

VideoClassificationPipeline [[transformers.VideoClassificationPipeline]]

[[autodoc]] VideoClassificationPipeline - __call__ - all

ZeroShotImageClassificationPipeline [[transformers.ZeroShotImageClassificationPipeline]]

[[autodoc]] ZeroShotImageClassificationPipeline - __call__ - all

ZeroShotObjectDetectionPipeline [[transformers.ZeroShotObjectDetectionPipeline]]

[[autodoc]] ZeroShotObjectDetectionPipeline - __call__ - all

์ž์—ฐ์–ด ์ฒ˜๋ฆฌ [[natural-language-processing]]

์ž์—ฐ์–ด ์ฒ˜๋ฆฌ ์ž‘์—…์— ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋Š” ํŒŒ์ดํ”„๋ผ์ธ์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

FillMaskPipeline [[transformers.FillMaskPipeline]]

[[autodoc]] FillMaskPipeline - call - all

QuestionAnsweringPipeline [[transformers.QuestionAnsweringPipeline]]

[[autodoc]] QuestionAnsweringPipeline - call - all

SummarizationPipeline [[transformers.SummarizationPipeline]]

[[autodoc]] SummarizationPipeline - call - all

TableQuestionAnsweringPipeline [[transformers.TableQuestionAnsweringPipeline]]

[[autodoc]] TableQuestionAnsweringPipeline - call

TextClassificationPipeline [[transformers.TextClassificationPipeline]]

[[autodoc]] TextClassificationPipeline - call - all

TextGenerationPipeline [[transformers.TextGenerationPipeline]]

[[autodoc]] TextGenerationPipeline - call - all

Text2TextGenerationPipeline [[transformers.Text2TextGenerationPipeline]]

[[autodoc]] Text2TextGenerationPipeline - call - all

TokenClassificationPipeline [[transformers.TokenClassificationPipeline]]

[[autodoc]] TokenClassificationPipeline - call - all

TranslationPipeline [[transformers.TranslationPipeline]]

[[autodoc]] TranslationPipeline - call - all

ZeroShotClassificationPipeline [[transformers.ZeroShotClassificationPipeline]]

[[autodoc]] ZeroShotClassificationPipeline - call - all

Multimodal [[multimodal]]

Pipelines available for multimodal tasks include the following.

DocumentQuestionAnsweringPipeline [[transformers.DocumentQuestionAnsweringPipeline]]

[[autodoc]] DocumentQuestionAnsweringPipeline - __call__ - all

FeatureExtractionPipeline [[transformers.FeatureExtractionPipeline]]

[[autodoc]] FeatureExtractionPipeline - __call__ - all

ImageFeatureExtractionPipeline [[transformers.ImageFeatureExtractionPipeline]]

[[autodoc]] ImageFeatureExtractionPipeline - __call__ - all

ImageToTextPipeline [[transformers.ImageToTextPipeline]]

[[autodoc]] ImageToTextPipeline - __call__ - all

ImageTextToTextPipeline [[transformers.ImageTextToTextPipeline]]

[[autodoc]] ImageTextToTextPipeline - __call__ - all

MaskGenerationPipeline [[transformers.MaskGenerationPipeline]]

[[autodoc]] MaskGenerationPipeline - __call__ - all

VisualQuestionAnsweringPipeline [[transformers.VisualQuestionAnsweringPipeline]]

[[autodoc]] VisualQuestionAnsweringPipeline - __call__ - all

Parent class: Pipeline [[transformers.Pipeline]]

[[autodoc]] Pipeline