LLM prompting guide [[llm-prompting-guide]]

[[open-in-colab]]

Large language models such as Falcon and LLaMA are pretrained transformer models initially trained to predict the next token given some input text. They typically have billions of parameters and have been trained on trillions of tokens over long periods of time. As a result, these models become quite powerful and versatile, and you can solve multiple NLP tasks out of the box by instructing the model with natural language prompts.

Designing such prompts to ensure the optimal output is often called "prompt engineering". Prompt engineering is an iterative process that requires a fair amount of experimentation. Natural languages are much more flexible and expressive than programming languages; however, they can also introduce ambiguity. At the same time, natural language prompts are quite sensitive to changes: even minor modifications to a prompt can lead to wildly different outputs.

While there is no exact recipe for creating prompts that works in all cases, researchers have worked out a number of best practices that help achieve optimal results more consistently.

이 κ°€μ΄λ“œμ—μ„œλŠ” 더 λ‚˜μ€ λŒ€κ·œλͺ¨ μ–Έμ–΄ λͺ¨λΈ ν”„λ‘¬ν”„νŠΈλ₯Ό μž‘μ„±ν•˜κ³  λ‹€μ–‘ν•œ μžμ—°μ–΄ 처리 μž‘μ—…μ„ ν•΄κ²°ν•˜λŠ” 데 도움이 λ˜λŠ” ν”„λ‘¬ν”„νŠΈ μ—”μ§€λ‹ˆμ–΄λ§ λͺ¨λ²” 사둀λ₯Ό λ‹€λ£Ήλ‹ˆλ‹€:

ν”„λ‘¬ν”„νŠΈ μ—”μ§€λ‹ˆμ–΄λ§μ€ λŒ€κ·œλͺ¨ μ–Έμ–΄ λͺ¨λΈ 좜λ ₯ μ΅œμ ν™” κ³Όμ •μ˜ 일뢀일 λΏμž…λ‹ˆλ‹€. 또 λ‹€λ₯Έ μ€‘μš”ν•œ ꡬ성 μš”μ†ŒλŠ” 졜적의 ν…μŠ€νŠΈ 생성 μ „λž΅μ„ μ„ νƒν•˜λŠ” κ²ƒμž…λ‹ˆλ‹€. ν•™μŠ΅ κ°€λŠ₯ν•œ λ§€κ°œλ³€μˆ˜λ₯Ό μˆ˜μ •ν•˜μ§€ μ•Šκ³ λ„ λŒ€κ·œλͺ¨ μ–Έμ–΄ λͺ¨λΈμ΄ ν…μŠ€νŠΈλ₯Ό μƒμ„±ν•˜λ¦¬ λ•Œ 각각의 후속 토큰을 μ„ νƒν•˜λŠ” 방식을 μ‚¬μš©μžκ°€ 직접 μ •μ˜ν•  수 μžˆμŠ΅λ‹ˆλ‹€. ν…μŠ€νŠΈ 생성 λ§€κ°œλ³€μˆ˜λ₯Ό μ‘°μ •ν•¨μœΌλ‘œμ¨ μƒμ„±λœ ν…μŠ€νŠΈμ˜ λ°˜λ³΅μ„ 쀄이고 더 μΌκ΄€λ˜κ³  μ‚¬λžŒμ΄ λ§ν•˜λŠ” 것 같은 ν…μŠ€νŠΈλ₯Ό λ§Œλ“€ 수 μžˆμŠ΅λ‹ˆλ‹€. ν…μŠ€νŠΈ 생성 μ „λž΅κ³Ό λ§€κ°œλ³€μˆ˜λŠ” 이 κ°€μ΄λ“œμ˜ λ²”μœ„λ₯Ό λ²—μ–΄λ‚˜μ§€λ§Œ, λ‹€μŒ κ°€μ΄λ“œμ—μ„œ μ΄λŸ¬ν•œ μ£Όμ œμ— λŒ€ν•΄ μžμ„Ένžˆ μ•Œμ•„λ³Ό 수 μžˆμŠ΅λ‹ˆλ‹€:

ν”„λ‘¬ν”„νŒ…μ˜ 기초 [[basics-of-prompting]]

λͺ¨λΈμ˜ μœ ν˜• [[types-of-models]]

ν˜„λŒ€μ˜ λŒ€λΆ€λΆ„μ˜ λŒ€κ·œλͺ¨ μ–Έμ–΄ λͺ¨λΈμ€ λ””μ½”λ”λ§Œμ„ μ΄μš©ν•œ νŠΈλžœμŠ€ν¬λ¨Έμž…λ‹ˆλ‹€. 예λ₯Ό λ“€μ–΄ LLaMA, Llama2, Falcon, GPT2 등이 μžˆμŠ΅λ‹ˆλ‹€. κ·ΈλŸ¬λ‚˜ Flan-T5와 BART와 같은 인코더-디코더 기반의 트랜슀포머 λŒ€κ·œλͺ¨ μ–Έμ–΄ λͺ¨λΈμ„ μ ‘ν•  μˆ˜λ„ μžˆμŠ΅λ‹ˆλ‹€.

인코더-디코더 기반의 λͺ¨λΈμ€ 일반적으둜 좜λ ₯이 μž…λ ₯에 크게 μ˜μ‘΄ν•˜λŠ” 생성 μž‘μ—…μ— μ‚¬μš©λ©λ‹ˆλ‹€. 예λ₯Ό λ“€μ–΄, λ²ˆμ—­κ³Ό μš”μ•½ μž‘μ—…μ— μ‚¬μš©λ©λ‹ˆλ‹€. 디코더 μ „μš© λͺ¨λΈμ€ λ‹€λ₯Έ λͺ¨λ“  μœ ν˜•μ˜ 생성 μž‘μ—…μ— μ‚¬μš©λ©λ‹ˆλ‹€.

νŒŒμ΄ν”„λΌμΈμ„ μ‚¬μš©ν•˜μ—¬ λŒ€κ·œλͺ¨ μ–Έμ–΄ λͺ¨λΈμœΌλ‘œ ν…μŠ€νŠΈλ₯Ό 생성할 λ•Œ, μ–΄λ–€ μœ ν˜•μ˜ λŒ€κ·œλͺ¨ μ–Έμ–΄ λͺ¨λΈμ„ μ‚¬μš©ν•˜κ³  μžˆλŠ”μ§€ μ•„λŠ” 것이 μ€‘μš”ν•©λ‹ˆλ‹€. μ™œλƒν•˜λ©΄ 이듀은 μ„œλ‘œ λ‹€λ₯Έ νŒŒμ΄ν”„λΌμΈμ„ μ‚¬μš©ν•˜κΈ° λ•Œλ¬Έμž…λ‹ˆλ‹€.

To run inference with decoder-only models, use the text-generation pipeline:

>>> from transformers import pipeline
>>> import torch

>>> torch.manual_seed(0) # doctest: +IGNORE_RESULT

>>> generator = pipeline('text-generation', model='openai-community/gpt2')
>>> prompt = "Hello, I'm a language model"

>>> generator(prompt, max_length=30)
[{'generated_text': "Hello, I'm a language model programmer so you can use some of my stuff. But you also need some sort of a C program to run."}]

인코더-λ””μ½”λ”λ‘œ 좔둠을 μ‹€ν–‰ν•˜λ €λ©΄ text2text-generation νŒŒμ΄ν”„λΌμΈμ„ μ‚¬μš©ν•˜μ„Έμš”:

>>> text2text_generator = pipeline("text2text-generation", model='google/flan-t5-base')
>>> prompt = "Translate from English to French: I'm very happy to see you"

>>> text2text_generator(prompt)
[{'generated_text': 'Je suis très heureuse de vous rencontrer.'}]

κΈ°λ³Έ λͺ¨λΈ vs μ§€μ‹œ/μ±„νŒ… λͺ¨λΈ [[base-vs-instructchat-models]]

πŸ€— Hubμ—μ„œ 졜근 μ‚¬μš© κ°€λŠ₯ν•œ λŒ€λΆ€λΆ„μ˜ λŒ€κ·œλͺ¨ μ–Έμ–΄ λͺ¨λΈ μ²΄ν¬ν¬μΈνŠΈλŠ” κΈ°λ³Έ 버전과 μ§€μ‹œ(λ˜λŠ” μ±„νŒ…) 두 κ°€μ§€ 버전이 μ œκ³΅λ©λ‹ˆλ‹€. 예λ₯Ό λ“€μ–΄, tiiuae/falcon-7b와 tiiuae/falcon-7b-instructκ°€ μžˆμŠ΅λ‹ˆλ‹€.

κΈ°λ³Έ λͺ¨λΈμ€ 초기 ν”„λ‘¬ν”„νŠΈκ°€ μ£Όμ–΄μ‘Œμ„ λ•Œ ν…μŠ€νŠΈλ₯Ό μ™„μ„±ν•˜λŠ” 데 νƒμ›”ν•˜μ§€λ§Œ, μ§€μ‹œλ₯Ό 따라야 ν•˜κ±°λ‚˜ λŒ€ν™”ν˜• μ‚¬μš©μ΄ ν•„μš”ν•œ μžμ—°μ–΄ μ²˜λ¦¬μž‘μ—…μ—λŠ” 이상적이지 μ•ŠμŠ΅λ‹ˆλ‹€. μ΄λ•Œ μ§€μ‹œ(μ±„νŒ…) 버전이 ν•„μš”ν•©λ‹ˆλ‹€. μ΄λŸ¬ν•œ μ²΄ν¬ν¬μΈνŠΈλŠ” 사전 ν›ˆλ ¨λœ κΈ°λ³Έ 버전을 μ§€μ‹œμ‚¬ν•­κ³Ό λŒ€ν™” λ°μ΄ν„°λ‘œ μΆ”κ°€ λ―Έμ„Έ μ‘°μ •ν•œ κ²°κ³Όμž…λ‹ˆλ‹€. 이 좔가적인 λ―Έμ„Έ μ‘°μ •μœΌλ‘œ 인해 λ§Žμ€ μžμ—°μ–΄ 처리 μž‘μ—…μ— 더 μ ν•©ν•œ 선택이 λ©λ‹ˆλ‹€.

Let's illustrate a few simple prompts that you can use with tiiuae/falcon-7b-instruct to solve some common NLP tasks.

μžμ—°μ–΄ 처리 μž‘μ—… [[nlp-tasks]]

First, let's set up the environment:

pip install -q transformers accelerate

λ‹€μŒμœΌλ‘œ, μ μ ˆν•œ νŒŒμ΄ν”„λΌμΈ("text-generation")을 μ‚¬μš©ν•˜μ—¬ λͺ¨λΈμ„ λ‘œλ“œν•˜κ² μŠ΅λ‹ˆλ‹€:

>>> from transformers import pipeline, AutoTokenizer
>>> import torch

>>> torch.manual_seed(0) # doctest: +IGNORE_RESULT
>>> model = "tiiuae/falcon-7b-instruct"

>>> tokenizer = AutoTokenizer.from_pretrained(model)
>>> pipe = pipeline(
...     "text-generation",
...     model=model,
...     tokenizer=tokenizer,
...     torch_dtype=torch.bfloat16,
...     device_map="auto",
... )

Falcon λͺ¨λΈμ€ bfloat16 데이터 νƒ€μž…μ„ μ‚¬μš©ν•˜μ—¬ ν›ˆλ ¨λ˜μ—ˆμœΌλ―€λ‘œ, 같은 νƒ€μž…μ„ μ‚¬μš©ν•˜λŠ” 것을 ꢌμž₯ν•©λ‹ˆλ‹€. 이λ₯Ό μœ„ν•΄μ„œλŠ” μ΅œμ‹  λ²„μ „μ˜ CUDAκ°€ ν•„μš”ν•˜λ©°, μ΅œμ‹  κ·Έλž˜ν”½ μΉ΄λ“œμ—μ„œ κ°€μž₯ 잘 μž‘λ™ν•©λ‹ˆλ‹€.

이제 νŒŒμ΄ν”„λΌμΈμ„ 톡해 λͺ¨λΈμ„ λ‘œλ“œν–ˆμœΌλ‹ˆ, ν”„λ‘¬ν”„νŠΈλ₯Ό μ‚¬μš©ν•˜μ—¬ μžμ—°μ–΄ 처리 μž‘μ—…μ„ ν•΄κ²°ν•˜λŠ” 방법을 μ‚΄νŽ΄λ³΄κ² μŠ΅λ‹ˆλ‹€.

ν…μŠ€νŠΈ λΆ„λ₯˜ [[text-classification]]

ν…μŠ€νŠΈ λΆ„λ₯˜μ˜ κ°€μž₯ 일반적인 ν˜•νƒœ 쀑 ν•˜λ‚˜λŠ” 감정 λΆ„μ„μž…λ‹ˆλ‹€. μ΄λŠ” ν…μŠ€νŠΈ μ‹œν€€μŠ€μ— "긍정적", "뢀정적" λ˜λŠ” "쀑립적"κ³Ό 같은 λ ˆμ΄λΈ”μ„ ν• λ‹Ήν•©λ‹ˆλ‹€. μ£Όμ–΄μ§„ ν…μŠ€νŠΈ(μ˜ν™” 리뷰)λ₯Ό λΆ„λ₯˜ν•˜λ„둝 λͺ¨λΈμ— μ§€μ‹œν•˜λŠ” ν”„λ‘¬ν”„νŠΈλ₯Ό μž‘μ„±ν•΄ λ³΄κ² μŠ΅λ‹ˆλ‹€. λ¨Όμ € μ§€μ‹œμ‚¬ν•­μ„ μ œκ³΅ν•œ λ‹€μŒ, λΆ„λ₯˜ν•  ν…μŠ€νŠΈλ₯Ό μ§€μ •ν•˜κ² μŠ΅λ‹ˆλ‹€. μ—¬κΈ°μ„œ μ£Όλͺ©ν•  점은 λ‹¨μˆœνžˆ κ±°κΈ°μ„œ 끝내지 μ•Šκ³ , μ‘λ‹΅μ˜ μ‹œμž‘ 뢀뢄인 "Sentiment: "을 μΆ”κ°€ν•œλ‹€λŠ” κ²ƒμž…λ‹ˆλ‹€:

>>> torch.manual_seed(0)
>>> prompt = """Classify the text into neutral, negative or positive. 
... Text: This movie is definitely one of my favorite movies of its kind. The interaction between respectable and morally strong characters is an ode to chivalry and the honor code amongst thieves and policemen.
... Sentiment:
... """

>>> sequences = pipe(
...     prompt,
...     max_new_tokens=10,
... )

>>> for seq in sequences:
...     print(f"Result: {seq['generated_text']}")
Result: Classify the text into neutral, negative or positive. 
Text: This movie is definitely one of my favorite movies of its kind. The interaction between respectable and morally strong characters is an ode to chivalry and the honor code amongst thieves and policemen.
Sentiment:
Positive

As a result, the output contains a classification label from the list we provided in the instructions, and it is a correct one!

You may notice that in addition to the prompt, we pass a max_new_tokens parameter. It controls the number of tokens the model generates, and it is one of the many text generation parameters that you can learn about in the text generation strategies guide.
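
To see what this budget does in practice, here is a quick sketch that runs the same prompt with two different limits (the values are arbitrary, and outputs will vary between runs):

>>> # A smaller budget cuts the generation off earlier; a larger one lets the
>>> # model continue past the label. return_full_text=False skips echoing the prompt.
>>> for budget in (5, 20):
...     print(budget, pipe(prompt, max_new_tokens=budget, return_full_text=False)[0]["generated_text"])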

Named Entity Recognition [[named-entity-recognition]]

Named Entity Recognition (NER) is the task of finding named entities in a piece of text, such as a person, location, or organization. Let's modify the instructions in the prompt to make the LLM perform this task. Here, we'll also set return_full_text=False so that the output doesn't contain the prompt:

>>> torch.manual_seed(1) # doctest: +IGNORE_RESULT
>>> prompt = """Return a list of named entities in the text.
... Text: The Golden State Warriors are an American professional basketball team based in San Francisco.
... Named entities:
... """

>>> sequences = pipe(
...     prompt,
...     max_new_tokens=15,
...     return_full_text=False,
... )

>>> for seq in sequences:
...     print(f"{seq['generated_text']}")
- Golden State Warriors
- San Francisco

As you can see, the model correctly identified two named entities from the given text.
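
If you need the entities as a Python list rather than raw text, a little post-processing will do. This sketch assumes the model keeps the "- entity" bullet format shown above:

>>> # Parse the generated bullet list into a list of entity strings.
>>> entities = [
...     line.strip().lstrip("- ").strip()
...     for line in sequences[0]["generated_text"].splitlines()
...     if line.strip().startswith("-")
... ]
>>> # For the output above, this yields ['Golden State Warriors', 'San Francisco'].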

Translation [[translation]]

λŒ€κ·œλͺ¨ μ–Έμ–΄ λͺ¨λΈμ΄ μˆ˜ν–‰ν•  수 μžˆλŠ” 또 λ‹€λ₯Έ μž‘μ—…μ€ λ²ˆμ—­μž…λ‹ˆλ‹€. 이 μž‘μ—…μ„ μœ„ν•΄ 인코더-디코더 λͺ¨λΈμ„ μ‚¬μš©ν•  수 μžˆμ§€λ§Œ, μ—¬κΈ°μ„œλŠ” μ˜ˆμ‹œμ˜ λ‹¨μˆœμ„±μ„ μœ„ν•΄ κ½€ 쒋은 μ„±λŠ₯을 λ³΄μ΄λŠ” Falcon-7b-instructλ₯Ό 계속 μ‚¬μš©ν•˜κ² μŠ΅λ‹ˆλ‹€. λ‹€μ‹œ ν•œ 번, λͺ¨λΈμ—κ²Œ μ˜μ–΄μ—μ„œ μ΄νƒˆλ¦¬μ•„μ–΄λ‘œ ν…μŠ€νŠΈλ₯Ό λ²ˆμ—­ν•˜λ„λ‘ μ§€μ‹œν•˜λŠ” 기본적인 ν”„λ‘¬ν”„νŠΈλ₯Ό μž‘μ„±ν•˜λŠ” 방법은 λ‹€μŒκ³Ό κ°™μŠ΅λ‹ˆλ‹€:

>>> torch.manual_seed(2) # doctest: +IGNORE_RESULT
>>> prompt = """Translate the English text to Italian.
... Text: Sometimes, I've believed as many as six impossible things before breakfast.
... Translation:
... """

>>> sequences = pipe(
...     prompt,
...     max_new_tokens=20,
...     do_sample=True,
...     top_k=10,
...     return_full_text=False,
... )

>>> for seq in sequences:
...     print(f"{seq['generated_text']}")
A volte, ho creduto a sei impossibili cose prima di colazione.

Note that here we've added do_sample=True and top_k=10 to allow the model to be a bit more flexible when generating the output.
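
For comparison, here is a sketch of the same call with sampling disabled; decoding then becomes greedy and deterministic (the resulting translation may of course differ from the sampled one above):

>>> sequences = pipe(
...     prompt,
...     max_new_tokens=20,
...     do_sample=False,  # greedy decoding: the same prompt always yields the same output
...     return_full_text=False,
... )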

ν…μŠ€νŠΈ μš”μ•½ [[text-summarization]]

Similar to translation, text summarization is another generative task where the output heavily relies on the input, and encoder-decoder models can be a better choice. However, decoder-style models can be used for this task as well. Previously, we placed the instructions at the very beginning of the prompt. However, the very end of the prompt can also be a suitable location for instructions. Typically, it's better to place the instruction at one of the two extremities.

>>> torch.manual_seed(3) # doctest: +IGNORE_RESULT
>>> prompt = """Permaculture is a design process mimicking the diversity, functionality and resilience of natural ecosystems. The principles and practices are drawn from traditional ecological knowledge of indigenous cultures combined with modern scientific understanding and technological innovations. Permaculture design provides a framework helping individuals and communities develop innovative, creative and effective strategies for meeting basic needs while preparing for and mitigating the projected impacts of climate change.
... Write a summary of the above text.
... Summary:
... """

>>> sequences = pipe(
...     prompt,
...     max_new_tokens=30,
...     do_sample=True,
...     top_k=10,
...     return_full_text=False,
... )

>>> for seq in sequences:
...     print(f"{seq['generated_text']}")
Permaculture is an ecological design mimicking natural ecosystems to meet basic needs and prepare for climate change. It is based on traditional knowledge and scientific understanding.

Question answering [[question-answering]]

질의 응닡 μž‘μ—…μ„ μœ„ν•΄ ν”„λ‘¬ν”„νŠΈλ₯Ό λ‹€μŒκ³Ό 같은 논리적 κ΅¬μ„±μš”μ†Œλ‘œ ꡬ쑰화할 수 μžˆμŠ΅λ‹ˆλ‹€. μ§€μ‹œμ‚¬ν•­, λ§₯락, 질문, 그리고 λͺ¨λΈμ΄ λ‹΅λ³€ 생성을 μ‹œμž‘ν•˜λ„λ‘ μœ λ„ν•˜λŠ” 선도 λ‹¨μ–΄λ‚˜ ꡬ문("Answer:") 을 μ‚¬μš©ν•  수 μžˆμŠ΅λ‹ˆλ‹€:

>>> torch.manual_seed(4) # doctest: +IGNORE_RESULT
>>> prompt = """Answer the question using the context below.
... Context: Gazpacho is a cold soup and drink made of raw, blended vegetables. Most gazpacho includes stale bread, tomato, cucumbers, onion, bell peppers, garlic, olive oil, wine vinegar, water, and salt. Northern recipes often include cumin and/or pimentΓ³n (smoked sweet paprika). Traditionally, gazpacho was made by pounding the vegetables in a mortar with a pestle; this more laborious method is still sometimes used as it helps keep the gazpacho cool and avoids the foam and silky consistency of smoothie versions made in blenders or food processors.
... Question: What modern tool is used to make gazpacho?
... Answer:
... """

>>> sequences = pipe(
...     prompt,
...     max_new_tokens=10,
...     do_sample=True,
...     top_k=10,
...     return_full_text=False,
... )

>>> for seq in sequences:
...     print(f"Result: {seq['generated_text']}")
Result: Modern tools often used to make gazpacho include
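
If you build prompts like this often, the structure can be factored into a small helper. The build_qa_prompt function below is a hypothetical convenience for illustration, not a library API:

>>> def build_qa_prompt(instructions, context, question):
...     # Assemble the logical components in the order used above.
...     return f"""{instructions}
... Context: {context}
... Question: {question}
... Answer:
... """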

Reasoning [[reasoning]]

Reasoning is one of the most difficult tasks for LLMs, and achieving good results often requires applying advanced prompting techniques, like Chain-of-thought (CoT). Let's see if we can make the model reason about a simple arithmetic task with a basic prompt:

>>> torch.manual_seed(5) # doctest: +IGNORE_RESULT
>>> prompt = """There are 5 groups of students in the class. Each group has 4 students. How many students are there in the class?"""

>>> sequences = pipe(
...     prompt,
...     max_new_tokens=30,
...     do_sample=True,
...     top_k=10,
...     return_full_text=False,
... )

>>> for seq in sequences:
...     print(f"Result: {seq['generated_text']}")
Result: 
There are a total of 5 groups, so there are 5 x 4=20 students in the class.

The model generated a correct answer! Let's increase the complexity a little and see if we can still get away with a basic prompt:

>>> torch.manual_seed(6)
>>> prompt = """I baked 15 muffins. I ate 2 muffins and gave 5 muffins to a neighbor. My partner then bought 6 more muffins and ate 2. How many muffins do we now have?"""

>>> sequences = pipe(
...     prompt,
...     max_new_tokens=10,
...     do_sample=True,
...     top_k=10,
...     return_full_text=False,
... )

>>> for seq in sequences:
...     print(f"Result: {seq['generated_text']}")
Result: 
The total number of muffins now is 21

This is a wrong answer; it should be 12. In this case, the problem may be due to the prompt being too basic, or due to the choice of model; after all, we've picked the smallest version of Falcon. Reasoning is difficult for models of all sizes, but larger models are likely to perform better.

λŒ€κ·œλͺ¨ μ–Έμ–΄ λͺ¨λΈ ν”„λ‘¬ν”„νŠΈ μž‘μ„±μ˜ λͺ¨λ²” 사둀 [[best-practices-of-llm-prompting]]

이 μ„Ήμ…˜μ—μ„œλŠ” ν”„λ‘¬ν”„νŠΈ κ²°κ³Όλ₯Ό ν–₯μƒμ‹œν‚¬ 수 μžˆλŠ” λͺ¨λ²” 사둀 λͺ©λ‘μ„ μž‘μ„±ν–ˆμŠ΅λ‹ˆλ‹€:

  • When choosing a model to work with, the latest and most capable models are likely to perform better.
  • Start with a simple and short prompt, and iterate from there.
  • Put the instructions at the beginning of the prompt, or at the very end. When working with large context, models apply various optimizations to prevent the attention complexity from scaling quadratically. This may make the model more attentive to the beginning or end of a prompt than to the middle.
  • Clearly separate the instructions from the text they apply to (see the delimiter sketch after this list; more on this in the next section).
  • Be specific and descriptive about the task and the desired outcome: its format, length, style, language, and so on.
  • Avoid ambiguous descriptions and instructions.
  • Favor instructions that say "what to do" over instructions that say "what not to do".
  • "Lead" the output in the right direction by writing the first word (or even beginning the first sentence) for the model.
  • Use advanced techniques like few-shot prompting and Chain-of-thought (CoT).
  • Test your prompts with different models to assess their robustness.
  • Version and track the performance of your prompts.

Advanced prompting techniques [[advanced-prompting-techniques]]

Few-shot prompting [[few-shot-prompting]]

μœ„ μ„Ήμ…˜μ˜ κΈ°λ³Έ ν”„λ‘¬ν”„νŠΈλ“€μ€ "μ œλ‘œμƒ·(Zero-shot)" ν”„λ‘¬ν”„νŠΈμ˜ μ˜ˆμ‹œμž…λ‹ˆλ‹€. μ΄λŠ” λͺ¨λΈμ— μ§€μ‹œμ‚¬ν•­κ³Ό λ§₯락은 μ£Όμ–΄μ‘Œμ§€λ§Œ, 해결책이 ν¬ν•¨λœ μ˜ˆμ‹œλŠ” μ œκ³΅λ˜μ§€ μ•Šμ•˜λ‹€λŠ” μ˜λ―Έμž…λ‹ˆλ‹€. μ§€μ‹œ λ°μ΄ν„°μ…‹μœΌλ‘œ λ―Έμ„Έ μ‘°μ •λœ λŒ€κ·œλͺ¨ μ–Έμ–΄ λͺ¨λΈμ€ 일반적으둜 μ΄λŸ¬ν•œ "μ œλ‘œμƒ·" μž‘μ—…μ—μ„œ 쒋은 μ„±λŠ₯을 λ³΄μž…λ‹ˆλ‹€. ν•˜μ§€λ§Œ μ—¬λŸ¬λΆ„μ˜ μž‘μ—…μ΄ 더 λ³΅μž‘ν•˜κ±°λ‚˜ λ―Έλ¬˜ν•œ 차이가 μžˆμ„ 수 있고, μ•„λ§ˆλ„ μ§€μ‹œμ‚¬ν•­λ§ŒμœΌλ‘œλŠ” λͺ¨λΈμ΄ ν¬μ°©ν•˜μ§€ λͺ»ν•˜λŠ” 좜λ ₯에 λŒ€ν•œ μš”κ΅¬μ‚¬ν•­μ΄ μžˆμ„ 수 μžˆμŠ΅λ‹ˆλ‹€. 이런 κ²½μš°μ—λŠ” 퓨샷(Few-shot) ν”„λ‘¬ν”„νŒ…μ΄λΌλŠ” 기법을 μ‹œλ„ν•΄ λ³Ό 수 μžˆμŠ΅λ‹ˆλ‹€.

퓨샷 ν”„λ‘¬ν”„νŒ…μ—μ„œλŠ” ν”„λ‘¬ν”„νŠΈμ— μ˜ˆμ‹œλ₯Ό μ œκ³΅ν•˜μ—¬ λͺ¨λΈμ— 더 λ§Žμ€ λ§₯락을 μ£Όκ³  μ„±λŠ₯을 ν–₯μƒμ‹œν‚΅λ‹ˆλ‹€. 이 μ˜ˆμ‹œλ“€μ€ λͺ¨λΈμ΄ μ˜ˆμ‹œμ˜ νŒ¨ν„΄μ„ 따라 좜λ ₯을 μƒμ„±ν•˜λ„λ‘ μ‘°κ±΄ν™”ν•©λ‹ˆλ‹€.

λ‹€μŒμ€ μ˜ˆμ‹œμž…λ‹ˆλ‹€:

>>> torch.manual_seed(0) # doctest: +IGNORE_RESULT
>>> prompt = """Text: The first human went into space and orbited the Earth on April 12, 1961.
... Date: 04/12/1961
... Text: The first-ever televised presidential debate in the United States took place on September 28, 1960, between presidential candidates John F. Kennedy and Richard Nixon. 
... Date:"""

>>> sequences = pipe(
...     prompt,
...     max_new_tokens=8,
...     do_sample=True,
...     top_k=10,
... )

>>> for seq in sequences:
...     print(f"Result: {seq['generated_text']}")
Result: Text: The first human went into space and orbited the Earth on April 12, 1961.
Date: 04/12/1961
Text: The first-ever televised presidential debate in the United States took place on September 28, 1960, between presidential candidates John F. Kennedy and Richard Nixon. 
Date: 09/28/1960

μœ„μ˜ μ½”λ“œ μŠ€λ‹ˆνŽ«μ—μ„œλŠ” λͺ¨λΈμ— μ›ν•˜λŠ” 좜λ ₯을 보여주기 μœ„ν•΄ 단일 μ˜ˆμ‹œλ₯Ό μ‚¬μš©ν–ˆμœΌλ―€λ‘œ, 이λ₯Ό "원샷(One-shot)" ν”„λ‘¬ν”„νŒ…μ΄λΌκ³  λΆ€λ₯Ό 수 μžˆμŠ΅λ‹ˆλ‹€. κ·ΈλŸ¬λ‚˜ μž‘μ—…μ˜ λ³΅μž‘μ„±μ— 따라 ν•˜λ‚˜ μ΄μƒμ˜ μ˜ˆμ‹œλ₯Ό μ‚¬μš©ν•΄μ•Ό ν•  μˆ˜λ„ μžˆμŠ΅λ‹ˆλ‹€.

퓨샷 ν”„λ‘¬ν”„νŒ… κΈ°λ²•μ˜ ν•œκ³„:

  • While LLMs can pick up on the patterns in the examples, this technique doesn't work well on complex reasoning tasks.
  • Few-shot prompting requires creating lengthy prompts. Prompts with a large number of tokens can increase computation and latency, and there is also a limit on prompt length.
  • Sometimes, when given a number of examples, a model can learn patterns that you did not intend it to learn, e.g. that the third movie review is always negative.

μƒκ°μ˜ μ‚¬μŠ¬(Chain-of-thought, CoT) [[chain-of-thought]]

μƒκ°μ˜ μ‚¬μŠ¬(Chain-of-thought, CoT) ν”„λ‘¬ν”„νŒ…μ€ λͺ¨λΈμ΄ 쀑간 μΆ”λ‘  단계λ₯Ό μƒμ„±ν•˜λ„λ‘ μœ λ„ν•˜λŠ” κΈ°λ²•μœΌλ‘œ, λ³΅μž‘ν•œ μΆ”λ‘  μž‘μ—…μ˜ κ²°κ³Όλ₯Ό κ°œμ„ ν•©λ‹ˆλ‹€.

λͺ¨λΈμ΄ μΆ”λ‘  단계λ₯Ό μƒμ„±ν•˜λ„λ‘ μœ λ„ν•˜λŠ” 두 κ°€μ§€ 방법이 μžˆμŠ΅λ‹ˆλ‹€:

  • few-shot prompting, by illustrating examples with detailed answers to questions, showing the model how to work through a problem.
  • instructing the model to reason, by adding phrases like "Let's think step by step" or "Take a deep breath and work through the problem step by step."

If we apply the CoT technique to the muffin example from the reasoning section and use a bigger model, such as tiiuae/falcon-180B-chat, which you can play with in HuggingChat, we get a significant improvement on the reasoning result:

Let's go through this step by step:
1. You start with 15 muffins.
2. You eat 2 muffins, leaving you with 13 muffins.
3. You give 5 muffins to your neighbor, leaving you with 8 muffins.
4. Your partner buys 6 more muffins, bringing the total number of muffins to 14.
5. Your partner eats 2 muffins, leaving you with 12 muffins.
Therefore, you now have 12 muffins.

Prompting vs fine-tuning [[prompting-vs-fine-tuning]]

You can achieve great results by optimizing your prompts; however, you may still wonder whether fine-tuning a model would work better for your case. Here are some scenarios where fine-tuning a smaller model may be the preferred option:

  • Your domain is wildly different from what LLMs were pretrained on, and extensive prompt optimization did not yield sufficient results.
  • You need your model to work well in a low-resource language.
  • You need the model to be trained on sensitive data that is under strict regulations.
  • You have to use a small model due to cost, privacy, infrastructure, or other limitations.

μœ„μ˜ λͺ¨λ“  μ˜ˆμ‹œμ—μ„œ, λͺ¨λΈμ„ λ―Έμ„Έ μ‘°μ •ν•˜κΈ° μœ„ν•΄ μΆ©λΆ„νžˆ 큰 도메인별 데이터셋을 이미 κ°€μ§€κ³  μžˆκ±°λ‚˜ 합리적인 λΉ„μš©μœΌλ‘œ μ‰½κ²Œ 얻을 수 μžˆλŠ”μ§€ 확인해야 ν•©λ‹ˆλ‹€. λ˜ν•œ λͺ¨λΈμ„ λ―Έμ„Έ μ‘°μ •ν•  μΆ©λΆ„ν•œ μ‹œκ°„κ³Ό μžμ›μ΄ ν•„μš”ν•©λ‹ˆλ‹€.

λ§Œμ•½ μœ„μ˜ μ˜ˆμ‹œλ“€μ΄ μ—¬λŸ¬λΆ„μ˜ κ²½μš°μ— ν•΄λ‹Ήν•˜μ§€ μ•ŠλŠ”λ‹€λ©΄, ν”„λ‘¬ν”„νŠΈλ₯Ό μ΅œμ ν™”ν•˜λŠ” 것이 더 μœ μ΅ν•  수 μžˆμŠ΅λ‹ˆλ‹€.