Tools and RAG[[Tools-and-RAG]]
The [~PreTrainedTokenizerBase.apply_chat_template] method supports virtually any kind of additional argument beyond chat messages, including strings, lists, and dicts. This makes it possible to use chat templates in a wide range of situations.
This guide shows how to use chat templates with tools and retrieval-augmented generation (RAG).
Tools[[Tools]]
Tools are functions a large language model (LLM) can call to perform specific tasks. They are a powerful way to extend the capabilities of a conversational agent with real-time information, computational tools, or access to large databases.
Follow the rules below when creating a tool.
- The function should have a descriptive name.
- The function arguments must have type hints in the function header (don't include them in the Args block).
- The function must have a Google-style docstring.
- The function can include a return type and a Returns block, but these are optional because most tool-use models ignore them.
Example tools for getting the current temperature and wind speed at a given location are shown below.
def get_current_temperature(location: str, unit: str) -> float:
    """
    Get the current temperature at a location.

    Args:
        location: The location to get the temperature for, in the format "City, Country"
        unit: The unit to return the temperature in. (Choices: ["celsius", "fahrenheit"])

    Returns:
        The current temperature at the specified location in the specified units, as a float.
    """
    return 22.  # A real function would probably actually be fetching the temperature!

def get_current_wind_speed(location: str) -> float:
    """
    Get the current wind speed in km/h at a given location.

    Args:
        location: The location to get the wind speed for, in the format "City, Country"

    Returns:
        The current wind speed at the given location in km/h, as a float.
    """
    return 6.  # A real function would probably actually be fetching the wind speed!
tools = [get_current_temperature, get_current_wind_speed]
Load a model and tokenizer that support tool use, such as NousResearch/Hermes-2-Pro-Llama-3-8B. You can also consider larger models like Command-R or Mixtral-8x22B if your hardware supports them.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("NousResearch/Hermes-2-Pro-Llama-3-8B")
model = AutoModelForCausalLM.from_pretrained("NousResearch/Hermes-2-Pro-Llama-3-8B", torch_dtype=torch.bfloat16, device_map="auto")
Create the chat messages.
messages = [
{"role": "system", "content": "You are a bot that responds to weather queries. You should reply with the unit used in the queried location."},
{"role": "user", "content": "Hey, what's the temperature in Paris right now?"}
]
Pass messages and the list of tools to [~PreTrainedTokenizerBase.apply_chat_template], then use the result as input to the model to generate text.
inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}  # move the input tensors to the model's device
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][len(inputs["input_ids"][0]):]))
<tool_call>
{"arguments": {"location": "Paris, France", "unit": "celsius"}, "name": "get_current_temperature"}
</tool_call><|im_end|>
The chat model called the get_current_temperature function with the correct parameters, following the format defined in its docstring. It inferred France as the location because of Paris, and it decided the temperature should be reported in Celsius.
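The raw output wraps the call in `<tool_call>` tags, a format specific to Hermes-style templates (other models format tool calls differently). A minimal sketch of extracting the call as a dict, assuming output shaped like the example above:

```python
import json
import re

# Raw generation text shaped like the Hermes-style output shown above.
generated = ('<tool_call>\n'
             '{"arguments": {"location": "Paris, France", "unit": "celsius"}, '
             '"name": "get_current_temperature"}\n'
             '</tool_call><|im_end|>')

# Pull the JSON payload out of the <tool_call>...</tool_call> wrapper.
match = re.search(r"<tool_call>\s*(\{.*\})\s*</tool_call>", generated, re.DOTALL)
tool_call = json.loads(match.group(1))

print(tool_call["name"])       # get_current_temperature
print(tool_call["arguments"])  # {'location': 'Paris, France', 'unit': 'celsius'}
```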
Now append the get_current_temperature function call and its arguments to the chat messages as a tool_call. The tool_call dictionary must be provided in the assistant role, not system or user.
The OpenAI API uses a JSON string as its tool_call format. Using a JSON string in Transformers may cause errors or strange model behavior, because Transformers expects a dictionary.
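The difference between the two formats is easy to see side by side; in the sketch below, the OpenAI-style dict is only for illustration:

```python
import json

# Transformers-style: the arguments are a plain dict.
transformers_call = {
    "name": "get_current_temperature",
    "arguments": {"location": "Paris, France", "unit": "celsius"},
}

# OpenAI-style: the arguments are serialized into a JSON string.
openai_call = {
    "name": "get_current_temperature",
    "arguments": json.dumps(transformers_call["arguments"]),
}

print(type(transformers_call["arguments"]))  # <class 'dict'>
print(type(openai_call["arguments"]))        # <class 'str'>
```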
tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France", "unit": "celsius"}}
messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]})
Allow the assistant to read the function output and chat with the user.
inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}  # move the input tensors to the model's device
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):]))
The temperature in Paris, France right now is approximately 12Β°C (53.6Β°F).<|im_end|>
Mistral and Mixtral models additionally require a tool_call_id. The tool_call_id is a randomly generated 9-character alphanumeric string assigned to the id key of the tool_call dictionary.
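Any 9 random alphanumeric characters will do; a quick way to generate one (a sketch, the hardcoded string below works just as well):

```python
import random
import string

# Mistral/Mixtral templates expect a 9-character alphanumeric tool call ID.
tool_call_id = "".join(random.choices(string.ascii_letters + string.digits, k=9))
print(tool_call_id)  # e.g. "9Ae3bDc2F"
```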
tool_call_id = "9Ae3bDc2F"
tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France", "unit": "celsius"}}
messages.append({"role": "assistant", "tool_calls": [{"type": "function", "id": tool_call_id, "function": tool_call}]})
inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}  # move the input tensors to the model's device
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):]))
Schema[[Schema]]
[~PreTrainedTokenizerBase.apply_chat_template] converts functions into a JSON schema, which is passed to the chat template. An LLM never sees the code inside a function. In other words, the LLM doesn't care how the function works technically; it only cares about the function's definition and arguments.
If your functions follow the rules listed earlier, a JSON schema is generated automatically under the hood. For better readability or debugging, though, you can convert the schema manually with get_json_schema.
from transformers.utils import get_json_schema

def multiply(a: float, b: float):
    """
    A function that multiplies two numbers

    Args:
        a: The first number to multiply
        b: The second number to multiply
    """
    return a * b

schema = get_json_schema(multiply)
print(schema)
{
    "type": "function",
    "function": {
        "name": "multiply",
        "description": "A function that multiplies two numbers",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {
                    "type": "number",
                    "description": "The first number to multiply"
                },
                "b": {
                    "type": "number",
                    "description": "The second number to multiply"
                }
            },
            "required": ["a", "b"]
        }
    }
}
You can edit the schema, or even write it entirely from scratch. This gives you the flexibility to define precise schemas for more complex functions.
Try to keep your function signatures simple and the number of arguments to a minimum. These are easier for a model to understand and use than complex functions, such as ones with nested arguments.
The example below shows writing a schema manually and then passing it to [~PreTrainedTokenizerBase.apply_chat_template].
# A simple function that takes no arguments
current_time = {
    "type": "function",
    "function": {
        "name": "current_time",
        "description": "Get the current local time as a string.",
        "parameters": {
            'type': 'object',
            'properties': {}
        }
    }
}

# A more complete function that takes two numerical arguments
multiply = {
    'type': 'function',
    'function': {
        'name': 'multiply',
        'description': 'A function that multiplies two numbers',
        'parameters': {
            'type': 'object',
            'properties': {
                'a': {
                    'type': 'number',
                    'description': 'The first number to multiply'
                },
                'b': {
                    'type': 'number',
                    'description': 'The second number to multiply'
                }
            },
            'required': ['a', 'b']
        }
    }
}

model_input = tokenizer.apply_chat_template(
    messages,
    tools=[current_time, multiply]
)
RAG[[RAG]]
Retrieval-augmented generation (RAG) models retrieve documents for additional information before responding to a query, extending the knowledge the model already has. For RAG models, add a documents parameter to [~PreTrainedTokenizerBase.apply_chat_template]. The documents parameter should be a list of documents, where each document is a single dict with title and text keys.
The documents parameter for RAG is not widely supported, and many models have chat templates that simply ignore documents. To verify whether a model supports documents, read the model card or run print(tokenizer.chat_template) to check whether a documents key is present. Command-R and Command-R+ both support documents in their RAG chat templates.
Create a list of documents to pass to the model.
documents = [
{
"title": "The Moon: Our Age-Old Foe",
"text": "Man has always dreamed of destroying the moon. In this essay, I shall..."
},
{
"title": "The Sun: Our Age-Old Friend",
"text": "Although often underappreciated, the sun provides several notable benefits..."
}
]
Set chat_template="rag" in [~PreTrainedTokenizerBase.apply_chat_template] and generate a response.
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("CohereForAI/c4ai-command-r-v01-4bit")
model = AutoModelForCausalLM.from_pretrained("CohereForAI/c4ai-command-r-v01-4bit", device_map="auto")
device = model.device  # Get the device the model is loaded on

# Define the conversation input
conversation = [
    {"role": "user", "content": "What has Man always dreamed of?"}
]

input_ids = tokenizer.apply_chat_template(
    conversation=conversation,
    documents=documents,
    chat_template="rag",
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt").to(device)

# Generate a response
generated_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

# Decode the generated text, including the generation prompt
generated_text = tokenizer.decode(generated_tokens[0])
print(generated_text)
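Note that tokenizer.decode(generated_tokens[0]) returns the full sequence, prompt included. As in the tool examples earlier, you can keep only the newly generated part by slicing off the prompt length; a toy illustration with plain lists (no model needed):

```python
# Stand-in prompt token ids and the full generated sequence (prompt + new tokens).
input_ids = [101, 2054, 2038]             # hypothetical prompt ids
generated = [101, 2054, 2038, 7592, 102]  # model output echoes the prompt first

# Keep only the tokens generated after the prompt.
new_tokens = generated[len(input_ids):]
print(new_tokens)  # [7592, 102]
```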