Instructions to use rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot")
model = AutoModelForCausalLM.from_pretrained("rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot
- SGLang
How to use rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
- Docker Model Runner
How to use rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot with Docker Model Runner:
docker model run hf.co/rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot
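The curl calls above can also be issued from Python. What follows is a minimal sketch using only the standard library. It assumes a vLLM or SGLang server started as described above is already listening locally; the helper names `build_chat_request` and `send` are hypothetical and not part of either project.

```python
import json
import urllib.request

MODEL_ID = "rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot"


def build_chat_request(base_url: str, user_content: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_content}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the text of the first choice (needs a running server)."""
    with urllib.request.urlopen(request) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Port 8000 matches the vLLM example; use 30000 for the SGLang example.
req = build_chat_request("http://localhost:8000", "What is the capital of France?")
print(req.full_url)
```

Calling `send(req)` performs the actual HTTP request, so it only succeeds once one of the servers is up.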
Llama-3.2-3B-Instruct-Legal-Chatbot
Model Description
This model is based on meta-llama/Llama-3.2-3B-Instruct and was fine-tuned with LoRA to specialize in Korean civil law domain knowledge.
It provides expert answers to users' legal questions and was trained to extract and present the relevant legal entities alongside them, such as applicable statutes, precedents, and courts of jurisdiction.
- Base Model: meta-llama/Llama-3.2-3B-Instruct
- Language: Korean (ko)
- Task: Question Answering, Text Generation, Legal Entity Extraction
- Training Method: Supervised Fine-Tuning (SFT) with PEFT(LoRA)
Intended Use & Limitations
Intended Use
- Basic legal question answering on Korean civil law
- Summarizing legal documents and extracting key statute and precedent numbers
- Reference for research and education on legal AI assistants
Limitations & Ethical Considerations
- Caution: This model was developed for educational and research purposes and cannot replace professional legal counsel or a lawyer.
- As with any LLM, hallucination can occur, so the model may generate statutes or precedents that do not exist. If actual legal action is required, be sure to seek advice from a legal professional (such as a lawyer).
Training Details
Training Data
- Source: AI Hub (Korean civil law question-answering and precedent-labeling data)
- Question-answering category: 75,624 items
- rudalson/legal-qa-1k-dataset: a dataset sampled for testing
- Preprocessing: Article numbers, dates, and monetary amounts were normalized, and the data was converted to the Llama 3 chat template (system-user-assistant) structure for training.
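The preprocessing step can be pictured with a short sketch. The card does not give the actual normalization rules, field names, or training system prompt, so everything below is an assumption for illustration: `normalize_article_numbers` shows one hypothetical rule (removing spaces inside article references), `SYSTEM_PROMPT` is a placeholder, and the `messages` layout is the conversational format that chat templates consume.

```python
import re

# Placeholder: the system prompt used during training is not stated in the card.
SYSTEM_PROMPT = "You are a Korean legal expert AI assistant."


def normalize_article_numbers(text: str) -> str:
    """Hypothetical rule: collapse spaces inside article references such as '제 750 조'."""
    return re.sub(r"제\s*(\d+)\s*조", r"제\1조", text)


def to_chat_example(question: str, answer: str) -> dict:
    """Wrap one Q&A pair in the system-user-assistant message structure."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": normalize_article_numbers(question.strip())},
            {"role": "assistant", "content": normalize_article_numbers(answer.strip())},
        ]
    }


example = to_chat_example(
    "민법 제 750 조의 내용은?",
    "민법 제750조는 불법행위 책임을 규정합니다.",
)
print([m["role"] for m in example["messages"]])
```

A list of such dictionaries can then be rendered to text with `tokenizer.apply_chat_template`, the same call used in the inference example of this card.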
Training Procedure & Hyperparameters
The model was trained with Hugging Face's peft and trl (SFTTrainer) libraries.
- LoRA Parameters:
  - r: 16
  - lora_alpha: 32
  - target_modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  - lora_dropout: 0.05
- Training Hyperparameters:
  - learning_rate: 2e-4
  - num_train_epochs: 1
  - per_device_train_batch_size: 4
  - gradient_accumulation_steps: 4
  - optimizer: adamw_torch
  - fp16/bfloat16: Enabled
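The hyperparameters listed above can be collected into a small configuration sketch. This is not the author's training script: `task_type` is an assumption, the step estimate assumes a single device and that all 75,624 question-answering items were used for training, and `build_lora_config` imports `peft` lazily so the arithmetic runs without it installed.

```python
import math

# Values taken from the card, except where marked as assumptions.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "lora_dropout": 0.05,
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}

TRAIN_KWARGS = {
    "learning_rate": 2e-4,
    "num_train_epochs": 1,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 4,
    "optim": "adamw_torch",
}


def effective_batch_size(num_devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return (
        TRAIN_KWARGS["per_device_train_batch_size"]
        * TRAIN_KWARGS["gradient_accumulation_steps"]
        * num_devices
    )


def optimizer_steps(num_examples: int, num_devices: int = 1) -> int:
    """Rough number of optimizer steps over all epochs."""
    per_epoch = math.ceil(num_examples / effective_batch_size(num_devices))
    return per_epoch * TRAIN_KWARGS["num_train_epochs"]


def build_lora_config():
    """Create a peft.LoraConfig from the values above (requires peft)."""
    from peft import LoraConfig

    return LoraConfig(**LORA_KWARGS)


print(effective_batch_size())                        # examples per optimizer step
print(LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"])  # LoRA scaling factor alpha / r
print(optimizer_steps(75_624))                       # estimated steps for one epoch
```

With these settings each optimizer step sees 16 examples on one device, and the LoRA update is scaled by alpha / r = 2.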
Evaluation
Evaluation was based on ROUGE scores and on the accuracy of legal entity extraction (statutes, precedents, courts, and so on).
- ROUGE-1: 0.2222
- ROUGE-L: 0.2222
How to Get Started with the Model
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Path of the model uploaded to Hugging Face
model_name = "rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot"

# 1. Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    low_cpu_mem_usage=True,
)

# 2. Inference test
# Prompt (Korean): "Can I claim damages when a contract is rescinded?"
prompt = "계약 해제 시 손해배상을 청구할 수 있나요?"
messages = [
    # System prompt (Korean): "You are a Korean legal expert AI assistant.
    # Provide accurate and professional answers to the user's questions."
    {"role": "system", "content": "당신은 한국 법률 전문가 AI 어시스턴트입니다. 사용자의 질문에 대해 정확하고 전문적인 답변을 제공하세요."},
    {"role": "user", "content": prompt},
]

# Apply the chat template
formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# The rendered template already contains the BOS token, so do not add it again
inputs = tokenizer(formatted_prompt, return_tensors="pt", add_special_tokens=False).to(model.device)

# Generate the answer
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.1,
        do_sample=True,
    )

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(f"Question: {prompt}")
print(f"Answer: {response.strip()}")