Writing AI (Onjeom)
A LoRA adapter that scores essay-type answers for the Korean language arts subject and generates tailored feedback.
It was fine-tuned from Meta-Llama-3.1-8B-bnb-4bit with QLoRA (Unsloth).
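Training data
- AI Hub essay-scoring data plus self-collected data (balanced across scores 1 to 4)
- Processed into the format: instruction + student answer + feedback and final score
- Filtering applied for "gaming" answers (fake score-1 cases) such as meaningless word repetition and character-count padding, plus a 700-character length safeguard (v2)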
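The record layout described above can be sketched as follows. This is a minimal illustration, not the project's actual preprocessing script: the helper `build_example` and its argument names are hypothetical, the template is the Alpaca-style prompt from the usage section, and the closing `[최종 점수: N]` ("final score") line follows the output format documented further down.

```python
ALPACA_PROMPT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n### Instruction:\n{}\n\n### Input:\n{}\n\n### Response:\n{}"
)
MAX_ANSWER_CHARS = 700  # length safeguard mentioned in the data notes


def build_example(instruction: str, answer: str, feedback: str, score: int) -> str:
    """Render one training record as a single Alpaca-style string (hypothetical helper)."""
    if score not in (1, 2, 3, 4):
        raise ValueError("score must be between 1 and 4")
    # Feedback first, then the bracketed final-score line.
    response = f"{feedback.strip()}\n[최종 점수: {score}]"
    return ALPACA_PROMPT.format(instruction, answer[:MAX_ANSWER_CHARS], response)


example = build_example(
    instruction="(scoring instruction)",
    answer="선인장은 줄기에 물을 저장해서 사막에서 살 수 있다.",
    feedback="(feedback text)",
    score=3,
)
print(example)
```

In actual fine-tuning an end-of-sequence token is normally appended to each example; that step is left out here.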
Performance
| Metric | Value | Notes |
|---|---|---|
| Adjacent accuracy (±1 point) | 87.5% | Practical pass line, on par with human raters |
| Score-1 hit rate | 86.0% | |
| Macro F1-Score | 0.5348 | |
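For reference, the three metrics in the table can be computed roughly as sketched below. This assumes that the score-1 hit rate means recall on answers whose gold score is 1, which the source does not state explicitly. The `gold` and `pred` lists are made-up toy data, not the evaluation set, so the printed values are unrelated to the table.

```python
def adjacent_accuracy(gold, pred, tolerance=1):
    """Share of predictions within `tolerance` points of the gold score."""
    return sum(abs(g - p) <= tolerance for g, p in zip(gold, pred)) / len(gold)


def class_recall(gold, pred, label):
    """Share of items with gold == label that were also predicted as label."""
    hits = [p == label for g, p in zip(gold, pred) if g == label]
    return sum(hits) / len(hits) if hits else 0.0


def macro_f1(gold, pred, labels=(1, 2, 3, 4)):
    """Unweighted mean of the per-class F1 scores."""
    f1s = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return sum(f1s) / len(f1s)


gold = [1, 1, 2, 3, 4, 4]  # toy gold scores
pred = [1, 2, 2, 4, 4, 2]  # toy model scores
print(adjacent_accuracy(gold, pred))
print(class_recall(gold, pred, 1))
print(macro_f1(gold, pred))
```

Macro averaging weights every score class equally, so one poorly predicted class lowers it sharply even when adjacent accuracy is high.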
Training configuration
| Item | Value |
|---|---|
| Method | QLoRA (4-bit NF4) |
| LoRA rank | 32 |
| LoRA alpha | 32 |
| Epochs | 3 |
| Learning rate | 3e-5 |
| Optimizer | adamw_8bit |
| Max length | 1536 |
| Framework | unsloth / trl SFTTrainer |
Usage
import torch
from unsloth import FastLanguageModel

ADAPTER_PATH = "Onjeom/essay_scoring"

# Load the model and tokenizer in a single call; loading twice wastes time and GPU memory.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=ADAPTER_PATH,
    max_seq_length=1536,
    load_in_4bit=True,
    device_map="cuda",
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference mode

# Scoring instruction, kept in Korean (the language of the answers being scored).
# It asks for feedback covering weaknesses and improvements, then a final score
# from 1 to 4 under a deliberately lenient rubric.
RELAXED_INSTRUCTION = """주어진 지시문과 학생의 답안을 분석하여, 부족한 점과 개선 방향을 포함한 피드백을 작성하고 맨 마지막에 1점부터 4점 사이의 최종 점수를 부여하시오.
[유연하고 관대한 채점 기준]
- 4점: 지시문의 핵심 요구사항을 잘 파악하였고 전반적인 흐름이 우수한 답안 (사소한 결함은 너그럽게 만점 처리)
- 3점: 지시문을 이해했으나 근거가 다소 평이하거나 논리적 깊이가 아쉬운 일반적인 답안
- 2점: 지시문의 키워드만 겨우 나열하거나 주장의 근거가 심각하게 부족한 답안
- 1점(최하점): 같은 말을 무의미하게 반복하거나 꼼수가 명백한 답안"""

ALPACA_PROMPT = "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\n{}\n\n### Input:\n{}\n\n### Response:\n{}"

# Sample answer: "A cactus can live in the desert because it stores water in its stem."
student_answer = "선인장은 줄기에 물을 저장해서 사막에서 살 수 있다."
safe_input = student_answer[:700]  # cap at 700 characters to avoid OOM

prompt = ALPACA_PROMPT.format(RELAXED_INSTRUCTION, safe_input, "")
inputs = tokenizer([prompt], return_tensors="pt").to("cuda")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,  # temperature and top_p only take effect when sampling is on
        temperature=0.1,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )

# Keep only the text generated after the response marker.
response = tokenizer.decode(outputs[0], skip_special_tokens=True).split("### Response:\n")[-1].strip()
print(response)
Prompt format
Instruction: 주어진 지시문과 학생의 답안을 분석하여... [유연하고 관대한 채점 기준] ...
Input:
{student answer text; a 700-character limit is recommended to prevent Korean token blow-up}
Output:
{improvement directions and feedback text}
[최종 점수: 4]    (that is, "Final score: 4")
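Because the output ends with a bracketed final-score line, the numeric score can be pulled out with a regular expression. The following is a minimal sketch: `extract_score` is a hypothetical helper, the sample feedback sentence is made up, and the code assumes the model emits the `최종 점수: N` ("final score: N") pattern, which may not hold for every generation.

```python
import re

# Matches "최종 점수: N" with N in 1..4, tolerating optional spaces.
SCORE_PATTERN = re.compile(r"최종\s*점수\s*:\s*([1-4])")


def extract_score(response: str):
    """Return the last final-score value found in the response, or None if absent."""
    matches = SCORE_PATTERN.findall(response)
    return int(matches[-1]) if matches else None


sample = "근거를 더 구체적으로 제시하면 좋겠습니다.\n[최종 점수: 3]"
print(extract_score(sample))
print(extract_score("no score here"))
```

Returning `None` instead of raising lets the caller decide whether to retry generation or flag the answer for manual review.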
How to test the API
- Run the server
cd onjeom/api
pip install -r requirements.txt
cp .env.example .env
# HuggingFace login (first time only)
huggingface-cli login
# Normal run, including the model
uvicorn app.main:app --reload
# Quick restart (when editing routers/schemas; skips model loading)
SKIP_MODEL_LOAD=1 uvicorn app.main:app --reload
On the first run the model is downloaded automatically (this takes about 5 to 10 minutes).
Once the message "✅ 채점 엔진 준비 완료!" (scoring engine ready) appears, the server is ready.
Testing with Swagger UI
Open http://localhost:8000/docs in a browser
Click the endpoint you want to test
Click the Try it out button
Paste in the example data and click Execute
Main endpoint examples
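Automatic essay scoring POST /api/v1/score (sample answer: "A cactus can live in the desert because it stores water in its stem.")
{
  "content": "선인장은 줄기에 물을 저장해서 사막에서 살 수 있다."
}
AI tutor question POST /api/tutor/ask (sample question: "What is inferential reading?")
{
  "question": "추론적 독해란 무엇인가요?",
  "context": null
}
Term explanation POST /api/tutor/explain (sample term: 역설법, "paradox"; context: "The writer used paradox to emphasize the theme.")
{
  "term": "역설법",
  "context": "글쓴이는 역설법을 사용하여 주제를 강조했다."
}
Curriculum generation POST /api/curriculum/generate (weak areas: "inferential comprehension", "critical reading")
{
  "theta": -0.5,
  "daily_goal": 10,
  "weak_areas": ["추론적 이해", "비판적 독해"]
}
Health check GET /health → {"status": "running"}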
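The scoring endpoint can also be called from a script instead of Swagger UI. The sketch below uses only the standard library and assumes the server is running at `http://localhost:8000` and accepts a JSON body with a single `content` field, as in the scoring example above. The response schema is not documented here, so the result is returned as a parsed dictionary without interpretation. `build_score_request` and `score_essay` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local uvicorn address


def build_score_request(content: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for /api/v1/score with a JSON body."""
    body = json.dumps({"content": content}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/score",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score_essay(content: str, base_url: str = BASE_URL, timeout: float = 120.0) -> dict:
    """Send the request and return the decoded JSON response (requires a running server)."""
    request = build_score_request(content, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not touch the network, so it is safe to inspect offline.
request = build_score_request("선인장은 줄기에 물을 저장해서 사막에서 살 수 있다.")
print(request.get_method(), request.full_url)
```

With the server up, `score_essay("...")` would return whatever JSON the endpoint produces; the generous timeout is a guess to allow for slow generation.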
Download the trained adapter
huggingface-cli download Onjeom/essay_scoring --local-dir ./models/essay_scoring