yeongseok11
/

hyperclovax-1.5b-erp-nl2sql

Generated from Trainer

Model card Files Files and versions

hyperclovax-1.5b-erp-nl2sql / README.md

yeongseok11's picture

Update README.md

17a0eec verified 23 days ago

|

history blame contribute delete

3.58 kB

	---
	license: cc-by-nc-4.0
	library_name: peft
	base_model: naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-1.5B
	tags:
	- text-to-sql
	- erp
	- hyperclova
	- korean
	- nlp
	- lora
	- generated_from_trainer
	---

	# HyperCLOVAX-1.5B-ERP-SQL 🚀

	이 모델은 [naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-1.5B](https://huggingface.co/naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-1.5B)를 기반으로 한국어 ERP 도메인의 Text-to-SQL 작업을 위해 파인튜닝된 모델입니다.

	1.5B라는 초경량 모델임에도 불구하고, 파인튜닝 후 0.5%에서 62.0%로 극적인 성능 향상을 달성했습니다. 특히 복잡한 추론(Lv 5) 영역에서는 2B급 모델들을 상회하는 효율성을 보여줍니다.

	## 📊 모델 성능 (Dramatic Improvement)

	자체 구축한 ERP-SQL 데이터셋 평가 결과입니다. 사전 학습(Baseline) 상태에서는 도메인 지식이 없어 거의 정답을 맞히지 못했으나, 학습 후 실무 투입 가능한 수준으로 환골탈태하였습니다.

	\| 모델 (Model) \| 학습 상태 \| 전체 정확도 \| Lv 1 (쉬움) \| Lv 5 (매우 어려움) \|
	\| :--- \| :--- \| :--- \| :--- \| :--- \|
	\| HyperCLOVA X 1.5B \| Baseline \| 0.5% \| 2.5% \| 0.0% \|
	\| HyperCLOVA X 1.5B \| Fine-tuned (Ours) \| 62.0% \| 92.5% \| 47.5% \|

	> 핵심 분석: 전체 정확도는 2B급 모델 대비 소폭 낮을 수 있으나, 고난이도(Lv 5) 추론 정확도(47.5%)는 2.1B 경쟁 모델(45.0%)보다 오히려 높게 측정되었습니다. 이는 모델의 파라미터 밀도가 매우 효율적임을 시사합니다.

	## 🔧 학습 정보 (Training Details)
	* 베이스 모델: naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-1.5B
	* 학습 방법: LoRA (Low-Rank Adaptation)
	* 최적 에폭(Epoch): 5 (지속적인 성능 우상향 확인)
	* 데이터셋: 스키마가 반영된 합성(Synthetic) 한국어 ERP 질문-쿼리 쌍
	* 하드웨어: NVIDIA RTX 4060 Ti (16GB) x 2ea

	## 💻 사용 가이드 (How to Use)

	### 1. 라이브러리 설치
	```bash
	pip install torch transformers peft accelerate

	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM
	from peft import PeftModel

	# 1. 모델 로드
	base_model_id = "naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-1.5B"
	adapter_id = "yeongseok11/hyperclovax-1.5b-erp-nl2sql"

	tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
	base_model = AutoModelForCausalLM.from_pretrained(
	base_model_id,
	torch_dtype=torch.bfloat16,
	device_map="auto",
	trust_remote_code=True
	)

	model = PeftModel.from_pretrained(base_model, adapter_id)
	model.eval()

	# 2. 프롬프트 정의
	schema_context = """
	[Tables]
	employees(emp_id, name, dept_id, hire_date, salary)
	departments(dept_id, dept_name, location)
	"""
	question = "인사팀 직원들의 이름을 알려줘."

	prompt = f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

	### Instruction:
	아래 스키마를 참고하여 질문을 SQL로 변환하세요.

	### Input:
	### 질문:
	{question}

	### 스키마:
	{schema_context}

	### Response:
	"""

	# 3. 추론
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

	with torch.no_grad():
	outputs = model.generate(
	**inputs,
	max_new_tokens=256,
	do_sample=False,
	eos_token_id=tokenizer.eos_token_id
	)

	print(tokenizer.decode(outputs[0], skip_special_tokens=True).split("### Response:")[-1].strip())