	---
	language: ko
	license: mit
	tags:
	- function-calling
	- korean
	- banking
	- on-device
	- onnx
	- int8
	- webgpu
	base_model: google/functiongemma-270m-it
	---

	# TransferFunctionGemma

	A lightweight model that converts Korean natural-language money-transfer commands into structured function calls.

	## Model Description

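	TransferFunctionGemma is [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) fully fine-tuned for the Korean banking-transfer domain. It parses a natural-language transfer command and converts it into one of four function calls, emitted as JSON.

	ONNX INT8 quantization reduces the model to about 418 MB, and it can run inference directly in the browser through Transformers.js with WebGPU. Inference runs entirely on the client, with no server round trips.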

	### Supported Functions

	| Function | Description | Required arguments |
	|----------|-------------|--------------------|
	| `execute_transfer` | Transfers an amount to a recipient | `recipient`, `amount` |
	| `query_history` | Looks up transfer history | (none) |
	| `summarize_history` | Summarizes transfer history | `period` |
	| `confirm_transfer` | Confirms, cancels, or modifies a pending transfer | `action` |

	---

	## Intended Use

	### Primary Use Cases

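	- Browser-based transfer demo: try natural-language transfers just by opening a URL
	- Portfolio demo: demonstrate on-device AI skills to technical interviewers and recruiters
	- On-device AI reference: a reference for the FunctionGemma fine-tuning and browser deployment pipeline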

	### Out-of-Scope Use

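	- Do not use this model for real financial transactions (it is for demo purposes only)
	- It was not trained on banking tasks other than transfers (loans, investments, insurance, etc.)
	- Input in languages other than Korean, such as English, is not supported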

	---

	## Training Data

	### Seed Data

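	Seed data was written by hand: 5 to 10 examples per category, roughly 50 to 80 in total.

	| Category | Target samples | Description |
	|----------|----------------|-------------|
	| transfer_simple | 150 | Basic transfer ("엄마한테 5만원 보내줘", "Send 50,000 won to Mom") |
	| transfer_complex | 100 | Transfers with a memo, compound requests |
	| confirm_cancel_modify | 80 | Multi-turn confirm/cancel/modify |
	| clarify | 80 | Natural-language follow-up question when information is missing |
	| query_history | 80 | History lookup (period/recipient filters) |
	| summarize | 60 | Transfer summary by period |
	| alias_diversity | 100 | Alias variants (엄마/어머니/맘, all meaning "Mom") |
	| amount_parsing | 100 | Korean amount expressions (오만원/5만/삼백만, i.e. 50,000 won / 50,000 / 3,000,000) |
	| rejection | 50 | Rejecting non-transfer requests |
	| edge_cases | 50 | Typos, ungrammatical sentences, mixed requests |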

	### Data Augmentation

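	The seed data was augmented with the Claude API to produce 500 to 1,000 training samples. Augmentation varies the following:

	- Register: polite speech, casual speech, abbreviated forms
	- Typos and ungrammatical sentences
	- Amount notation: pure Hangul, digits plus Sino-Korean units, mixed, Arabic numerals only
	- Alias variants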
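The amount notations above (for example 오만원, 5만, 삼백만) all have to map to a single integer in the `amount` argument. The following is a minimal sketch of such a conversion, useful for spot-checking labels. It is not the project's own code: `parse_korean_amount` and its lookup tables are hypothetical names, and it only covers units up to 억.

```python
import re

# Hypothetical lookup tables for this sketch (not taken from the project).
DIGITS = {"일": 1, "이": 2, "삼": 3, "사": 4, "오": 5, "육": 6, "칠": 7, "팔": 8, "구": 9}
SMALL_UNITS = {"십": 10, "백": 100, "천": 1000}
LARGE_UNITS = {"만": 10_000, "억": 100_000_000}


def parse_korean_amount(text: str) -> int:
    """Convert a Korean amount such as '오만원', '5만', or '삼백만' to an integer."""
    text = text.replace(",", "").replace(" ", "").removesuffix("원")
    total = 0    # sum of completed 만/억 groups
    section = 0  # value accumulated below the next large unit
    number = 0   # pending digit(s) waiting for a unit
    for token in re.findall(r"\d+|.", text):
        if token.isdecimal():
            number = int(token)
        elif token in DIGITS:
            number = DIGITS[token]
        elif token in SMALL_UNITS:
            # A bare unit such as '백' means 1 x 100.
            section += (number or 1) * SMALL_UNITS[token]
            number = 0
        elif token in LARGE_UNITS:
            total += ((section + number) or 1) * LARGE_UNITS[token]
            section = number = 0
        else:
            raise ValueError(f"unsupported character in amount: {token!r}")
    return total + section + number
```

With this sketch, `parse_korean_amount("오만원")` and `parse_korean_amount("5만")` both return 50000, matching the `amount` value the model is trained to emit for those phrasings.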

	### Data Format

	Samples follow the FunctionGemma chat template:

	```jsonl
	{
	  "messages": [
	    {
	      "role": "developer",
	      "content": "You are a model that can do function calling with the following functions",
	      "tool_definitions": [...]
	    },
	    {
	      "role": "user",
	      "content": "엄마한테 오만원 보내"
	    },
	    {
	      "role": "assistant",
	      "content": "",
	      "function_calls": [
	        {"name": "execute_transfer", "arguments": {"recipient": "엄마", "amount": 50000}}
	      ]
	    }
	  ]
	}
	```

	### Validation

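	All data goes through automated validation:

	- JSON schema validation
	- Checks that the function name is one of the four defined functions
	- Checks that `amount` is a positive integer
	- Spot checks of Korean-amount-to-number conversion accuracy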
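As an illustration of the function-name and `amount` checks listed above, here is a minimal sketch of a per-call validator. The function `validate_function_call` and the `REQUIRED_ARGS` table are hypothetical; the required arguments are taken from the Supported Functions table, and the project's actual validation script may differ.

```python
# Required arguments per function, as listed in the Supported Functions table.
REQUIRED_ARGS = {
    "execute_transfer": {"recipient", "amount"},
    "query_history": set(),
    "summarize_history": {"period"},
    "confirm_transfer": {"action"},
}


def validate_function_call(call: dict) -> list[str]:
    """Return a list of problems found in one function call (empty if valid)."""
    name = call.get("name")
    if name not in REQUIRED_ARGS:
        return [f"unknown function: {name!r}"]

    errors = []
    args = call.get("arguments", {})
    missing = REQUIRED_ARGS[name] - set(args)
    if missing:
        errors.append(f"missing arguments: {sorted(missing)}")

    if "amount" in args:
        value = args["amount"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            errors.append(f"amount must be a positive integer, got {value!r}")
    return errors
```

Running it over every entry in a sample's `function_calls` list and rejecting samples with a non-empty result is one way to apply these checks before training.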

	---

	## Training Procedure

	### Base Model

	- Model: google/functiongemma-270m-it
	- Training method: full fine-tuning (the model is small, so all parameters are trained without LoRA)

	### Hyperparameters

	| Parameter | Value |
	|-----------|-------|
	| Epochs | 5 |
	| Batch Size (per device) | 8 |
	| Learning Rate | 5e-5 |
	| LR Scheduler | cosine |
	| Warmup Ratio | 0.1 |
	| Weight Decay | 0.01 |
	| Max Sequence Length | 2048 |
	| Precision | bfloat16 |
	| Eval Strategy | epoch |
	| Save Strategy | epoch |
	| Metric for Best Model | eval_loss |
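To make the schedule-related settings concrete, the sketch below works out how many optimizer steps the run takes and what the learning rate looks like under a linear-warmup, cosine-decay schedule. It assumes a single device, no gradient accumulation, and a hypothetical training-set size of 800 samples (the card only states 500 to 1,000). The helper names are illustrative, and the trainer's exact rounding may differ.

```python
import math


def total_steps(num_samples: int, batch_size: int = 8, epochs: int = 5) -> int:
    """Optimizer steps for one device with no gradient accumulation."""
    return math.ceil(num_samples / batch_size) * epochs


def lr_at(step: int, total: int, peak_lr: float = 5e-5, warmup_ratio: float = 0.1) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero."""
    warmup = math.ceil(total * warmup_ratio)
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# 800 is an assumed training-set size, used only for illustration.
steps = total_steps(num_samples=800)
halfway_through_warmup = lr_at(25, steps)
peak = lr_at(50, steps)
```

Under these assumptions the run has 500 steps, the first 50 of which are warmup, so the learning rate reaches its 5e-5 peak at step 50 and decays to zero by step 500.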

	### Training Environment

	- Hardware: WSL (128 GB RAM / RTX 3070)
	- Software: HuggingFace Transformers + TRL (SFTTrainer)

	### Quantization

	```bash
	# Fine-tuned model -> ONNX conversion + INT8 quantization
	python ml/scripts/convert_onnx.py
	```

	- ONNX conversion: optimum (`optimum.exporters.onnx`)
	- INT8 dynamic quantization: onnxruntime (`quantize_dynamic`, `QuantType.QInt8`)
	- Final model size: 418 MB (ONNX INT8)
	- Note: INT8 is used because INT4 was incompatible between onnxruntime and the Gemma weight layout
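The contents of `ml/scripts/convert_onnx.py` are not shown here, so the following is only a sketch of how the two steps named above could be chained. It uses `main_export` from `optimum.exporters.onnx` and `quantize_dynamic` from `onnxruntime.quantization`. The function name `export_and_quantize`, the directory arguments, and both ONNX file names are assumptions rather than the project's actual layout.

```python
from pathlib import Path


def export_and_quantize(model_dir: str, out_dir: str) -> Path:
    """Export a fine-tuned checkpoint to ONNX, then apply INT8 dynamic quantization."""
    # Imported lazily so this module loads even where the packages are absent.
    from optimum.exporters.onnx import main_export
    from onnxruntime.quantization import QuantType, quantize_dynamic

    out = Path(out_dir)

    # Step 1: export the causal language model to ONNX.
    main_export(model_name_or_path=model_dir, output=out, task="text-generation")

    # Step 2: dynamic quantization. Weights are stored as INT8 and
    # activations are quantized at run time.
    # "model.onnx" as the exported file name is an assumption; check out_dir.
    fp32_path = out / "model.onnx"
    int8_path = out / "model_int8.onnx"  # hypothetical output name
    quantize_dynamic(
        model_input=str(fp32_path),
        model_output=str(int8_path),
        weight_type=QuantType.QInt8,
    )
    return int8_path
```

A call such as `export_and_quantize("outputs/checkpoint", "outputs/onnx")` (both paths hypothetical) would return the path of the quantized file, whose size can then be compared against the 418 MB figure above.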

	---

	## Evaluation Results

	### Base Model vs Fine-tuned Model Comparison

	Measured on 50 seed examples (original seeds not included in the training data).

	| Metric | Base (functiongemma-270m-it) | Fine-tuned | Improvement |
	|--------|------------------------------|------------|-------------|
	| Intent Accuracy | 0.0% | 88.9% | +88.9 pp |
	| JSON Validity | 10.0% | 100.0% | +90.0 pp |
	| Amount Parsing | 0.0% | 100.0% | +100.0 pp |
	| Argument F1 (macro) | 0.0% | 57.5% | +57.5 pp |
	| Rejection Accuracy | 100.0% | 100.0% | +0.0 pp |

	#### Argument F1 Breakdown

	| Field | F1 |
	|-------|-----|
	| recipient | 83.8% |
	| amount | 86.1% |
	| memo | 75.0% |
	| period | 100.0% |
	| action | 0.0% |
	| new_amount | 0.0% |

	> `action` and `new_amount` are arguments specific to the `confirm_transfer` function; their 0.0% scores are attributed to too few such examples in the seed data.
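The macro Argument F1 in the comparison table is the unweighted mean of the six per-field scores, which is why the two 0.0% fields pull it down so sharply. The short calculation below reproduces that arithmetic from the reported per-field values. The second figure, which leaves out the two `confirm_transfer` fields, is derived here purely for illustration and is not a reported result.

```python
from statistics import mean

# Per-field F1 scores (percent) copied from the breakdown table.
field_f1 = {
    "recipient": 83.8,
    "amount": 86.1,
    "memo": 75.0,
    "period": 100.0,
    "action": 0.0,
    "new_amount": 0.0,
}

# Macro average: every field counts equally, regardless of how often it occurs.
macro_f1 = mean(field_f1.values())

# Same average without the two confirm_transfer-only fields (illustrative only).
confirm_only = {"action", "new_amount"}
macro_without_confirm = mean(v for k, v in field_f1.items() if k not in confirm_only)

print(f"macro F1, all six fields: {macro_f1:.1f}%")
print(f"macro F1, excluding confirm_transfer fields: {macro_without_confirm:.1f}%")
```

The first value rounds to the 57.5% shown in the comparison table, so the gap to the roughly 86% average of the remaining four fields comes entirely from `action` and `new_amount`.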

	---

	## Limitations

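	- Domain restriction: only transfer-related commands are handled. The model is not suited to other banking tasks (loans, investments, etc.) or general conversation.
	- Language restriction: only Korean input is supported.
	- Browser restriction: works properly only in browsers with WebGPU support (Chrome 113+, Edge 113+).
	- Demo only: transfers are simulated with a Mock Banking Engine; no real financial transactions are performed.
	- Training data bias: because the data is built from seed examples and Claude API augmentation, it may differ from real user utterance patterns.
	- Compound command restriction: multi-transfer commands such as "엄마한테 5만원, 아빠한테 3만원 보내줘" ("Send 50,000 won to Mom and 30,000 won to Dad") are not supported.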

	---

	## How to Use

	### With Transformers.js (Browser)

	```javascript
	// WebGPU support ships in Transformers.js v3, published as @huggingface/transformers
	import { pipeline } from '@huggingface/transformers';

	// Load the model on WebGPU (the repository ID below is a placeholder)
	const generator = await pipeline(
	  'text-generation',
	  'your-username/transfer-function-gemma-onnx-int4',
	  { device: 'webgpu' }
	);

	// Run inference
	const messages = [
	  {
	    role: 'system',
	    content: 'You are a model that can do function calling with the following functions: [execute_transfer, query_history, summarize_history, confirm_transfer]'
	  },
	  {
	    role: 'user',
	    content: '엄마한테 5만원 보내줘'
	  }
	];

	const output = await generator(messages, {
	  max_new_tokens: 256,
	  temperature: 0.1,
	});

	// With chat-style input, the last message in generated_text is the model's reply
	console.log(output[0].generated_text.at(-1).content);
	// Expected form: {"name": "execute_transfer", "arguments": {"recipient": "엄마", "amount": 50000}}
	```

	### With Python (HuggingFace Transformers)

	```python
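	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Placeholder repository ID; replace it with the published model
	model_id = "your-username/transfer-function-gemma"
	model = AutoModelForCausalLM.from_pretrained(
	    model_id,
	    torch_dtype="bfloat16",
	    device_map="auto",
	)
	tokenizer = AutoTokenizer.from_pretrained(model_id)

	messages = [
	    {"role": "user", "content": "엄마한테 5만원 보내줘"}
	]

	# return_dict=True yields input_ids and attention_mask together
	inputs = tokenizer.apply_chat_template(
	    messages,
	    add_generation_prompt=True,
	    return_tensors="pt",
	    return_dict=True,
	).to(model.device)

	# temperature only takes effect when sampling is enabled
	outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.1)
	# Decode only the newly generated tokens, not the prompt
	new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
	result = tokenizer.decode(new_tokens, skip_special_tokens=True)
	print(result)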
	```

	---

	## Citation

	```bibtex
	@misc{transfer-function-gemma-2026,
	title={TransferFunctionGemma: On-Device Korean Banking Function Calling},
	author={Kimin Ryu},
	year={2026},
	url={https://github.com/your-username/TransferFunctionGemma}
	}
	```

	---

	## Acknowledgments

	- [Google Gemma](https://ai.google.dev/gemma) -- Base model
	- [HuggingFace Transformers](https://huggingface.co/docs/transformers) -- Training framework
	- [Transformers.js](https://huggingface.co/docs/transformers.js) -- Browser inference
	- [Anthropic Claude](https://anthropic.com) -- Data augmentation