---
language: ko
license: mit
tags:
- function-calling
- korean
- banking
- on-device
- onnx
- int8
- webgpu
base_model: google/functiongemma-270m-it
---
# TransferFunctionGemma
A lightweight model that converts Korean natural-language money-transfer commands into structured function calls.
## Model Description
TransferFunctionGemma is a full fine-tune of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) for the Korean banking transfer domain. It analyzes a natural-language transfer command and converts it into one of four kinds of function call JSON.
ONNX INT8 quantization reduces the model to about 418MB, and it can run inference directly in the browser through Transformers.js + WebGPU. It runs 100% client-side, with no server communication.
### Supported Functions
| Function | Description | Required arguments |
|----------|-------------|--------------------|
| `execute_transfer` | Transfers an amount to a recipient | `recipient`, `amount` |
| `query_history` | Looks up transfer history | (none) |
| `summarize_history` | Summarizes transfer history | `period` |
| `confirm_transfer` | Confirms, cancels, or modifies a pending transfer | `action` |
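A caller can use the required-argument column to check whether a predicted call is complete before acting on it. The sketch below assumes the model output has already been decoded into a JSON string of the form `{"name": ..., "arguments": {...}}`; `REQUIRED_ARGS` and `missing_arguments` are hypothetical names introduced for illustration and are not part of this repository.

```python
import json

# Required arguments per function, copied from the table above.
REQUIRED_ARGS = {
    "execute_transfer": ("recipient", "amount"),
    "query_history": (),
    "summarize_history": ("period",),
    "confirm_transfer": ("action",),
}


def missing_arguments(call_json: str) -> list[str]:
    """Return the required argument names that are absent from a function call."""
    call = json.loads(call_json)
    name = call.get("name")
    if name not in REQUIRED_ARGS:
        raise ValueError(f"unknown function: {name!r}")
    arguments = call.get("arguments", {})
    return [arg for arg in REQUIRED_ARGS[name] if arg not in arguments]


if __name__ == "__main__":
    incomplete = '{"name": "execute_transfer", "arguments": {"recipient": "엄마"}}'
    print(missing_arguments(incomplete))  # ['amount']
```

A non-empty result could be used to trigger a follow-up question instead of executing the call.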
---
## Intended Use
### Primary Use Cases
- **Browser-based transfer demo**: try natural-language transfers just by opening a URL
- **Portfolio demo**: demonstrate on-device AI skills to technical interviewers and recruiters
- **On-device AI reference**: a reference for the FunctionGemma fine-tuning + browser deployment pipeline
### Out-of-Scope Use
- Must not be used for real financial transactions (this model is for demo purposes only)
- Not trained on financial tasks other than transfers (loans, investments, insurance, etc.)
- Input in languages other than Korean, such as English, is not supported
---
## Training Data
### Seed Data
About 50 to 80 seed examples in total were written by hand, 5 to 10 per category.
| Category | Target sample count | Description |
|----------|---------------------|-------------|
| transfer_simple | 150 | Basic transfer ("엄마한테 5만원 보내줘", "Send Mom 50,000 won") |
| transfer_complex | 100 | Transfers with a memo, compound requests |
| confirm_cancel_modify | 80 | Multi-turn confirm/cancel/modify |
| clarify | 80 | Natural-language follow-up question when information is missing |
| query_history | 80 | History lookup (period/recipient filters) |
| summarize | 60 | Transfer summary by period |
| alias_diversity | 100 | Alias variants (엄마/어머니/맘) |
| amount_parsing | 100 | Korean amounts (오만원/5만/삼백만) |
| rejection | 50 | Rejecting non-transfer requests |
| edge_cases | 50 | Typos, ungrammatical sentences, mixed requests |
### Data Augmentation
The seed data was augmented with the Claude API to generate 500 to 1,000 training samples. Augmentation varies the following:
- Speech level: polite, casual, and abbreviated forms
- Typos and ungrammatical sentences
- Amount notation: pure Hangul, digits plus Hanja-derived units, mixed, and Arabic numerals
- Alias variants
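To make the amount variants concrete, here is a minimal sketch of a Korean amount normalizer of the kind that could back a spot check of augmented samples. It is an illustration rather than the project's own code: `parse_korean_amount` is a hypothetical helper, and it only handles units up to 억 (10^8).

```python
DIGITS = {"일": 1, "이": 2, "삼": 3, "사": 4, "오": 5, "육": 6, "칠": 7, "팔": 8, "구": 9}
SMALL_UNITS = {"십": 10, "백": 100, "천": 1000}
LARGE_UNITS = {"만": 10_000, "억": 100_000_000}


def parse_korean_amount(text: str) -> int:
    """Convert amounts such as '오만원', '5만', or '삼백만' to an integer number of won."""
    cleaned = text.replace(",", "").replace(" ", "")
    if cleaned.endswith("원"):
        cleaned = cleaned[:-1]

    total = 0    # sum of completed 만/억 groups
    section = 0  # value accumulated inside the current group (below 10,000)
    number = 0   # digit or digits waiting for a unit
    for ch in cleaned:
        if ch in "0123456789":
            number = number * 10 + int(ch)
        elif ch in DIGITS:
            number = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A bare unit such as '백' means one hundred.
            section += (number or 1) * SMALL_UNITS[ch]
            number = 0
        elif ch in LARGE_UNITS:
            total += ((section + number) or 1) * LARGE_UNITS[ch]
            section = 0
            number = 0
        else:
            raise ValueError(f"unexpected character {ch!r} in amount {text!r}")
    return total + section + number


if __name__ == "__main__":
    for sample in ["오만원", "5만", "삼백만", "50000"]:
        print(sample, parse_korean_amount(sample))
```

Comparing the parser's result with the `amount` argument in each generated sample gives a cheap consistency check across notations.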
### Data Format
Training samples follow the FunctionGemma chat template:
```jsonl
{
  "messages": [
    {
      "role": "developer",
      "content": "You are a model that can do function calling with the following functions",
      "tool_definitions": [...]
    },
    {
      "role": "user",
      "content": "엄마한테 오만원 보내"
    },
    {
      "role": "assistant",
      "content": "",
      "function_calls": [
        {"name": "execute_transfer", "arguments": {"recipient": "엄마", "amount": 50000}}
      ]
    }
  ]
}
```
### Validation
All data passes through automatic validation:
- JSON schema validity check
- Check that the function name is one of the four defined functions
- Check that `amount` is a positive integer
- Spot check of Korean amount -> number conversion accuracy
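The structural checks in this list can be sketched as a single function over one training record. This is an assumption-laden illustration, not the repository's validator: `validate_record` is a hypothetical name, the record layout is taken from the Data Format example, and a plain structural check stands in for a full JSON schema.

```python
ALLOWED_FUNCTIONS = {
    "execute_transfer",
    "query_history",
    "summarize_history",
    "confirm_transfer",
}


def validate_record(record: dict) -> list[str]:
    """Return a list of human-readable problems found in one training record."""
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["'messages' must be a non-empty list"]

    errors = []
    for index, message in enumerate(messages):
        if message.get("role") != "assistant":
            continue
        for call in message.get("function_calls", []):
            name = call.get("name")
            if name not in ALLOWED_FUNCTIONS:
                errors.append(f"message {index}: unknown function {name!r}")
            amount = call.get("arguments", {}).get("amount")
            if amount is None:
                continue
            # bool is a subclass of int in Python, so it is excluded explicitly.
            if isinstance(amount, bool) or not isinstance(amount, int) or amount <= 0:
                errors.append(f"message {index}: amount must be a positive integer, got {amount!r}")
    return errors


if __name__ == "__main__":
    record = {
        "messages": [
            {"role": "user", "content": "엄마한테 오만원 보내"},
            {
                "role": "assistant",
                "content": "",
                "function_calls": [
                    {"name": "execute_transfer", "arguments": {"recipient": "엄마", "amount": 50000}}
                ],
            },
        ]
    }
    print(validate_record(record))  # []
```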
---
## Training Procedure
### Base Model
- **Model**: google/functiongemma-270m-it
- **Training method**: Full fine-tuning (the model is small, so all parameters are trained without LoRA)
### Hyperparameters
| Parameter | Value |
|----------|-----|
| Epochs | 5 |
| Batch Size (per device) | 8 |
| Learning Rate | 5e-5 |
| LR Scheduler | cosine |
| Warmup Ratio | 0.1 |
| Weight Decay | 0.01 |
| Max Sequence Length | 2048 |
| Precision | bfloat16 |
| Eval Strategy | epoch |
| Save Strategy | epoch |
| Metric for Best Model | eval_loss |
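To show how the learning rate, cosine scheduler, and warmup ratio interact, the sketch below computes the learning rate at a given optimizer step using linear warmup followed by cosine decay to zero, which is the usual meaning of a cosine schedule with warmup. The sample count of 850 is an assumption (the sum of the target counts in the category table), as are a single GPU and no gradient accumulation; `lr_at_step` is a hypothetical helper.

```python
import math

BASE_LR = 5e-5
WARMUP_RATIO = 0.1
EPOCHS = 5
BATCH_SIZE = 8
NUM_SAMPLES = 850  # assumption: sum of the per-category target counts


def lr_at_step(step: int, total_steps: int,
               base_lr: float = BASE_LR, warmup_ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup to base_lr, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)  # 107
total_steps = steps_per_epoch * EPOCHS                 # 535

if __name__ == "__main__":
    for step in (0, 27, 54, 300, total_steps):
        print(step, f"{lr_at_step(step, total_steps):.2e}")
```

Under these assumptions the peak rate of 5e-5 is reached after 54 warmup steps and decays to zero at step 535.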
### Training Environment
- **ํ•˜๋“œ์›จ์–ด**: WSL (RAM 128GB / RTX 3070)
- **์†Œํ”„ํŠธ์›จ์–ด**: HuggingFace Transformers + TRL (SFTTrainer)
### Quantization
```bash
# Fine-tuned model -> ONNX export + INT8 quantization
python ml/scripts/convert_onnx.py
```
- ONNX export: optimum (`optimum.exporters.onnx`)
- INT8 dynamic quantization: onnxruntime (`quantize_dynamic`, `QuantType.QInt8`)
- Final model size: **418MB** (ONNX INT8)
- Note: INT8 is used because INT4 was incompatible between onnxruntime and the Gemma weight layout
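The two steps named above can be sketched as follows. This is not the content of `ml/scripts/convert_onnx.py`, which is not shown here; it is a guess at the general shape using `main_export` from `optimum.exporters.onnx` and `quantize_dynamic` from `onnxruntime.quantization`. The function name, the directory arguments, the `text-generation-with-past` task, and the `model.onnx` / `model_quantized.onnx` file names are all assumptions.

```python
from pathlib import Path


def export_and_quantize(model_dir: str, out_dir: str) -> Path:
    """Export a fine-tuned checkpoint to ONNX, then apply INT8 dynamic quantization."""
    # Imported lazily so this module can be loaded without the heavy dependencies.
    from onnxruntime.quantization import QuantType, quantize_dynamic
    from optimum.exporters.onnx import main_export

    out = Path(out_dir)
    # Step 1: ONNX export (assumed task name for a decoder with KV cache).
    main_export(model_dir, output=out, task="text-generation-with-past")

    # Step 2: weight-only INT8 dynamic quantization.
    fp32_path = out / "model.onnx"            # assumed export file name
    int8_path = out / "model_quantized.onnx"  # assumed output file name
    quantize_dynamic(
        model_input=fp32_path,
        model_output=int8_path,
        weight_type=QuantType.QInt8,
    )
    return int8_path
```

Calling `export_and_quantize("path/to/checkpoint", "path/to/onnx")` with real paths would write both files under the output directory.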
---
## Evaluation Results
### Base Model vs Fine-tuned Model Comparison
Evaluated on 50 seed examples (original seeds that were not included in the training data).
| Metric | Base (functiongemma-270m-it) | Fine-tuned | Improvement |
|--------|------------------------------|------------|-------------|
| Intent Accuracy | 0.0% | 88.9% | +88.9 pp |
| JSON Validity | 10.0% | 100.0% | +90.0 pp |
| Amount Parsing | 0.0% | 100.0% | +100.0 pp |
| Argument F1 (macro) | 0.0% | 57.5% | +57.5 pp |
| Rejection Accuracy | 100.0% | 100.0% | +0.0 pp |
#### Argument F1 Breakdown
| Field | F1 |
|------|-----|
| recipient | 83.8% |
| amount | 86.1% |
| memo | 75.0% |
| period | 100.0% |
| action | 0.0% |
| new_amount | 0.0% |
> `action` and `new_amount` are arguments used only by the `confirm_transfer` function; their 0.0% scores are caused by a lack of corresponding examples in the seed data.
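The macro Argument F1 of 57.5% in the comparison table is consistent with an unweighted mean of the six per-field scores, which is why the two zero-score fields pull it down so sharply. The short calculation below reproduces that arithmetic, assuming macro averaging means a simple mean over fields.

```python
per_field_f1 = {
    "recipient": 83.8,
    "amount": 86.1,
    "memo": 75.0,
    "period": 100.0,
    "action": 0.0,
    "new_amount": 0.0,
}

# Macro average: every field counts equally, regardless of how often it occurs.
macro_f1 = sum(per_field_f1.values()) / len(per_field_f1)

# Same average restricted to the fields that have a non-zero score.
nonzero = [score for score in per_field_f1.values() if score > 0]
macro_f1_nonzero = sum(nonzero) / len(nonzero)

if __name__ == "__main__":
    print(f"{macro_f1:.1f}%")  # 57.5%
    print(f"{macro_f1_nonzero:.1f}%")
```

Averaging only the four fields with non-zero scores gives roughly 86.2%, so most of the gap comes from the two `confirm_transfer` arguments.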
---
## Limitations
- **Domain**: Only transfer-related commands can be handled. The model is not suitable for other financial tasks (loans, investments, etc.) or general conversation.
- **Language**: Only Korean input is supported.
- **Browser**: Works correctly only in browsers with WebGPU support (Chrome 113+, Edge 113+).
- **Demo only**: Transfers are only simulated by a Mock Banking Engine; no real financial transaction is performed.
- **Training data bias**: Because the data is based on hand-written seeds and Claude API augmentation, it may differ from real user utterance patterns.
- **Compound commands**: Multi-transfer commands such as "엄마한테 5만원, 아빠한테 3만원 보내줘" ("Send Mom 50,000 won and Dad 30,000 won") are not supported.
---
## How to Use
### With Transformers.js (Browser)
```javascript
// The `device: 'webgpu'` option requires Transformers.js v3 (@huggingface/transformers)
import { pipeline } from '@huggingface/transformers';

// Load the model on the WebGPU backend
const generator = await pipeline(
  'text-generation',
  // Placeholder repo id; point this at the ONNX INT8 export
  'your-username/transfer-function-gemma-onnx-int4',
  { device: 'webgpu' }
);

// Inference
const messages = [
  {
    // 'developer' matches the role used in the training data format
    role: 'developer',
    content: 'You are a model that can do function calling with the following functions: [execute_transfer, query_history, summarize_history, confirm_transfer]'
  },
  {
    role: 'user',
    content: '엄마한테 5만원 보내줘'
  }
];
const output = await generator(messages, {
  max_new_tokens: 256,
  temperature: 0.1,
});
// For chat input, the last entry of generated_text is the assistant reply
console.log(output[0].generated_text.at(-1).content);
// e.g. {"name": "execute_transfer", "arguments": {"recipient": "엄마", "amount": 50000}}
```
### With Python (HuggingFace Transformers)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/transfer-function-gemma"  # placeholder repo id
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="bfloat16",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {"role": "user", "content": "엄마한테 5만원 보내줘"}
]
# return_dict=True returns input_ids together with the attention mask
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.1)
# Decode only the newly generated tokens, not the echoed prompt
prompt_length = inputs["input_ids"].shape[-1]
result = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(result)
```
---
## Citation
```bibtex
@misc{transfer-function-gemma-2026,
  title={TransferFunctionGemma: On-Device Korean Banking Function Calling},
  author={Kimin Ryu},
  year={2026},
  url={https://github.com/your-username/TransferFunctionGemma}
}
```
---
## Acknowledgments
- [Google Gemma](https://ai.google.dev/gemma) -- Base model
- [HuggingFace Transformers](https://huggingface.co/docs/transformers) -- Training framework
- [Transformers.js](https://huggingface.co/docs/transformers.js) -- Browser inference
- [Anthropic Claude](https://anthropic.com) -- Data augmentation