---
language: ko
license: mit
tags:
- function-calling
- korean
- banking
- on-device
- onnx
- int8
- webgpu
base_model: google/functiongemma-270m-it
---

# TransferFunctionGemma

A lightweight model that converts Korean natural-language money-transfer commands into structured function calls.

## Model Description

TransferFunctionGemma is [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) fully fine-tuned for the Korean financial-transfer domain. It parses a natural-language transfer command and converts it into one of four types of function-call JSON.

Quantized to ONNX INT8, the model is about 418MB and can run inference directly in the browser through Transformers.js + WebGPU. It runs 100% client-side, with no server communication.

### Supported Functions

| Function | Description | Required arguments |
|----------|-------------|--------------------|
| `execute_transfer` | Transfers an amount to a recipient | `recipient`, `amount` |
| `query_history` | Looks up transfer history | (none) |
| `summarize_history` | Summarizes transfer history | `period` |
| `confirm_transfer` | Confirms, cancels, or modifies a pending transfer | `action` |
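
The required-argument column of the Supported Functions table can be checked against a model output before anything is executed. The following is a minimal sketch, assuming the call has already been parsed into a dict with `name` and `arguments` keys (the shape used in the training data). `REQUIRED_ARGS` and `missing_arguments` are hypothetical names introduced for illustration, not part of the project.

```python
# Required arguments per function, copied from the Supported Functions table.
REQUIRED_ARGS = {
    "execute_transfer": ("recipient", "amount"),
    "query_history": (),
    "summarize_history": ("period",),
    "confirm_transfer": ("action",),
}


def missing_arguments(call: dict) -> list[str]:
    """Return the required arguments that are absent from a parsed call."""
    name = call.get("name")
    if name not in REQUIRED_ARGS:
        raise ValueError(f"unsupported function: {name!r}")
    arguments = call.get("arguments") or {}
    return [arg for arg in REQUIRED_ARGS[name] if arg not in arguments]


# Hypothetical incomplete call: the amount was not extracted.
call = {"name": "execute_transfer", "arguments": {"recipient": "엄마"}}
print(missing_arguments(call))  # ['amount']
```

A non-empty result is a natural trigger for asking the user a follow-up question instead of executing the call.
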

---

## Intended Use

### Primary Use Cases

- **Browser-based transfer demo**: try natural-language transfers just by opening a URL
- **Portfolio demo**: demonstrate on-device AI skills to technical interviewers and recruiters
- **On-device AI reference**: a reference for the FunctionGemma fine-tuning + browser deployment pipeline

### Out-of-Scope Use

- Must not be used for real financial transactions (this model is demo-only)
- Not trained on financial tasks other than transfers (loans, investments, insurance, etc.)
- Input in languages other than Korean, such as English, is not supported

---

## Training Data

### Seed Data

About 50 to 80 seed examples, 5 to 10 per category, were written by hand.

| Category | Target samples | Description |
|----------|----------------|-------------|
| transfer_simple | 150 | Basic transfer ("엄마한테 5만원 보내줘", "send Mom 50,000 won") |
| transfer_complex | 100 | Memo included, compound requests |
| confirm_cancel_modify | 80 | Multi-turn confirm/cancel/modify |
| clarify | 80 | Natural-language follow-up question when information is missing |
| query_history | 80 | History lookup (period/recipient filters) |
| summarize | 60 | Transfer summary by period |
| alias_diversity | 100 | Alias variants (엄마/어머니/맘, all meaning "mom") |
| amount_parsing | 100 | Korean amount expressions (오만원/5만/삼백만) |
| rejection | 50 | Rejecting non-transfer requests |
| edge_cases | 50 | Typos, ungrammatical sentences, mixed requests |
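
The `amount_parsing` category targets spellings such as 오만원, 5만, and 삼백만, which all have to end up as plain integers. To illustrate the conversion those samples encode, here is a minimal rule-based sketch. It is not the project's implementation (the model learns this mapping from data), and `parse_korean_amount` is a hypothetical helper that only understands Arabic digits, the Sino-Korean digit words, 십/백/천, and 만/억.

```python
DIGITS = {"일": 1, "이": 2, "삼": 3, "사": 4, "오": 5, "육": 6, "칠": 7, "팔": 8, "구": 9}
SMALL_UNITS = {"십": 10, "백": 100, "천": 1_000}
LARGE_UNITS = {"만": 10_000, "억": 100_000_000}


def parse_korean_amount(text: str) -> int:
    """Convert a Korean won expression such as '오만원' or '3만5천원' to an integer."""
    text = text.replace(",", "").replace(" ", "")
    if text.endswith("원"):
        text = text[:-1]
    if not text:
        raise ValueError("empty amount")

    total = 0    # sum of completed 만/억 groups
    section = 0  # value built from 십/백/천 inside the current group
    number = 0   # digits seen since the last unit
    for ch in text:
        if ch in "0123456789":
            number = number * 10 + int(ch)
        elif ch in DIGITS:
            number = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A bare unit ("백") means one of that unit.
            section += (number or 1) * SMALL_UNITS[ch]
            number = 0
        elif ch in LARGE_UNITS:
            total += ((section + number) or 1) * LARGE_UNITS[ch]
            section = number = 0
        else:
            raise ValueError(f"unexpected character in amount: {ch!r}")
    return total + section + number


for sample in ("오만원", "5만", "삼백만"):
    print(sample, parse_korean_amount(sample))
```

A parser like this can also act as an independent reference when spot-checking the `amount` values in generated samples.
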

### Data Augmentation

The seed data was augmented with the Claude API to generate 500 to 1,000 training samples. Augmentation varies the following:

- Speech level: polite (존댓말) / casual (반말) / abbreviated (줄임말)
- Typos and ungrammatical sentences
- Amount notation: pure Hangul, digits plus Sino-Korean (Hanja) number words, mixed, Arabic numerals
- Alias variants

### Data Format

The data follows the FunctionGemma chat template:

```jsonl
{
  "messages": [
    {
      "role": "developer",
      "content": "You are a model that can do function calling with the following functions",
      "tool_definitions": [...]
    },
    {
      "role": "user",
      "content": "엄마한테 오만원 보내"
    },
    {
      "role": "assistant",
      "content": "",
      "function_calls": [
        {"name": "execute_transfer", "arguments": {"recipient": "엄마", "amount": 50000}}
      ]
    }
  ]
}
```
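
The record above is pretty-printed for readability, whereas a JSONL file stores one record per line. Below is a minimal sketch of serializing a sample that way with Python's standard `json` module. `to_jsonl_line` is a hypothetical helper, and `tool_definitions` is left empty because the card elides its contents.

```python
import json

record = {
    "messages": [
        {
            "role": "developer",
            "content": "You are a model that can do function calling with the following functions",
            "tool_definitions": [],  # elided in the card; the real schemas go here
        },
        {"role": "user", "content": "엄마한테 오만원 보내"},
        {
            "role": "assistant",
            "content": "",
            "function_calls": [
                {"name": "execute_transfer", "arguments": {"recipient": "엄마", "amount": 50000}}
            ],
        },
    ]
}


def to_jsonl_line(sample: dict) -> str:
    """Serialize one sample as a single line, keeping Hangul unescaped."""
    return json.dumps(sample, ensure_ascii=False)


line = to_jsonl_line(record)
print(line)
```

Passing `ensure_ascii=False` keeps the Korean text human-readable in the file instead of turning it into `\uXXXX` escapes.
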

### Validation

All data passes through automated validation:

- JSON schema validity check
- Check that the function name is one of the four defined functions
- Check that `amount` is a positive integer
- Spot check of Korean amount -> number conversion accuracy
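
Two of the checks listed above (the function-name whitelist and the positive-integer `amount`) are simple enough to sketch directly. The example below is an illustration rather than the project's validator: `validate_record` is a hypothetical name, and the full JSON schema check is omitted because it would need the actual tool schemas.

```python
ALLOWED_FUNCTIONS = {
    "execute_transfer",
    "query_history",
    "summarize_history",
    "confirm_transfer",
}


def validate_record(record: dict) -> list[str]:
    """Return the problems found in one training record (empty list if none)."""
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["record has no 'messages' list"]

    errors = []
    for index, message in enumerate(messages):
        for call in message.get("function_calls", []):
            name = call.get("name")
            if name not in ALLOWED_FUNCTIONS:
                errors.append(f"message {index}: unknown function {name!r}")
            arguments = call.get("arguments") or {}
            if "amount" in arguments:
                amount = arguments["amount"]
                # bool is a subclass of int in Python, so exclude it explicitly.
                if isinstance(amount, bool) or not isinstance(amount, int) or amount <= 0:
                    errors.append(
                        f"message {index}: amount must be a positive integer, got {amount!r}"
                    )
    return errors


good = {"messages": [{"role": "assistant", "content": "", "function_calls": [
    {"name": "execute_transfer", "arguments": {"recipient": "엄마", "amount": 50000}}]}]}
print(validate_record(good))  # []
```

Returning a list of messages rather than raising on the first problem makes it easier to report every issue in an augmented batch at once.
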

---

## Training Procedure

### Base Model

- **Model**: google/functiongemma-270m-it
- **Training method**: Full fine-tuning (the model is small, so all parameters are trained without LoRA)

### Hyperparameters

| Parameter | Value |
|-----------|-------|
| Epochs | 5 |
| Batch Size (per device) | 8 |
| Learning Rate | 5e-5 |
| LR Scheduler | cosine |
| Warmup Ratio | 0.1 |
| Weight Decay | 0.01 |
| Max Sequence Length | 2048 |
| Precision | bfloat16 |
| Eval Strategy | epoch |
| Save Strategy | epoch |
| Metric for Best Model | eval_loss |
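
To make the table more concrete, the sketch below writes the values as Hugging Face `TrainingArguments`-style keyword names and works out what the warmup ratio means in optimizer steps. Several things here are assumptions rather than facts from the project: a single GPU, no gradient accumulation, warmup rounded up to a whole step, and a training set of 850 samples (the sum of the target counts in the seed category table, with no eval split held out). The maximum sequence length is left out because its keyword name differs between TRL versions.

```python
import math

# Table values expressed with TrainingArguments-style keyword names.
TRAINING_KWARGS = {
    "num_train_epochs": 5,
    "per_device_train_batch_size": 8,
    "learning_rate": 5e-5,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "weight_decay": 0.01,
    "bf16": True,
    "eval_strategy": "epoch",
    "save_strategy": "epoch",
    "metric_for_best_model": "eval_loss",
}


def schedule_summary(num_samples: int, kwargs: dict = TRAINING_KWARGS) -> tuple[int, int, int]:
    """Return (steps per epoch, total steps, warmup steps) for one device."""
    steps_per_epoch = math.ceil(num_samples / kwargs["per_device_train_batch_size"])
    total_steps = steps_per_epoch * kwargs["num_train_epochs"]
    warmup_steps = math.ceil(total_steps * kwargs["warmup_ratio"])
    return steps_per_epoch, total_steps, warmup_steps


# 850 = 150 + 100 + 80 + 80 + 80 + 60 + 100 + 100 + 50 + 50 (assumed training set size)
print(schedule_summary(850))  # (107, 535, 54)
```

Under these assumptions the learning rate ramps up over the first 54 steps and then follows the cosine decay for the remaining 481.
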

### Training Environment

- **Hardware**: WSL (RAM 128GB / RTX 3070)
- **Software**: HuggingFace Transformers + TRL (SFTTrainer)

### Quantization

```bash
# Fine-tuned model -> ONNX conversion + INT8 quantization
python ml/scripts/convert_onnx.py
```

- ONNX conversion: optimum (`optimum.exporters.onnx`)
- INT8 dynamic quantization: onnxruntime (`quantize_dynamic`, `QuantType.QInt8`)
- Final model size: **418MB** (ONNX INT8)
- Note: INT8 is used because INT4 is incompatible between onnxruntime and the Gemma weight layout

---

## Evaluation Results

### Base Model vs Fine-tuned Model

Measured on 50 seed examples (original seeds not included in the training data).

| Metric | Base (functiongemma-270m-it) | Fine-tuned | Improvement |
|--------|------------------------------|------------|-------------|
| Intent Accuracy | 0.0% | 88.9% | +88.9%p |
| JSON Validity | 10.0% | 100.0% | +90.0%p |
| Amount Parsing | 0.0% | 100.0% | +100.0%p |
| Argument F1 (macro) | 0.0% | 57.5% | +57.5%p |
| Rejection Accuracy | 100.0% | 100.0% | +0.0%p |

#### Argument F1 Breakdown

| Field | F1 |
|-------|-----|
| recipient | 83.8% |
| amount | 86.1% |
| memo | 75.0% |
| period | 100.0% |
| action | 0.0% |
| new_amount | 0.0% |

> `action` and `new_amount` are arguments used only by `confirm_transfer`; their 0.0% scores stem from a shortage of such examples in the seed data.

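
The macro Argument F1 of 57.5% is the unweighted mean of the six per-field scores, which is why the two 0.0% fields pull it so far below the other four. A small worked check, using only the numbers from the breakdown table (`macro_f1` is a hypothetical helper name):

```python
# Per-field F1 scores (percent) from the Argument F1 breakdown table.
FIELD_F1 = {
    "recipient": 83.8,
    "amount": 86.1,
    "memo": 75.0,
    "period": 100.0,
    "action": 0.0,
    "new_amount": 0.0,
}


def macro_f1(scores: dict[str, float]) -> float:
    """Unweighted mean of per-field F1 scores."""
    return sum(scores.values()) / len(scores)


overall = macro_f1(FIELD_F1)
# Same mean with the two confirm_transfer-only fields left out, for comparison.
without_confirm = macro_f1(
    {field: f1 for field, f1 in FIELD_F1.items() if field not in ("action", "new_amount")}
)
print(f"{overall:.1f} {without_confirm:.1f}")
```

The first value reproduces the 57.5% in the comparison table; the second, roughly 86.2%, shows how the remaining four fields score on their own.
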

---

## Limitations

- **Domain restriction**: Only transfer-related commands can be handled. The model is not suited to other financial tasks (loans, investments, etc.) or to general conversation.
- **Language restriction**: Only Korean input is supported.
- **Browser restriction**: Works correctly only in browsers with WebGPU support (Chrome 113+, Edge 113+).
- **Demo only**: The model only runs simulations against a Mock Banking Engine and performs no real financial transactions.
- **Training data bias**: Because the data is built from seed data and Claude API augmentation, it may differ from real user utterance patterns.
- **Compound command restriction**: Compound transfer commands such as "엄마한테 5만원, 아빠한테 3만원 보내줘" ("send Mom 50,000 won and Dad 30,000 won") are not supported.

---

## How to Use

### With Transformers.js (Browser)

```javascript
// WebGPU support ships in Transformers.js v3, published as @huggingface/transformers.
import { pipeline } from '@huggingface/transformers';

// Load the model on the WebGPU backend (the repository ID is a placeholder)
const generator = await pipeline(
  'text-generation',
  'your-username/transfer-function-gemma-onnx-int4',
  { device: 'webgpu' }
);

// Inference
const messages = [
  {
    // Same role as in the training data format
    role: 'developer',
    content: 'You are a model that can do function calling with the following functions: [execute_transfer, query_history, summarize_history, confirm_transfer]'
  },
  {
    role: 'user',
    content: '엄마한테 5만원 보내줘'
  }
];

const output = await generator(messages, {
  max_new_tokens: 256,
  temperature: 0.1,
});

console.log(output);
// The generated text is expected to contain a call such as:
// {"name": "execute_transfer", "arguments": {"recipient": "엄마", "amount": 50000}}
```

### With Python (HuggingFace Transformers)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository ID; replace it with the published checkpoint.
MODEL_ID = "your-username/transfer-function-gemma"

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="bfloat16",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

messages = [
    {"role": "user", "content": "엄마한테 5만원 보내줘"}
]

# return_dict=True returns input_ids and attention_mask together,
# so they can be unpacked straight into generate().
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.1)

# Decode only the newly generated tokens, not the echoed prompt.
prompt_length = inputs["input_ids"].shape[-1]
result = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(result)
```

---

## Citation

```bibtex
@misc{transfer-function-gemma-2026,
  title={TransferFunctionGemma: On-Device Korean Banking Function Calling},
  author={Kimin Ryu},
  year={2026},
  url={https://github.com/your-username/TransferFunctionGemma}
}
```

---

## Acknowledgments

- [Google Gemma](https://ai.google.dev/gemma) -- Base model
- [HuggingFace Transformers](https://huggingface.co/docs/transformers) -- Training framework
- [Transformers.js](https://huggingface.co/docs/transformers.js) -- Browser inference
- [Anthropic Claude](https://anthropic.com) -- Data augmentation