metadata
language: ko
license: mit
tags:
  - function-calling
  - korean
  - banking
  - on-device
  - onnx
  - int8
  - webgpu
base_model: google/functiongemma-270m-it

TransferFunctionGemma

A lightweight model that converts Korean natural-language money transfer commands into structured function calls.

Model Description

TransferFunctionGemma is google/functiongemma-270m-it, fully fine-tuned for the Korean banking transfer domain. It parses a natural-language transfer command and converts it into one of four kinds of function call JSON.

ONNX INT8 quantization brings the model down to about 418MB, and inference runs directly in the browser through Transformers.js + WebGPU. Everything runs 100% client-side, with no server communication.

Supported Functions

| Function | Description | Required arguments |
|---|---|---|
| execute_transfer | Transfers an amount to a recipient | recipient, amount |
| query_history | Looks up transfer history | (none) |
| summarize_history | Summarizes transfer history | period |
| confirm_transfer | Confirms, cancels, or modifies a pending transfer | action |
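
As a minimal sketch of how a client might use the table above, the snippet below checks a parsed function call for missing required arguments before acting on it. The names `REQUIRED_ARGS` and `missing_arguments` are hypothetical and not part of this project; only the function names and required arguments come from the table.

```python
# Required arguments per function, copied from the table above.
REQUIRED_ARGS = {
    "execute_transfer": {"recipient", "amount"},
    "query_history": set(),
    "summarize_history": {"period"},
    "confirm_transfer": {"action"},
}


def missing_arguments(call: dict) -> set:
    """Return the required argument names that are absent from a function call.

    Raises ValueError if the function name is not one of the four supported ones.
    """
    name = call.get("name")
    if name not in REQUIRED_ARGS:
        raise ValueError(f"unknown function: {name!r}")
    return REQUIRED_ARGS[name] - set(call.get("arguments", {}))


# A transfer call that lacks the amount.
call = {"name": "execute_transfer", "arguments": {"recipient": "엄마"}}
print(missing_arguments(call))  # {'amount'}
```

A non-empty result is a natural trigger for asking the user a follow-up question instead of executing the call.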

Intended Use

Primary Use Cases

  • Browser-based transfer demo: try natural-language transfers just by opening a URL
  • Portfolio demo: demonstrate on-device AI skills to technical interviewers and recruiters
  • On-device AI reference: a reference for a FunctionGemma fine-tuning + browser deployment pipeline

Out-of-Scope Use

  • Must not be used for real financial transactions (this model is for demos only)
  • Not trained on banking tasks other than transfers (loans, investments, insurance, etc.)
  • Input in languages other than Korean, such as English, is not supported

Training Data

Seed Data

We hand-wrote 5-10 seed examples per category, about 50-80 in total.

| Category | Target samples | Description |
|---|---|---|
| transfer_simple | 150 | Basic transfer ("엄마한테 5만원 보내줘", "Send Mom 50,000 won") |
| transfer_complex | 100 | Transfers with a memo, compound requests |
| confirm_cancel_modify | 80 | Multi-turn confirm / cancel / modify |
| clarify | 80 | Natural-language follow-up question when information is missing |
| query_history | 80 | History lookup (period / recipient filters) |
| summarize | 60 | Transfer summary by period |
| alias_diversity | 100 | Alias variants (엄마/어머니/맘, all meaning "Mom") |
| amount_parsing | 100 | Korean amounts (오만원/5만/삼백만) |
| rejection | 50 | Rejecting non-transfer requests |
| edge_cases | 50 | Typos, ungrammatical sentences, mixed requests |

Data Augmentation

The seed data was augmented with the Claude API to generate 500-1,000 training samples. Augmentation varies the following:

  • Speech level: polite, casual, and abbreviated forms
  • Typos and ungrammatical sentences
  • Amount notation: pure Hangul, digits + Hanja (Sino-Korean) units, mixed, Arabic numerals
  • Alias variants
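
To make the amount-notation variants concrete, here is a small sketch of a normalizer that maps several Korean spellings of the same amount to one integer. `parse_korean_amount` is a hypothetical helper written for this illustration, not the project's code, and it only covers the units 십/백/천/만/억.

```python
DIGITS = {"일": 1, "이": 2, "삼": 3, "사": 4, "오": 5,
          "육": 6, "칠": 7, "팔": 8, "구": 9}
SMALL_UNITS = {"십": 10, "백": 100, "천": 1_000}
LARGE_UNITS = {"만": 10_000, "억": 100_000_000}


def parse_korean_amount(text: str) -> int:
    """Convert a Korean amount such as '오만원', '5만' or '삼백만' to an integer."""
    cleaned = text.replace(",", "").replace(" ", "")
    if cleaned.endswith("원"):
        cleaned = cleaned[:-1]  # drop the currency suffix (won)
    total = section = number = 0
    for ch in cleaned:
        if ch.isascii() and ch.isdigit():
            number = number * 10 + int(ch)  # accumulate Arabic digits
        elif ch in DIGITS:
            number = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A bare unit such as '백' means 1 x 100.
            section += (number or 1) * SMALL_UNITS[ch]
            number = 0
        elif ch in LARGE_UNITS:
            section += number
            total += (section or 1) * LARGE_UNITS[ch]
            section = number = 0
        else:
            raise ValueError(f"unexpected character {ch!r} in {text!r}")
    return total + section + number


# Every spelling below normalizes to 50000.
for spelling in ["오만원", "5만원", "5만", "50000", "50,000원"]:
    print(spelling, parse_korean_amount(spelling))
```

A helper like this is also one way to spot-check that the `amount` in an augmented sample matches the amount written in the user utterance.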

Data Format

The data follows the FunctionGemma chat template:

{
  "messages": [
    {
      "role": "developer",
      "content": "You are a model that can do function calling with the following functions",
      "tool_definitions": [...]
    },
    {
      "role": "user",
      "content": "엄마한테 오만원 보내"
    },
    {
      "role": "assistant",
      "content": "",
      "function_calls": [
        {"name": "execute_transfer", "arguments": {"recipient": "엄마", "amount": 50000}}
      ]
    }
  ]
}

Validation

All data goes through automated validation:

  • JSON schema validity check
  • Check that the function name is one of the 4 defined functions
  • Check that amount is a positive integer
  • Spot check of Korean amount -> number conversion accuracy
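
The function-name and amount checks could look roughly like the sketch below. It assumes records shaped like the Data Format example; `validate_record` and `ALLOWED_FUNCTIONS` are illustrative names, not taken from the project's scripts.

```python
ALLOWED_FUNCTIONS = {
    "execute_transfer", "query_history", "summarize_history", "confirm_transfer",
}


def validate_record(record: dict) -> list:
    """Return a list of problems found in one training record (empty if valid)."""
    problems = []
    for message in record.get("messages", []):
        for call in message.get("function_calls", []):
            name = call.get("name")
            if name not in ALLOWED_FUNCTIONS:
                problems.append(f"unknown function: {name!r}")
            args = call.get("arguments", {})
            if "amount" in args:
                amount = args["amount"]
                # bool is a subclass of int in Python, so reject it explicitly.
                is_int = isinstance(amount, int) and not isinstance(amount, bool)
                if not is_int or amount <= 0:
                    problems.append(f"amount must be a positive integer: {amount!r}")
    return problems


record = {
    "messages": [
        {"role": "user", "content": "엄마한테 오만원 보내"},
        {"role": "assistant", "content": "", "function_calls": [
            {"name": "execute_transfer",
             "arguments": {"recipient": "엄마", "amount": "50000"}},
        ]},
    ]
}
print(validate_record(record))  # ["amount must be a positive integer: '50000'"]
```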

Training Procedure

Base Model

  • Model: google/functiongemma-270m-it
  • Training method: full fine-tuning (the model is small, so all parameters are trained without LoRA)

Hyperparameters

| Parameter | Value |
|---|---|
| Epochs | 5 |
| Batch Size (per device) | 8 |
| Learning Rate | 5e-5 |
| LR Scheduler | cosine |
| Warmup Ratio | 0.1 |
| Weight Decay | 0.01 |
| Max Sequence Length | 2048 |
| Precision | bfloat16 |
| Eval Strategy | epoch |
| Save Strategy | epoch |
| Metric for Best Model | eval_loss |
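
To show what the learning-rate settings above imply, the sketch below evaluates a linear-warmup plus cosine-decay schedule with a peak of 5e-5 and a warmup ratio of 0.1. It is a plain-Python approximation of that kind of schedule, not the trainer's own code, and the total step count of 500 is a made-up value for illustration.

```python
import math


def learning_rate(step: int, total_steps: int,
                  peak_lr: float = 5e-5, warmup_ratio: float = 0.1) -> float:
    """Learning rate at `step` for linear warmup followed by cosine decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# total_steps=500 is a hypothetical number, chosen only for this example.
for step in (0, 25, 50, 275, 500):
    print(step, f"{learning_rate(step, 500):.2e}")
```

The rate climbs during the first 10% of steps, peaks at 5e-5, and then falls smoothly to zero by the final step.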

Training Environment

  • ํ•˜๋“œ์›จ์–ด: WSL (RAM 128GB / RTX 3070)
  • ์†Œํ”„ํŠธ์›จ์–ด: HuggingFace Transformers + TRL (SFTTrainer)

Quantization

# Fine-tuned model -> ONNX export + INT8 quantization
python ml/scripts/convert_onnx.py

  • ONNX export: optimum (optimum.exporters.onnx)
  • INT8 dynamic quantization: onnxruntime (quantize_dynamic, QuantType.QInt8)
  • Final model size: 418MB (ONNX INT8)
  • Note: INT8 is used because INT4 quantization in onnxruntime is incompatible with the Gemma weight layout
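
The contents of `convert_onnx.py` are not shown here, so the following is only a sketch of what the INT8 step might look like with the `quantize_dynamic` call named above. The wrapper function, the helper `file_size_mb`, and the file paths are assumptions made for illustration; onnxruntime must be installed for the quantization call itself to run.

```python
from pathlib import Path


def file_size_mb(path) -> float:
    """Size of a file in megabytes (10**6 bytes)."""
    return Path(path).stat().st_size / 1_000_000


def quantize_to_int8(fp32_path: str, int8_path: str) -> float:
    """Dynamically quantize an exported ONNX model to INT8 and return its size in MB."""
    # Imported lazily so this module can be loaded without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(
        model_input=fp32_path,
        model_output=int8_path,
        weight_type=QuantType.QInt8,  # signed 8-bit weights, as stated above
    )
    return file_size_mb(int8_path)


# Hypothetical paths; adjust to wherever the ONNX export was written.
# print(quantize_to_int8("onnx/model.onnx", "onnx/model_int8.onnx"))
```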

Evaluation Results

Base Model vs Fine-tuned Model Comparison

Measured on 50 seed examples (original seeds that were not included in the training data).

| Metric | Base (functiongemma-270m-it) | Fine-tuned | Improvement |
|---|---|---|---|
| Intent Accuracy | 0.0% | 88.9% | +88.9 pp |
| JSON Validity | 10.0% | 100.0% | +90.0 pp |
| Amount Parsing | 0.0% | 100.0% | +100.0 pp |
| Argument F1 (macro) | 0.0% | 57.5% | +57.5 pp |
| Rejection Accuracy | 100.0% | 100.0% | +0.0 pp |

Argument F1 Breakdown

| Field | F1 |
|---|---|
| recipient | 83.8% |
| amount | 86.1% |
| memo | 75.0% |
| period | 100.0% |
| action | 0.0% |
| new_amount | 0.0% |

action and new_amount are arguments used only by the confirm_transfer function; their 0.0% F1 is attributed to a lack of such examples in the seed data.
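
The reported macro Argument F1 can be reproduced from the per-field scores, assuming "macro" means the unweighted mean over the six fields (an assumption, though it matches the 57.5% figure). The short calculation also shows how much the two zero-scoring fields pull the average down.

```python
# Per-field F1 scores (in percent) from the breakdown above.
field_f1 = {
    "recipient": 83.8,
    "amount": 86.1,
    "memo": 75.0,
    "period": 100.0,
    "action": 0.0,
    "new_amount": 0.0,
}

# Macro average: every field counts equally, regardless of how often it occurs.
macro_f1 = sum(field_f1.values()) / len(field_f1)
print(f"macro F1 over all fields: {macro_f1:.1f}%")  # 57.5%

# The same average restricted to fields that scored above zero.
nonzero = [score for score in field_f1.values() if score > 0]
print(f"mean over non-zero fields: {sum(nonzero) / len(nonzero):.1f}%")  # 86.2%
```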


Limitations

  • Domain: only transfer-related commands can be handled. The model is not suitable for other banking tasks (loans, investments, etc.) or for general conversation.
  • Language: only Korean input is supported.
  • Browser: works correctly only in browsers with WebGPU support (Chrome 113+, Edge 113+).
  • Demo only: transfers are simulated with a Mock Banking Engine; no real financial transactions are performed.
  • Training data bias: the data comes from hand-written seeds and Claude API augmentation, so it may differ from real user utterance patterns.
  • Compound commands: multi-transfer commands such as "엄마한테 5만원, 아빠한테 3만원 보내줘" ("Send Mom 50,000 won and Dad 30,000 won") are not supported.

How to Use

With Transformers.js (Browser)

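// WebGPU needs Transformers.js v3+, published as @huggingface/transformers;
// the older @xenova/transformers package does not provide a WebGPU backend.
import { pipeline } from '@huggingface/transformers';

// Load the model on the WebGPU backend
const generator = await pipeline(
  'text-generation',
  'your-username/transfer-function-gemma-onnx-int4', // placeholder repo id
  { device: 'webgpu' }
);

// Inference
const messages = [
  {
    role: 'developer', // same role as in the training data format
    content: 'You are a model that can do function calling with the following functions: [execute_transfer, query_history, summarize_history, confirm_transfer]'
  },
  {
    role: 'user',
    content: '엄마한테 5만원 보내줘' // "Send Mom 50,000 won"
  }
];

const output = await generator(messages, {
  max_new_tokens: 256,
  temperature: 0.1,
});

// For chat-style input, generated_text holds the whole conversation;
// the last entry is the assistant's reply.
console.log(output[0].generated_text.at(-1).content);
// Expected form: {"name": "execute_transfer", "arguments": {"recipient": "엄마", "amount": 50000}}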

With Python (HuggingFace Transformers)

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "your-username/transfer-function-gemma",
    torch_dtype="bfloat16",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "your-username/transfer-function-gemma"
)

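messages = [
    {"role": "user", "content": "엄마한테 5만원 보내줘"}  # "Send Mom 50,000 won"
]

# return_dict=True also returns the attention mask, which generate() can use.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.1)
# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
result = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(result)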
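
The decoded `result` is plain text, so the caller still has to turn it into a function call. The sketch below assumes the text contains the call as a JSON object with `name` and `arguments` keys, as in the examples in this card; the actual raw output format may differ depending on the chat template and decoding options. `extract_function_call` is a hypothetical helper.

```python
import json
from typing import Optional


def extract_function_call(text: str) -> Optional[dict]:
    """Return the first JSON object in `text` that looks like a function call."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever text follows it.
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and "name" in obj and "arguments" in obj:
            return obj
        start = text.find("{", start + 1)
    return None


sample = 'OK {"name": "execute_transfer", "arguments": {"recipient": "엄마", "amount": 50000}}'
call = extract_function_call(sample)
print(call["name"], call["arguments"]["amount"])  # execute_transfer 50000
```

Returning `None` rather than raising lets the caller treat free-text replies, such as a clarifying question or a rejection, as a normal outcome.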

Citation

@misc{transfer-function-gemma-2026,
  title={TransferFunctionGemma: On-Device Korean Banking Function Calling},
  author={Kimin Ryu},
  year={2026},
  url={https://github.com/your-username/TransferFunctionGemma}
}

Acknowledgments