---
language: ko
license: apache-2.0
tags:
- function-calling
- korean
- hybridko
base_model: Yaongi/hybridko-exp6
datasets:
- heegyu/glaive-function-calling-v2-ko
---
# HybriKo-117M Function Calling
This is the HybriKo-117M model (checkpoint 1962) fine-tuned on function-calling data.
## Training Details
- **Base Model**: Yaongi/hybridko-exp6
- **Dataset**: heegyu/glaive-function-calling-v2-ko (5,000 samples)
- **Epochs**: 2
- **Final Loss**: ~0.14
- **Performance**: basic function-calling format learned (supports calculation, search, weather, and similar tools); see the preprocessing sketch below
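
Each dataset sample has to be rendered into the model's chat format (shown in the usage section below) and tokenized with the custom SentencePiece tokenizer before training. The exact preprocessing pipeline is not published, so the following is only a minimal sketch of that step, assuming training samples use the same `<|im_start|>`/`<|im_end|>` layout as inference; `encode_sample` is a hypothetical helper, not part of the released code.

```python
import sentencepiece as spm
from huggingface_hub import hf_hub_download

# Load the released SentencePiece tokenizer.
sp = spm.SentencePieceProcessor()
sp.Load(hf_hub_download("Yaongi/HybriKo-117M-Exp6-FunctionCall", "HybriKo_tok.model"))

def encode_sample(system: str, user: str, assistant: str) -> list[int]:
    """Render one training sample into the chat format and tokenize it.
    Hypothetical helper; the actual training script may differ."""
    text = (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant}<|im_end|>"
    )
    return [sp.bos_id()] + sp.EncodeAsIds(text) + [sp.eos_id()]

ids = encode_sample("You are a helpful assistant.", "안녕하세요", "안녕하세요! 무엇을 도와드릴까요?")
print(f"{len(ids)} tokens")
```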
## Usage (Colab)
```python
import torch
import torch.nn.functional as F
import sentencepiece as spm
from transformers import AutoModelForCausalLM
from huggingface_hub import hf_hub_download
# 1. Load the model
print("Loading model...")
model = AutoModelForCausalLM.from_pretrained(
    "Yaongi/HybriKo-117M-Exp6-FunctionCall",
    trust_remote_code=True,
    torch_dtype=torch.float32,
)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()
# 2. Load the tokenizer
print("Loading tokenizer...")
sp_path = hf_hub_download("Yaongi/HybriKo-117M-Exp6-FunctionCall", "HybriKo_tok.model")
sp = spm.SentencePieceProcessor()
sp.Load(sp_path)
# 3. Generation function (with stop-sequence logic)
def generate(text, max_len=200, temp=0.01, top_k=1):
    input_ids = torch.tensor([[sp.bos_id()] + sp.EncodeAsIds(text)]).to(device)
    # Strings that should terminate generation
    stop_sequences = ["<|im_end|>", "</tool_code>"]
    print("Generating...", end="", flush=True)
    with torch.no_grad():
        for _ in range(max_len):
            # Keep only the last 512 tokens as context
            outputs = model(input_ids[:, -512:])
            # temp near 0 plus top_k=1 makes decoding effectively greedy
            logits = outputs.logits[:, -1] / temp
            if top_k:
                v, _ = torch.topk(logits, min(top_k, logits.size(-1)))
                logits[logits < v[:, [-1]]] = float("-inf")
            probs = F.softmax(logits, dim=-1)
            next_token = torch.multinomial(probs, 1)
            # Stop on the EOS token
            if next_token.item() == sp.eos_id():
                break
            input_ids = torch.cat([input_ids, next_token], dim=1)
            # Stop-sequence check: decode the full sequence at every step.
            # SentencePiece may normalize whitespace, so slicing the prompt
            # off with len(text) is only approximate. Comparing occurrence
            # counts against the prompt is more reliable, and it still works
            # when the prompt itself already contains a stop string
            # (e.g. <|im_end|>).
            curr_text = sp.DecodeIds(input_ids[0].tolist())
            if any(curr_text.count(seq) > text.count(seq) for seq in stop_sequences):
                break
    return sp.DecodeIds(input_ids[0].tolist())
# 4. Example run
# System message: "You are an AI assistant capable of function calling."
# User message:   "Tell me the latest news from Korea."
prompt = '''<|im_start|>system
당신은 도구 호출(function calling)이 가능한 AI 어시스턴트입니다.
<tools>
{"name": "get_news_headlines", "parameters": {"country": "string"}}
</tools><|im_end|>
<|im_start|>user
한국의 최신 뉴스 알려줘<|im_end|>
<|im_start|>assistant
'''
print("\nPrompt:")
print(prompt)
result = generate(prompt, max_len=200)
# Print the result with a clean separator
print("\n" + "="*50)
print("Result:")
print(result)
print("="*50)
```
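
Once generation finishes, the tool call still has to be extracted from the raw output. The sketch below is a minimal parser under two assumptions: the model wraps calls in `<tool_code>…</tool_code>` tags containing JSON (consistent with the `</tool_code>` stop sequence above), and the JSON carries `name`/`arguments` fields. `extract_tool_call` is a hypothetical helper, not part of the released code.

```python
import json
import re

def extract_tool_call(generated: str):
    """Hypothetical helper: pull the first <tool_code>...</tool_code> block
    out of the model output and parse it as JSON. The closing tag is treated
    as optional in case generation ended on EOS first. Returns None if no
    well-formed call is found."""
    match = re.search(r"<tool_code>(.*?)(?:</tool_code>|$)", generated, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(1).strip())
    except json.JSONDecodeError:
        return None

call = extract_tool_call(result)  # `result` comes from the usage example above
if call is not None:
    print("Tool:", call.get("name"), "Arguments:", call.get("arguments"))
```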