# HybriKo-117M-LinuxFC-SFT-v2
Korean Hybrid LLM fine-tuned for Linux Command Function Calling.
Runs on Colab in about 20 seconds! Try it out right away on a T4 GPU.
## Architecture

**Griffin-style Hybrid Architecture** - combines recurrent (RNN) blocks and attention blocks at a 2:1 ratio, securing efficiency and performance at the same time.
| Component | Description |
|---|---|
| Griffin Block | RG-LRU (Real-Gated Linear Recurrent Unit) + GeGLU FFN |
| Attention Block | GQA (Grouped Query Attention) + RoPE + GeGLU FFN |
| Pattern | Griffin → Griffin → Attention (repeating 2:1) |
| Total Layers | 12 (Griffin 8 + Attention 4) |
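The 2:1 repetition above can be sketched as a simple layer-pattern builder; `build_layer_pattern` and the block labels are illustrative names, not classes from this repository:

```python
def build_layer_pattern(n_layers: int = 12) -> list:
    """Repeat the [Griffin, Griffin, Attention] motif until n_layers blocks exist."""
    motif = ["griffin", "griffin", "attention"]
    return [motif[i % 3] for i in range(n_layers)]

pattern = build_layer_pattern(12)
# 12 layers split into 8 Griffin blocks and 4 Attention blocks, as in the table.
assert pattern.count("griffin") == 8
assert pattern.count("attention") == 4
```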
## Model Details
| Spec | Value |
|---|---|
| Parameters | 117.8M |
| d_model | 768 |
| n_layers | 12 |
| n_heads | 12 |
| n_kv_heads | 3 |
| max_seq_len | 6,144 |
| vocab_size | 32,000 |
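For reference, the spec table maps naturally onto a plain config object; the field names below are assumptions for illustration and may differ from `configuration_hybridko.py`:

```python
from dataclasses import dataclass

@dataclass
class HybriKoConfig:
    # Values taken from the Model Details table; field names are illustrative.
    d_model: int = 768
    n_layers: int = 12
    n_heads: int = 12
    n_kv_heads: int = 3      # GQA: 12 query heads share 3 KV heads (4:1)
    max_seq_len: int = 6144
    vocab_size: int = 32000

cfg = HybriKoConfig()
assert cfg.n_heads % cfg.n_kv_heads == 0   # each KV head serves 4 query heads
assert cfg.d_model % cfg.n_heads == 0      # head dim = 768 / 12 = 64
```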
## Training Results

5,250 training samples were generated with Claude and Gemini, yielding a dramatic performance improvement.
| Metric | Before (250 samples) | After (5K samples) |
|---|---|---|
| Train Samples | 250 | 4,725 |
| Eval Loss | 0.039 | 0.0039 |
| Action Name Accuracy | 4% (2/50) | 100% (100/100) |
**Conclusion:** Scaling the data from 250 to 5,000 samples lifted Action Name accuracy from 4% to 100%!
## ⚠️ Known Limitations

- Action Name: 100% accurate (the correct command is chosen)
- Parameters: some hallucination can occur
  - Filenames/paths may be generated incorrectly (e.g., `test.txt` → `test.py`)
  - The Thought can differ from the user's intent
This is an inherent limit of a 117M-parameter model; more accurate parameter generation will require a larger model and more data.
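One practical mitigation for parameter hallucination is to cross-check path-like parameters against the user's query before executing anything. A minimal sketch with a hypothetical helper (not part of the released demo):

```python
def params_consistent(query: str, params: dict) -> bool:
    """Return False if a path-like parameter does not appear verbatim in the query."""
    for key in ("path", "archive", "file"):   # illustrative parameter names
        value = params.get(key)
        if value and value not in query:
            return False
    return True

# The rm example from this card: the model produced "test.py" for a "test.txt" query.
assert not params_consistent("delete the file test.txt", {"path": "test.py"})
assert params_consistent("delete the file test.txt", {"path": "test.txt"})
```

A caller that detects a mismatch can re-prompt the model or ask the user to confirm instead of running the command.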
## Supported Commands (21)

`ls`, `cd`, `mkdir`, `rm`, `cp`, `mv`, `find`, `cat`, `grep`, `head`,
`tail`, `wc`, `ps`, `df`, `du`, `top`, `ping`, `curl`, `chmod`, `tar`, `Finish`
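The action names in the outputs follow a `<command>_command` pattern, so a caller can translate them back into shell argv lists. A minimal hypothetical dispatcher, not part of the repo (`str.removesuffix` needs Python 3.9+):

```python
import shlex

SUPPORTED = {"ls", "cd", "mkdir", "rm", "cp", "mv", "find", "cat", "grep",
             "head", "tail", "wc", "ps", "df", "du", "top", "ping", "curl",
             "chmod", "tar"}

def action_to_argv(action: str, params: dict) -> list:
    """Translate e.g. 'df_command' + {'options': '-h --total'} into ['df', '-h', '--total']."""
    name = action.removesuffix("_command")
    if name not in SUPPORTED:
        raise ValueError(f"unsupported action: {action}")
    argv = [name] + shlex.split(params.get("options", ""))
    if params.get("path"):
        argv.append(params["path"])
    return argv

assert action_to_argv("df_command", {"options": "-h --total"}) == ["df", "-h", "--total"]
```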
## Quick Start (Colab)

```python
!pip install -q huggingface_hub sentencepiece

from huggingface_hub import hf_hub_download
hf_hub_download("Yaongi/HybriKo-117M-LinuxFC-SFT-v2", "demo_colab.py", local_dir=".")

!python demo_colab.py --query "Check the disk usage"
!python demo_colab.py --query "Delete the test.txt file"
!python demo_colab.py --query "Extract backup.tar.gz"
!python demo_colab.py --query "Show the files in the current folder"
```
## Example Outputs (Actual Results)

```text
Input: Check the disk usage
----------------------------------------
Thought: Checking the disk usage.
Action: df_command
Input: {"options": "-h --total"}
----------------------------------------
Input: Delete the test.txt file
----------------------------------------
Thought: Deleting it with the rm command.
Action: rm_command
Input: {"path": "test.py", "recursive": false}        ⚠️ filename hallucination
----------------------------------------
Input: Extract backup.tar.gz
----------------------------------------
Thought: Bundling the files into a compressed archive.   ⚠️ wrong Thought (the request was extraction)
Action: tar_command
Input: {"options": "-xzf", "archive": "data.tar.gz"}  ⚠️ filename hallucination
----------------------------------------
Input: Show the files in the current folder
----------------------------------------
Thought: Listing the files with the ls command.
Action: ls_command
Input: {"path": "/etc", "options": "-lS"}             ⚠️ path hallucination
----------------------------------------
```
**Note:** The Action Names (df_command, rm_command, tar_command, ls_command) are all correct.
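The Thought/Action/Input format shown above can be parsed into a structured call. A minimal sketch that assumes one model response per string (this parser is not part of the repo):

```python
import json

def parse_react(text: str) -> dict:
    """Extract the Thought, Action, and Input fields from one model response."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        for field in ("Thought", "Action", "Input"):
            prefix = field + ":"
            if line.startswith(prefix):
                result[field.lower()] = line[len(prefix):].strip()
    # The Input field is a JSON object holding the command parameters.
    if "input" in result:
        result["input"] = json.loads(result["input"])
    return result

sample = 'Thought: Checking the disk usage.\nAction: df_command\nInput: {"options": "-h --total"}'
parsed = parse_react(sample)
assert parsed["action"] == "df_command"
assert parsed["input"] == {"options": "-h --total"}
```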
## Training Details
| Config | Value |
|---|---|
| Hardware | A100 x 8 (DDP) |
| Batch Size | 32 (1 × 8 GPUs × 4 grad accum) |
| Learning Rate | 5e-5 |
| Warmup Steps | 100 |
| Epochs | 15 |
| Base Model | HybriKo-117M (exp7_phase1) |
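The effective batch size of 32 in the table follows directly from per-device batch, GPU count, and gradient-accumulation steps:

```python
per_device_batch = 1   # samples per GPU per forward pass
num_gpus = 8           # A100 x 8 with DDP
grad_accum_steps = 4   # gradients accumulated before each optimizer step

effective_batch = per_device_batch * num_gpus * grad_accum_steps
assert effective_batch == 32
```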
## Files

| File | Description |
|---|---|
| `pytorch_model.pt` | Model weights |
| `HybriKo_tok.model` | SentencePiece tokenizer |
| `demo_colab.py` | Colab demo script (auto-downloads all files) |
| `configuration_hybridko.py` | Model config |
| `modeling_hybridko.py` | Model implementation |
## License
Apache 2.0
## Citation

```bibtex
@misc{hybridko-linuxfc-2026,
  title={HybriKo-117M-LinuxFC-SFT-v2: Korean Hybrid LLM for Linux Function Calling},
  author={Yaongi},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/Yaongi/HybriKo-117M-LinuxFC-SFT-v2}
}
```