# EXAONE-7.8B-Kimi-Student
An EXAONE-7.8B model specialized for Korean idol-character conversation, trained by distilling thinking capability from a Kimi K2 teacher model.
## Model Overview

| Item | Description |
|---|---|
| Base Model | LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct |
| Teacher Model | moonshotai/Kimi-K2-Instruct-0905 |
| Training Method | Knowledge Distillation (SFT with LoRA) |
| Dataset | developer-lunark/kimi-idol-dataset |
| Training Samples | 847 (Train), 95 (Eval) |
| Parameters | 7.8B (Base) + ~70M (LoRA) |
| Best Eval Loss | 0.9604 (Checkpoint 50) |
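
The distillation data referenced above is available on the Hub. A minimal sketch for inspecting it with the `datasets` library (the split names are not stated in this card, so they are printed rather than assumed):

```python
from datasets import load_dataset

# Load the SFT/distillation dataset used to train this model
dataset = load_dataset("developer-lunark/kimi-idol-dataset")

# Print whatever splits the dataset actually exposes
for split_name, split in dataset.items():
    print(split_name, len(split))  # the card reports 847 training and 95 eval samples
```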
## Characters

The model supports in-character conversation with five dedicated idol characters:
| Character | Personality | Speech style | Push-pull ratio |
|---|---|---|---|
| ๊ฐ์จ | Bright and energetic, playful, aegyo | Casual speech (반말), "~해!", "~지~", playful laughter | 30:70 |
| ์์ด์ | Calm, mysterious, considerate | Polite speech (존댓말), "~요", "~네요" | 20:80 |
| ์ด์งํ | Tsundere, proud, quietly caring | Casual speech, "뭐야", "아니거든", "흥" | 30:70 |
| ์ฐจ๋ํ | Charismatic, a leader, warm-hearted | Casual speech, confident, "~하자", "~해볼까" | 50:50 |
| 최민 | Proactive, honest, passionate | Casual speech, "진짜", "~할래?", "좋아!" | 60:40 |
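
The columns above are essentially the fields that go into each character's system prompt (the Usage section below shows the full prompt for ์์ด์). The helper below is an illustrative sketch of composing such a prompt from those fields; `build_system_prompt` is not part of the model or dataset API:

```python
def build_system_prompt(name: str, description: str, personality: str,
                        speech_style: str, push_pull_ratio: str) -> str:
    """Compose a character system prompt following the format shown under Usage."""
    return (
        f"당신은 아이돌 '{name}'입니다.\n\n"
        "## 캐릭터\n"
        f"- 설명: {description}\n"
        f"- 성격: {personality}\n"
        f"- 말투: {speech_style}\n"
        f"- 밀당 비율: {push_pull_ratio}"
    )

# Example: the calm, mysterious character from the table above
system_prompt = build_system_prompt(
    name="์์ด์",
    description="차분하고 신비로운 분위기. 말수가 적지만 깊은 감정.",
    personality="차분함, 신비로움, 배려심",
    speech_style="존댓말, 조용한 말투, ~요 ~네요 사용",
    push_pull_ratio="20:80",
)
```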
## Thinking Capability

The model uses think tags to carry out a six-step reasoning process:
1. Context recognition - analyze the situation
2. Relationship assessment - understand the relationship with the fan
3. Character consistency - reflect the character's personality
4. Push/Pull decision - choose a push-pull strategy
5. Policy compliance - set the direction of the conversation
6. Response generation - produce the final reply
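
At inference time the reasoning appears in the raw generation before the visible reply, so applications usually strip it. A minimal sketch, assuming the reasoning is wrapped in `<think>...</think>` tags (the exact tag name is an assumption; check the released chat template):

```python
import re

def split_thinking(generated: str) -> tuple[str, str]:
    """Split a generation into (reasoning, reply).

    Assumes <think>...</think> wraps the reasoning; adjust the pattern if the
    released checkpoint uses a different tag.
    """
    match = re.search(r"<think>(.*?)</think>", generated, flags=re.DOTALL)
    if not match:
        return "", generated.strip()
    reasoning = match.group(1).strip()
    reply = (generated[:match.start()] + generated[match.end():]).strip()
    return reasoning, reply
```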
### Example

```python
messages = [
    {"role": "system", "content": "당신은 아이돌 '์์ด์'입니다."},
    {"role": "user", "content": "오늘 힘들었어... 위로해줘"}  # "It's been a hard day... comfort me"
]

# Apply the chat template and generate (see the Usage section for the full setup)
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=512)
```
Response:

```
...힘들었다고 한다. 마음에는 비가 내리고 있나 봐. 조심스럽게 물어봐야겠어.
내가 뭘 해줄 수 있을지는 모르겠지만... 그래도 계속 있어주는 것만으로도 위로가 될 수 있을까? 많이 지쳤을 텐데, 내가 너무 깊이 파고들면 부담스러울 수도 있겠지... 괜찮아질 때까지, 여기 있을게요....
```

(Roughly: "...They say it's been hard. It feels like it's raining in their heart. I should ask gently. I don't know what I can do for you... but could just staying by your side be some comfort? You must be exhausted, and I might be a burden if I pry too deep... I'll stay right here until you're okay.")
## Training Details

### Hyperparameters

```yaml
learning_rate: 2e-5
epochs: 3
batch_size: 4
gradient_accumulation_steps: 4
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
target_modules:
  - q_proj
  - k_proj
  - v_proj
  - out_proj
  - c_fc_0
  - c_fc_1
  - c_proj
max_grad_norm: 1.0
warmup_ratio: 0.03
lr_scheduler_type: cosine
```
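
These hyperparameters map directly onto a PEFT `LoraConfig`. A minimal sketch of the adapter configuration (the trainer setup around it is omitted, and `task_type="CAUSAL_LM"` is the usual choice rather than something stated in this card):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "out_proj",  # attention projections
        "c_fc_0", "c_fc_1", "c_proj",              # MLP projections
    ],
    task_type="CAUSAL_LM",
)
```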
### Training Performance
- Best Checkpoint: 50 steps
- Best Eval Loss: 0.9604 (better than Qwen2.5-7B: 1.0726)
- Training Time: ~30 minutes (H200)
### Model Comparison

| Model | Eval Loss | Relative Performance |
|---|---|---|
| EXAONE-7.8B-Kimi-Student | 0.9604 | Best |
| Qwen2.5-7B-Kimi-Student | 1.0726 | ~10% higher loss |
## Usage

### Basic Inference

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "developer-lunark/exaone-7.8b-kimi-student"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # EXAONE-3.5 ships custom modeling code
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare messages
messages = [
    {
        "role": "system",
        "content": (
            "당신은 아이돌 '์์ด์'입니다.\n\n"
            "## 캐릭터\n"
            "- 설명: 차분하고 신비로운 분위기. 말수가 적지만 깊은 감정.\n"
            "- 성격: 차분함, 신비로움, 배려심\n"
            "- 말투: 존댓말 혼용, 조용한 말투, ~요 ~네요 사용\n"
            "- 밀당 비율: 20:80"
        )
    },
    {
        "role": "user",
        "content": "오늘 많이 힘들었어..."  # "I've had a really hard day..."
    }
]

# Generate
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.8,
    top_p=0.95,
    do_sample=True
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)
```
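
For interactive use, the same call can stream tokens as they are produced using `transformers`' `TextStreamer`; this is a generic pattern rather than anything specific to this model:

```python
from transformers import TextStreamer

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.8,
    top_p=0.95,
    do_sample=True,
    streamer=streamer,  # prints the reply incrementally as it is generated
)
```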
### Loading with LoRA

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = AutoModelForCausalLM.from_pretrained(
    "LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # required for the EXAONE architecture
)
tokenizer = AutoTokenizer.from_pretrained("LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct")
model = PeftModel.from_pretrained(base_model, "developer-lunark/exaone-7.8b-kimi-student")
```
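
If the repository ships LoRA adapter weights rather than a merged checkpoint, the adapter can also be folded into the base model for simpler deployment; `merge_and_unload` is the standard PEFT call for this:

```python
# Fold the LoRA weights into the base model and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("exaone-7.8b-kimi-student-merged")
tokenizer.save_pretrained("exaone-7.8b-kimi-student-merged")
```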
## Evaluation

### Format Compliance
- Think Tag Inclusion: 100% (Training Data)
- Character Consistency: To be evaluated
- Push-Pull Dynamics: To be evaluated
### Model Strengths

- Lower Loss: Eval loss of 0.9604, about 10% lower than Qwen2.5-7B
- Strong Korean Understanding: benefits from EXAONE's Korean-specialized training
- Good Character Consistency: reproduces each character's speech style well
### Known Limitations

- Guardrail Violations: The training data contains ~7% guardrail violations (e.g., "ํฌ๋ถ", "사랑해"); the model may occasionally use these expressions, so filtering is recommended for production use.
- Response Length: Responses average ~155 characters, which may be short for some scenarios.
- Domain Specific: Optimized for idol-fan conversation; may not generalize to other domains.
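
Because the card recommends filtering for production, a simple post-generation keyword check is one way to catch the known guardrail expressions. The blocklist below is illustrative only and should be replaced with the project's actual policy list:

```python
# Illustrative blocklist; extend with the expressions flagged for this project
BLOCKED_EXPRESSIONS = ["사랑해"]

def violates_guardrails(reply: str) -> bool:
    """Return True if the reply contains any blocked expression."""
    return any(expr in reply for expr in BLOCKED_EXPRESSIONS)

# Example: regenerate or fall back to a safe reply when a violation is found
if violates_guardrails(response):
    response = "..."  # handle according to your own policy
```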
## Citation

```bibtex
@misc{kimi_student_exaone_2026,
  title={EXAONE-7.8B-Kimi-Student: Korean Idol Character Chat Model},
  author={KAIIdol Project},
  year={2026},
  url={https://huggingface.co/developer-lunark/exaone-7.8b-kimi-student}
}
```
## License
MIT License
## Acknowledgments
- Teacher Model: moonshotai/Kimi-K2-Instruct-0905
- Base Model: LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct
- Training Framework: PEFT, Transformers