---
license: apache-2.0
base_model: unsloth/Qwen3-30B-A3B-Instruct-2507
tags:
- kaidol
- roleplay
- korean
- qwen3
- lora
- unsloth
language:
- ko
- en
pipeline_tag: text-generation
---

# KAIDOL LLM Fine-tuning - PHASE2A-TEST-1K

Korean AI Idol Roleplay Language Model based on unsloth/Qwen3-30B-A3B-Instruct-2507

## Model Description

This model is a LoRA adapter fine-tuned for K-pop idol-style roleplay and empathetic conversation.

- **Base Model**: unsloth/Qwen3-30B-A3B-Instruct-2507
- **Training Phase**: phase2a-test-1k
- **Training Framework**: Unsloth 2025.11.3
- **LoRA Rank**: 16
- **LoRA Alpha**: 16
- **Training Samples**: 1000

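As a rough illustration of what the rank and alpha values above mean, the sketch below computes the standard LoRA scaling factor (`alpha / rank`) and the number of trainable parameters a rank-`r` adapter adds to a single linear layer. It assumes plain LoRA scaling (not rank-stabilized LoRA), and the layer dimensions in the usage lines are hypothetical placeholders, not values taken from this model.

```python
def lora_scale(alpha: float, rank: int) -> float:
    """Scaling applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added to one linear layer.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


# Values from this card: rank 16, alpha 16.
print(lora_scale(alpha=16, rank=16))

# Hypothetical 4096 x 4096 projection, for illustration only.
print(lora_param_count(d_in=4096, d_out=4096, rank=16))
```

With rank 16 and alpha 16 the update is applied at a scale of 1.0. Raising alpha to 32 at the same rank would double that scale.
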
## Training Configuration

```json
{
  "model": "Qwen3-30B-A3B-Instruct-2507",
  "phase": "phase2a-test-1k",
  "dataset": "phase2-rp-base-1k",
  "num_samples": 1000,
  "lora_rank": 16,
  "lora_alpha": 16,
  "lora_dropout": 0,
  "learning_rate": 0.0002,
  "batch_size": 2,
  "gradient_accumulation_steps": 4,
  "effective_batch_size": 32,
  "max_steps": 100,
  "warmup_steps": 10,
  "max_seq_length": 2048,
  "optimizer": "adamw_8bit",
  "weight_decay": 0.01,
  "lr_scheduler_type": "linear",
  "precision": "bfloat16",
  "device_map": "auto",
  "gpus": "4x RTX 5090",
  "training_time": "40 minutes",
  "framework": "Unsloth 2025.11.3",
  "target_modules": [
    "q_proj",
    "k_proj",
    "v_proj",
    "o_proj",
    "gate_proj",
    "up_proj",
    "down_proj"
  ]
}
```

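The batch-size and scheduler settings in the configuration can be sanity-checked with a few lines of arithmetic. The sketch below is an illustration, not the training code: `effective_batch_size` and `linear_lr` are hypothetical helper names, and the linear schedule is written out by hand on the assumption that it follows the usual "linear warmup, then linear decay to zero" shape. Whether the four GPUs acted as data-parallel replicas is also an assumption; with `device_map: auto` the model may instead be sharded across GPUs, in which case the replica count would be 1.

```python
def effective_batch_size(per_device: int, grad_accum: int, replicas: int = 1) -> int:
    """Samples contributing to one optimizer step."""
    return per_device * grad_accum * replicas


def linear_lr(step: int, base_lr: float = 2e-4,
              warmup_steps: int = 10, max_steps: int = 100) -> float:
    """Linear warmup to base_lr, then linear decay to zero at max_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, max_steps - step)
    return base_lr * remaining / max(1, max_steps - warmup_steps)


# batch_size=2 and gradient_accumulation_steps=4 from the configuration.
print(effective_batch_size(2, 4, replicas=1))  # single replica (assumption)
print(effective_batch_size(2, 4, replicas=4))  # four data-parallel replicas (assumption)

# Learning rate at a few points of the 100-step run.
for step in (0, 5, 10, 55, 100):
    print(step, linear_lr(step))
```

Under this schedule the learning rate reaches zero at step 100, which agrees with the `final_learning_rate` of 0.0 reported for the run.
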
## Evaluation Metrics

```json
{
  "training_loss": {
    "initial": 2.3745,
    "final": 1.5027,
    "reduction_percent": 36.7
  },
  "training_metrics": {
    "total_steps": 100,
    "total_samples": 1000,
    "training_time_seconds": 2380.49,
    "training_time_minutes": 39.67,
    "samples_per_second": 0.336,
    "final_grad_norm": 0.1539,
    "final_learning_rate": 0.0
  },
  "loss_progression": {
    "step_5": 2.3745,
    "step_10": 1.531,
    "step_50": 1.632,
    "step_100": 1.5027
  },
  "wandb_run": "https://wandb.ai/developer_lunark-lunark-ai/kaidol-llm-finetuning/runs/brryct5m",
  "notes": "Baseline test with 1K samples. Stable convergence observed. Ready for hyperparameter optimization (LR 2e-4→1e-4, alpha 16→32, grad_accum 4→8)."
}
```

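The derived figures in the metrics can be reproduced from the raw numbers. The sketch below recomputes the loss reduction and estimates how many samples were processed per optimizer step. It assumes `samples_per_second` was averaged over the full `training_time_seconds`; the helper names are hypothetical.

```python
def loss_reduction_percent(initial: float, final: float) -> float:
    """Relative drop in training loss, in percent."""
    return (initial - final) / initial * 100


def samples_per_step(samples_per_second: float, seconds: float, steps: int) -> float:
    """Estimated samples consumed per optimizer step over the whole run."""
    return samples_per_second * seconds / steps


# Numbers copied from the metrics reported in this card.
reduction = loss_reduction_percent(initial=2.3745, final=1.5027)
per_step = samples_per_step(samples_per_second=0.336, seconds=2380.49, steps=100)

print(f"loss reduction: {reduction:.1f}%")
print(f"estimated samples per step: {per_step:.2f}")
```

Under that assumption the throughput corresponds to roughly 8 samples per step, which matches `batch_size` × `gradient_accumulation_steps` (2 × 4) rather than the listed `effective_batch_size` of 32. It is worth confirming against the W&B run which figure reflects the actual setup.
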
## Usage

### Loading (with Unsloth)

```python
from unsloth import FastLanguageModel

# Load the base model together with this LoRA adapter.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="developer-lunark/kaidol-phase2a-test-1k",
    max_seq_length=2048,
    dtype=None,  # let Unsloth pick the dtype automatically
    load_in_4bit=True,  # 4-bit quantization to reduce memory use
)
```

### Inference Example

```python
messages = [
    # "I'm not feeling good today..."
    {"role": "user", "content": "오늘 기분이 좋지 않아..."},
]

# Build the prompt with the model's chat template and move it to the GPU.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,  # temperature and top_p only take effect when sampling
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens, not the echoed prompt.
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```

## Dataset

- **Phase 2**: RP Base Dataset (54K samples)
  - Source: `developer-lunark/kaidol-phase2-rp-base-v0.1`
  - Korean: 53% / English: 47%

## Training Hardware

- **GPU**: 4x NVIDIA RTX 5090 (32GB each)
- **Training Time**: ~40 minutes
- **Framework**: Unsloth + PyTorch 2.9.1 + CUDA 12.8

## Limitations

- This model is specialized for roleplay and empathetic conversation.
- On general knowledge questions and reasoning tasks, it may perform worse than the base model.
- Languages other than Korean and English have only limited support.

## Ethical Considerations

- This model was created for research and educational purposes.
- Check the license before any commercial use.
- Always verify the quality and appropriateness of generated content.

## Citation

```bibtex
@misc{kaidol-phase2a-test-1k,
  author = {Developer Lunark},
  title = {KAIDOL LLM Fine-tuning - PHASE2A-TEST-1K},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/developer-lunark/kaidol-phase2a-test-1k}}
}
```

## Model Card Contact

- **Developer**: developer_lunark
- **Repository**: https://github.com/developer-lunark/kaidol-llm-finetuning
- **W&B Project**: https://wandb.ai/developer_lunark-lunark-ai/kaidol-llm-finetuning

---

Generated on 2025-11-18 09:24:35