---
license: apache-2.0
base_model: unsloth/Qwen3-30B-A3B-Instruct-2507
tags:
- kaidol
- roleplay
- korean
- qwen3
- lora
- unsloth
language:
- ko
- en
pipeline_tag: text-generation
---
# KAIDOL LLM Fine-tuning - PHASE2A-TEST-1K
Korean AI Idol Roleplay Language Model based on unsloth/Qwen3-30B-A3B-Instruct-2507
## Model Description
This model is a LoRA adapter fine-tuned for K-pop idol-style roleplay and empathetic conversation.
- **Base Model**: unsloth/Qwen3-30B-A3B-Instruct-2507
- **Training Phase**: phase2a-test-1k
- **Training Framework**: Unsloth 2025.11.3
- **LoRA Rank**: 16
- **LoRA Alpha**: 16
- **Training Samples**: 1000
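
As a quick reference for the rank and alpha values above, the sketch below computes the factor by which the adapter update is scaled. It assumes the standard LoRA convention of multiplying the low-rank update by `alpha / rank` (not rank-stabilized scaling); the function name `lora_scaling` is illustrative and does not come from the training code.

```python
def lora_scaling(alpha: float, rank: int) -> float:
    """Scale applied to the low-rank update under the standard LoRA convention."""
    if rank <= 0:
        raise ValueError("rank must be positive")
    return alpha / rank


# Values from this card: rank 16, alpha 16.
current = lora_scaling(alpha=16, rank=16)
# The alpha of 32 mentioned in the run notes, at the same rank.
proposed = lora_scaling(alpha=32, rank=16)
print(current, proposed)  # 1.0 2.0
```

Under this assumption, raising alpha from 16 to 32 at a fixed rank doubles the weight given to the adapter update.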
## Training Configuration
```json
{
  "model": "Qwen3-30B-A3B-Instruct-2507",
  "phase": "phase2a-test-1k",
  "dataset": "phase2-rp-base-1k",
  "num_samples": 1000,
  "lora_rank": 16,
  "lora_alpha": 16,
  "lora_dropout": 0,
  "learning_rate": 0.0002,
  "batch_size": 2,
  "gradient_accumulation_steps": 4,
  "effective_batch_size": 32,
  "max_steps": 100,
  "warmup_steps": 10,
  "max_seq_length": 2048,
  "optimizer": "adamw_8bit",
  "weight_decay": 0.01,
  "lr_scheduler_type": "linear",
  "precision": "bfloat16",
  "device_map": "auto",
  "gpus": "4x RTX 5090",
  "training_time": "40 minutes",
  "framework": "Unsloth 2025.11.3",
  "target_modules": [
    "q_proj",
    "k_proj",
    "v_proj",
    "o_proj",
    "gate_proj",
    "up_proj",
    "down_proj"
  ]
}
```
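
To make the scheduler settings concrete, the following sketch computes the learning rate after a given number of optimizer steps, using `learning_rate`, `warmup_steps`, and `max_steps` from the configuration above. It follows the usual linear-warmup, linear-decay rule; the function name `linear_lr` is an assumption for illustration and is not taken from the training run.

```python
def linear_lr(step: int, base_lr: float = 2e-4,
              warmup_steps: int = 10, max_steps: int = 100) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at max_steps.
    remaining = max(0, max_steps - step)
    return base_lr * remaining / max(1, max_steps - warmup_steps)


for step in (0, 5, 10, 55, 100):
    print(step, linear_lr(step))
```

With these settings the rate peaks at 2e-4 at step 10 and reaches exactly 0.0 at step 100, which agrees with the `final_learning_rate` reported under Evaluation Metrics.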
## Evaluation Metrics
```json
{
  "training_loss": {
    "initial": 2.3745,
    "final": 1.5027,
    "reduction_percent": 36.7
  },
  "training_metrics": {
    "total_steps": 100,
    "total_samples": 1000,
    "training_time_seconds": 2380.49,
    "training_time_minutes": 39.67,
    "samples_per_second": 0.336,
    "final_grad_norm": 0.1539,
    "final_learning_rate": 0.0
  },
  "loss_progression": {
    "step_5": 2.3745,
    "step_10": 1.531,
    "step_50": 1.632,
    "step_100": 1.5027
  },
  "wandb_run": "https://wandb.ai/developer_lunark-lunark-ai/kaidol-llm-finetuning/runs/brryct5m",
  "notes": "Baseline test with 1K samples. Stable convergence observed. Ready for hyperparameter optimization (LR 2e-4→1e-4, alpha 16→32, grad_accum 4→8)."
}
```
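
The derived figures in the metrics above can be reproduced from the raw values. The sketch below recomputes the loss reduction and the training time in minutes, then estimates how many samples the reported throughput implies; the variable names are illustrative. The throughput-based estimate comes out at roughly 8 samples per step, which corresponds to `batch_size` × `gradient_accumulation_steps` (2 × 4), whereas the configured `effective_batch_size` of 32 would additionally assume data parallelism across all four GPUs. Which accounting the trainer used is not stated in this card.

```python
# Raw values copied from the metrics JSON.
initial_loss, final_loss = 2.3745, 1.5027
train_seconds = 2380.49
samples_per_second = 0.336
total_steps = 100

reduction_percent = (initial_loss - final_loss) / initial_loss * 100
train_minutes = train_seconds / 60
samples_seen = samples_per_second * train_seconds  # implied by throughput
samples_per_step = samples_seen / total_steps

print(f"loss reduction: {reduction_percent:.1f}%")            # 36.7%
print(f"training time: {train_minutes:.2f} min")              # 39.67 min
print(f"samples implied by throughput: {samples_seen:.0f}")   # 800
print(f"samples per step: {samples_per_step:.1f}")            # 8.0
```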
## Usage
### Loading (with Unsloth)
```python
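from unsloth import FastLanguageModel

# Load the LoRA adapter repository; Unsloth resolves the base model from it.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="developer-lunark/kaidol-phase2a-test-1k",
    max_seq_length=2048,
    dtype=None,         # None lets Unsloth auto-detect the dtype
    load_in_4bit=True,  # 4-bit quantization to reduce memory usage
)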
```
### Inference Example
```python
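# Switch Unsloth to inference mode (model and tokenizer come from the loading step).
FastLanguageModel.for_inference(model)

messages = [
    # Korean prompt: "I'm not feeling good today..."
    {"role": "user", "content": "오늘 기분이 좋지 않아..."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=512,
    do_sample=True,  # sample explicitly so temperature/top_p take effect
    temperature=0.7,
    top_p=0.9,
)
# Decode only the newly generated tokens, not the echoed prompt.
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)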
```
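
For multi-turn roleplay, the `messages` list grows with alternating user and assistant turns. The helper below is a hypothetical sketch: the name `build_messages` and the optional system persona are assumptions and not part of this repository, and the card does not specify a system prompt. It shows one way to assemble the list before passing it to `tokenizer.apply_chat_template`.

```python
from typing import Optional


def build_messages(history: list[tuple[str, str]], user_message: str,
                   persona: Optional[str] = None) -> list[dict[str, str]]:
    """Assemble a chat message list from prior (user, assistant) turns."""
    messages: list[dict[str, str]] = []
    if persona:
        # Hypothetical persona text supplied by the caller.
        messages.append({"role": "system", "content": persona})
    for user_turn, assistant_turn in history:
        messages.append({"role": "user", "content": user_turn})
        messages.append({"role": "assistant", "content": assistant_turn})
    # The latest user message always comes last so the model replies to it.
    messages.append({"role": "user", "content": user_message})
    return messages


messages = build_messages(
    history=[("Hi!", "Hello! How was your day?")],
    user_message="Honestly, a bit tiring.",
)
print(len(messages))  # 3
```

Keeping the history as plain tuples makes it easy to truncate older turns when a conversation approaches the 2048-token `max_seq_length` used above.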
## Dataset
- **Phase 2**: RP Base Dataset (54K samples)
  - Source: `developer-lunark/kaidol-phase2-rp-base-v0.1`
  - Korean: 53% / English: 47%
## Training Hardware
- **GPU**: 4x NVIDIA RTX 5090 (32GB each)
- **Training Time**: ~40 minutes
- **Framework**: Unsloth + PyTorch 2.9.1 + CUDA 12.8
## Limitations
- This model is specialized for roleplay and empathetic conversation.
- It may perform worse than the base model on general knowledge questions and reasoning tasks.
- Languages other than Korean and English have only limited support.
## Ethical Considerations
- This model was created for research and educational purposes.
- Check the license before any commercial use.
- Always verify the quality and appropriateness of generated content.
## Citation
```bibtex
@misc{kaidol-phase2a-test-1k,
  author = {Developer Lunark},
  title = {KAIDOL LLM Fine-tuning - PHASE2A-TEST-1K},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/developer-lunark/kaidol-phase2a-test-1k}}
}
```
## Model Card Contact
- **Developer**: developer_lunark
- **Repository**: https://github.com/developer-lunark/kaidol-llm-finetuning
- **W&B Project**: https://wandb.ai/developer_lunark-lunark-ai/kaidol-llm-finetuning
---
Generated on 2025-11-18 09:24:35