---
license: apache-2.0
base_model: unsloth/Qwen3-30B-A3B-Instruct-2507
tags:
- kaidol
- roleplay
- korean
- qwen3
- lora
- unsloth
language:
- ko
- en
pipeline_tag: text-generation
---

# KAIDOL LLM Fine-tuning - PHASE2A-TEST-1K

Korean AI idol roleplay language model based on unsloth/Qwen3-30B-A3B-Instruct-2507.

## Model Description

This model is a LoRA adapter fine-tuned for K-pop idol-style roleplay and empathetic conversation.

- **Base Model**: unsloth/Qwen3-30B-A3B-Instruct-2507
- **Training Phase**: phase2a-test-1k
- **Training Framework**: Unsloth 2025.11.3
- **LoRA Rank**: 16
- **LoRA Alpha**: 16
- **Training Samples**: 1000

## Training Configuration

```json
{
  "model": "Qwen3-30B-A3B-Instruct-2507",
  "phase": "phase2a-test-1k",
  "dataset": "phase2-rp-base-1k",
  "num_samples": 1000,
  "lora_rank": 16,
  "lora_alpha": 16,
  "lora_dropout": 0,
  "learning_rate": 0.0002,
  "batch_size": 2,
  "gradient_accumulation_steps": 4,
  "effective_batch_size": 32,
  "max_steps": 100,
  "warmup_steps": 10,
  "max_seq_length": 2048,
  "optimizer": "adamw_8bit",
  "weight_decay": 0.01,
  "lr_scheduler_type": "linear",
  "precision": "bfloat16",
  "device_map": "auto",
  "gpus": "4x RTX 5090",
  "training_time": "40 minutes",
  "framework": "Unsloth 2025.11.3",
  "target_modules": [
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj"
  ]
}
```

## Evaluation Metrics

```json
{
  "training_loss": {
    "initial": 2.3745,
    "final": 1.5027,
    "reduction_percent": 36.7
  },
  "training_metrics": {
    "total_steps": 100,
    "total_samples": 1000,
    "training_time_seconds": 2380.49,
    "training_time_minutes": 39.67,
    "samples_per_second": 0.336,
    "final_grad_norm": 0.1539,
    "final_learning_rate": 0.0
  },
  "loss_progression": {
    "step_5": 2.3745,
    "step_10": 1.531,
    "step_50": 1.632,
    "step_100": 1.5027
  },
  "wandb_run": "https://wandb.ai/developer_lunark-lunark-ai/kaidol-llm-finetuning/runs/brryct5m",
  "notes": "Baseline test with 1K samples. Stable convergence observed. Ready for hyperparameter optimization (LR 2e-4→1e-4, alpha 16→32, grad_accum 4→8)."
}
```

## Usage

### Loading (with Unsloth)

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="developer-lunark/kaidol-phase2a-test-1k",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)
```

### Inference Example

```python
messages = [
    # "I'm not feeling great today..."
    {"role": "user", "content": "오늘 기분이 좋지 않아..."},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

outputs = model.generate(
    inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Dataset

- **Phase 2**: RP Base Dataset (54K samples; a 1K subset was used for this test run)
  - Source: `developer-lunark/kaidol-phase2-rp-base-v0.1`
  - Korean: 53% / English: 47%

## Training Hardware

- **GPU**: 4x NVIDIA RTX 5090 (32GB each)
- **Training Time**: ~40 minutes
- **Framework**: Unsloth + PyTorch 2.9.1 + CUDA 12.8

## Limitations

- This model is specialized for roleplay and empathetic conversation.
- It may underperform the base model on general knowledge questions and reasoning tasks.
- Languages other than Korean and English are only partially supported.

## Ethical Considerations

- This model was created for research and educational purposes.
- Check the license before any commercial use.
- Always verify the quality and appropriateness of generated content.

## Citation

```bibtex
@misc{kaidol-phase2a-test-1k,
  author = {Developer Lunark},
  title = {KAIDOL LLM Fine-tuning - PHASE2A-TEST-1K},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/developer-lunark/kaidol-phase2a-test-1k}}
}
```

## Model Card Contact

- **Developer**: developer_lunark
- **Repository**: https://github.com/developer-lunark/kaidol-llm-finetuning
- **W&B Project**: https://wandb.ai/developer_lunark-lunark-ai/kaidol-llm-finetuning

---

Generated on 2025-11-18 09:24:35
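As a quick sanity check, the two derived figures reported above (effective batch size and loss reduction) can be reproduced from the raw config and metric values with a few lines of Python:

```python
# Loss reduction: (initial - final) / initial, as a percentage.
initial_loss, final_loss = 2.3745, 1.5027
reduction_percent = round((initial_loss - final_loss) / initial_loss * 100, 1)
print(reduction_percent)  # 36.7, matching "reduction_percent" in the metrics

# Effective batch size = per-GPU batch * gradient accumulation steps * GPU count.
per_gpu_batch, grad_accum, num_gpus = 2, 4, 4
effective_batch = per_gpu_batch * grad_accum * num_gpus
print(effective_batch)  # 32, matching "effective_batch_size" in the config
```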