	---
	license: apache-2.0
	base_model: unsloth/Qwen3-30B-A3B-Instruct-2507
	tags:
	- kaidol
	- roleplay
	- korean
	- qwen3
	- lora
	- unsloth
	language:
	- ko
	- en
	pipeline_tag: text-generation
	---

	# KAIDOL LLM Fine-tuning - PHASE2A-TEST-1K

	Korean AI Idol Roleplay Language Model based on unsloth/Qwen3-30B-A3B-Instruct-2507

	## Model Description

	This model is a LoRA adapter fine-tuned for K-pop idol-style roleplay and empathetic conversation.

	- Base Model: unsloth/Qwen3-30B-A3B-Instruct-2507
	- Training Phase: phase2a-test-1k
	- Training Framework: Unsloth 2025.11.3
	- LoRA Rank: 16
	- LoRA Alpha: 16
	- Training Samples: 1000

	## Training Configuration

	```json
	{
	  "model": "Qwen3-30B-A3B-Instruct-2507",
	  "phase": "phase2a-test-1k",
	  "dataset": "phase2-rp-base-1k",
	  "num_samples": 1000,
	  "lora_rank": 16,
	  "lora_alpha": 16,
	  "lora_dropout": 0,
	  "learning_rate": 0.0002,
	  "batch_size": 2,
	  "gradient_accumulation_steps": 4,
	  "effective_batch_size": 32,
	  "max_steps": 100,
	  "warmup_steps": 10,
	  "max_seq_length": 2048,
	  "optimizer": "adamw_8bit",
	  "weight_decay": 0.01,
	  "lr_scheduler_type": "linear",
	  "precision": "bfloat16",
	  "device_map": "auto",
	  "gpus": "4x RTX 5090",
	  "training_time": "40 minutes",
	  "framework": "Unsloth 2025.11.3",
	  "target_modules": [
	    "q_proj",
	    "k_proj",
	    "v_proj",
	    "o_proj",
	    "gate_proj",
	    "up_proj",
	    "down_proj"
	  ]
	}
	```
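The learning-rate fields in the configuration (`learning_rate`, `warmup_steps`, `max_steps`, `lr_scheduler_type`) can be read as a schedule. The following is a minimal sketch, assuming the scheduler follows the usual linear-warmup-then-linear-decay shape that Hugging Face `transformers` uses for `lr_scheduler_type: linear`. `linear_schedule_lr` is a hypothetical helper written for illustration; it is not part of Unsloth or of the training script.

```python
def linear_schedule_lr(step, base_lr=2e-4, warmup_steps=10, max_steps=100):
    """Learning rate after `step` optimizer steps.

    Linear warmup from 0 to base_lr, then linear decay back to 0.
    Defaults are copied from the training configuration above.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, max_steps - step)
    return base_lr * remaining / max(1, max_steps - warmup_steps)


# Print the schedule at a few points: start, mid-warmup, peak, mid-decay, end.
for step in (0, 5, 10, 55, 100):
    print(step, linear_schedule_lr(step))
```

Under this assumption the rate peaks at 2e-4 at step 10 and reaches 0 at step 100, which agrees with the `final_learning_rate` of 0.0 reported in the evaluation metrics.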

	## Evaluation Metrics

	```json
	{
	  "training_loss": {
	    "initial": 2.3745,
	    "final": 1.5027,
	    "reduction_percent": 36.7
	  },
	  "training_metrics": {
	    "total_steps": 100,
	    "total_samples": 1000,
	    "training_time_seconds": 2380.49,
	    "training_time_minutes": 39.67,
	    "samples_per_second": 0.336,
	    "final_grad_norm": 0.1539,
	    "final_learning_rate": 0.0
	  },
	  "loss_progression": {
	    "step_5": 2.3745,
	    "step_10": 1.531,
	    "step_50": 1.632,
	    "step_100": 1.5027
	  },
	  "wandb_run": "https://wandb.ai/developer_lunark-lunark-ai/kaidol-llm-finetuning/runs/brryct5m",
	  "notes": "Baseline test with 1K samples. Stable convergence observed. Ready for hyperparameter optimization (LR 2e-4→1e-4, alpha 16→32, grad_accum 4→8)."
	}
	```
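The derived numbers in the metrics can be recomputed from the raw ones. The sketch below uses only values copied from this card. How `samples_per_second` was measured is not stated here, so the last two lines are a consistency check under the assumption that it counts training samples processed per wall-clock second; they are not a correction of the reported values.

```python
# Values copied from the Training Configuration and Evaluation Metrics blocks.
initial_loss, final_loss = 2.3745, 1.5027
training_time_seconds = 2380.49
samples_per_second = 0.336
total_steps = 100
batch_size, gradient_accumulation_steps = 2, 4

reduction_percent = (initial_loss - final_loss) / initial_loss * 100
training_time_minutes = training_time_seconds / 60

# Samples implied by the reported throughput, and the per-step batch that implies.
samples_processed = samples_per_second * training_time_seconds
implied_batch_per_step = samples_processed / total_steps

print(round(reduction_percent, 1))      # 36.7
print(round(training_time_minutes, 2))  # 39.67
print(round(samples_processed))         # 800
print(round(implied_batch_per_step))    # 8
print(batch_size * gradient_accumulation_steps)  # 8
```

Read this way, the throughput implies about 8 samples per optimizer step, which equals `batch_size` × `gradient_accumulation_steps`. The listed `effective_batch_size` of 32 would additionally require four data-parallel replicas, which the card does not confirm.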


	## Usage

	### Loading (with Unsloth)

	```python
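	from unsloth import FastLanguageModel

	# Load the base model with this LoRA adapter applied.
	model, tokenizer = FastLanguageModel.from_pretrained(
	    model_name="developer-lunark/kaidol-phase2a-test-1k",
	    max_seq_length=2048,  # matches the training max_seq_length
	    dtype=None,           # auto-detect
	    load_in_4bit=True,    # 4-bit quantization to reduce memory use
	)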
	```

	### Inference example

	```python
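	# Single-turn chat; the user message means "I'm not feeling great today..."
	messages = [
	    {"role": "user", "content": "오늘 기분이 좋지 않아..."},
	]

	# Build the prompt with the model's chat template.
	inputs = tokenizer.apply_chat_template(
	    messages,
	    tokenize=True,
	    add_generation_prompt=True,
	    return_tensors="pt",
	).to("cuda")

	outputs = model.generate(
	    input_ids=inputs,
	    max_new_tokens=512,
	    do_sample=True,  # temperature and top_p only take effect when sampling
	    temperature=0.7,
	    top_p=0.9,
	)

	# Decode only the newly generated tokens, not the echoed prompt.
	response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
	print(response)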
	```
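Roleplay conversations usually span several turns. The sketch below shows one way to keep a running message history in the same `role`/`content` format used above. `append_turn` is a hypothetical helper, and the assistant reply and the follow-up user message are placeholders invented for illustration.

```python
def append_turn(messages, role, content):
    """Return a new message list with one more turn (hypothetical helper)."""
    if role not in {"system", "user", "assistant"}:
        raise ValueError(f"unsupported role: {role}")
    return messages + [{"role": role, "content": content}]


# First user turn, as in the inference example.
history = [{"role": "user", "content": "오늘 기분이 좋지 않아..."}]

# Placeholder reply; in practice this is the decoded model output.
history = append_turn(history, "assistant", "<model reply goes here>")

# Follow-up user turn ("Thanks, I feel a bit better.").
history = append_turn(history, "user", "고마워, 조금 나아졌어.")

print([m["role"] for m in history])
```

The resulting `history` list can be passed to `tokenizer.apply_chat_template` in place of `messages` to generate the next reply.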

	## Dataset

	- Phase 2: RP Base Dataset (54K samples)
	- Source: `developer-lunark/kaidol-phase2-rp-base-v0.1`
	- Korean: 53% / English: 47%

	## Training Hardware

	- GPU: 4x NVIDIA RTX 5090 (32GB each)
	- Training Time: ~40 minutes
	- Framework: Unsloth + PyTorch 2.9.1 + CUDA 12.8

	## Limitations

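	- This model is specialized for roleplay and empathetic conversation.
	- It may perform worse than the base model on general knowledge questions and reasoning tasks.
	- Support for languages other than Korean and English is limited.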

	## Ethical Considerations

	- This model was created for research and educational purposes.
	- Check the license before commercial use.
	- Always verify the quality and appropriateness of generated content.

	## Citation

	```bibtex
	@misc{kaidol-phase2a-test-1k,
	author = {Developer Lunark},
	title = {KAIDOL LLM Fine-tuning - PHASE2A-TEST-1K},
	year = {2025},
	publisher = {HuggingFace},
	howpublished = {\url{https://huggingface.co/developer-lunark/kaidol-phase2a-test-1k}}
	}
	```

	## Model Card Contact

	- Developer: developer_lunark
	- Repository: https://github.com/developer-lunark/kaidol-llm-finetuning
	- W&B Project: https://wandb.ai/developer_lunark-lunark-ai/kaidol-llm-finetuning

	---

	Generated on 2025-11-18 09:24:35