---
base_model: aitfindonesia/Bakti-8B-Base
library_name: peft
license: apache-2.0
pipeline_tag: text-generation
tags:
- base_model:adapter:aitfindonesia/Bakti-8B-Base
- lora
- sft
- transformers
- unsloth
- multi-turn
- chatbot
- indonesian
---

# Model Card for SFT-Bakti-8B-Base-MultiTurn-Chatbot

## Model Details

### Model Description

This model is a LoRA (Low-Rank Adaptation) fine-tune of **[aitfindonesia/Bakti-8B-Base](https://huggingface.co/aitfindonesia/Bakti-8B-Base)**, built specifically for **multi-turn conversation** in Indonesian. It was trained with the **Unsloth** library for faster, more memory-efficient fine-tuning.

The model is optimized to retain context across multiple turns of conversation, making it suitable for interview simulations, customer support, and general-purpose Indonesian assistants.

- **Developed by:** DTP Fine Tuning Team
- **Model type:** Causal Language Model (fine-tuned Qwen2/3 architecture)
- **Language(s) (NLP):** Indonesian
- **License:** Apache 2.0
- **Finetuned from model:** aitfindonesia/Bakti-8B-Base
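
The LoRA adapter loads on top of the base model with `transformers` and `peft`. Below is a minimal inference sketch; the adapter repository id is an assumption inferred from this card's title and may need to be replaced with the actual repo name.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "aitfindonesia/Bakti-8B-Base"
# ASSUMPTION: adapter repo id guessed from this card's title -- replace with the real one.
ADAPTER_ID = "dtp-fine-tuning/SFT-Bakti-8B-Base-MultiTurn-Chatbot"

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_ID, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, ADAPTER_ID)  # attach the LoRA weights
model.eval()
```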

## Uses

### Direct Use

The model is designed for:

- Multi-turn chat interactions in Indonesian.
- Question Answering (QA) requiring context from previous turns.
- Roleplay interactions (e.g., interview scenarios).
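
A minimal multi-turn generation sketch covering these use cases, reusing the `model` and `tokenizer` objects loaded above. It assumes the tokenizer ships a chat template; the Indonesian example turns are purely illustrative.

```python
# Multi-turn chat: earlier turns are passed back in so the model keeps context.
messages = [
    {"role": "user", "content": "Halo, bisakah kamu berperan sebagai pewawancara kerja?"},
    {"role": "assistant", "content": "Tentu. Posisi apa yang sedang Anda lamar?"},
    {"role": "user", "content": "Posisi data analyst. Silakan mulai dari pertanyaan pertama."},
]

# ASSUMPTION: the tokenizer provides a chat template; otherwise format the prompt manually.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids=input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```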

### Out-of-Scope Use

- The model should not be relied on for factual accuracy without Retrieval-Augmented Generation (RAG), as it can hallucinate.
- Not intended for code generation tasks.

## Training Details

### Training Data

**Dataset:** `dtp-fine-tuning/dtp-multiturn-interview-valid-15k`

- **Split:** Train (90%) / Test (10%)
- **Format:** Multi-turn conversation format.
- **Max Length:** 2048 tokens
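
A sketch of how the split and formatting can be reproduced with the `datasets` library. The 90/10 split matches the card; the split seed, the `conversations` column name, and the per-message schema are assumptions, and `tokenizer` refers to the Unsloth-loaded tokenizer shown in the training sketch further down.

```python
from datasets import load_dataset

# Load the multi-turn interview dataset and create a 90/10 train/test split.
raw = load_dataset("dtp-fine-tuning/dtp-multiturn-interview-valid-15k", split="train")
splits = raw.train_test_split(test_size=0.1, seed=42)  # seed is an assumption
train_ds, eval_ds = splits["train"], splits["test"]

def to_text(example):
    # ASSUMPTION: each row stores a "conversations" list of {"role", "content"} dicts.
    return {
        "text": tokenizer.apply_chat_template(
            example["conversations"], tokenize=False, add_generation_prompt=False
        )
    }

train_ds = train_ds.map(to_text)
eval_ds = eval_ds.map(to_text)
```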

### Training Procedure

The model was fine-tuned with **Unsloth** on a single NVIDIA A100 (80 GB) GPU using QLoRA: the base weights are loaded in 4-bit (NF4) quantization to reduce memory usage, while the LoRA adapters are trained in higher precision to preserve quality. A reconstruction of the setup is sketched after the hyperparameter list below.

#### Training Hyperparameters

- **Training regime:** QLoRA (4-bit quantization with FP16 precision)
- **Optimizer:** AdamW 8-bit
- **Learning Rate:** $2 \times 10^{-5}$
- **Scheduler:** Linear with 5% warmup
- **Batch Size:** 8 per device (Gradient Accumulation: 4)
- **Epochs:** 2
- **LoRA Config:**
  - Rank ($r$): 16
  - Alpha ($\alpha$): 32
  - Dropout: 0.05
  - Target Modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
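
For readers who want to reproduce the run, the setup above can be expressed with the Unsloth and TRL APIs roughly as follows. This is a sketch, not the original training script: the output directory, logging/eval settings, and the exact TRL keyword names (which vary between versions) are assumptions.

```python
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer

# Load the base model in 4-bit (QLoRA-style) with Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="aitfindonesia/Bakti-8B-Base",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters using the configuration listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_ds,  # "text" field prepared as in the Training Data sketch
    eval_dataset=eval_ds,
    args=SFTConfig(
        dataset_text_field="text",
        max_seq_length=2048,
        per_device_train_batch_size=8,
        gradient_accumulation_steps=4,
        num_train_epochs=2,
        learning_rate=2e-5,
        lr_scheduler_type="linear",
        warmup_ratio=0.05,
        optim="adamw_8bit",
        fp16=True,
        output_dir="outputs",  # ASSUMPTION: not specified in the card
    ),
)
trainer.train()
```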

#### Hardware

- **GPU:** NVIDIA A100 80GB PCIe
- **VRAM Usage:** peak allocation of approx. 19 GB (about 23% of the 80 GB card), thanks to 4-bit loading.

## Evaluation

### Results

The model demonstrates strong convergence on the multi-turn dataset.

- **Final Train Loss:** $\approx 0.42$
- **Final Eval Loss:** $\approx 0.41$

*Note: On this Indonesian dataset, the model converged to lower loss values in fewer steps than the standard Qwen3-8B baseline.*

## Environmental Impact

- **Hardware Type:** NVIDIA A100 80GB
- **Compute Region:** asia-east1
- **Carbon Emitted:** 0.31

## Framework Versions

- Unsloth
- PEFT
- Transformers
- TRL