---
base_model: aitfindonesia/Bakti-8B-Base
library_name: peft
license: apache-2.0
pipeline_tag: text-generation
tags:
- base_model:adapter:aitfindonesia/Bakti-8B-Base
- lora
- sft
- transformers
- unsloth
- multi-turn
- chatbot
- indonesian
---

# Model Card for SFT-Bakti-8B-Base-MultiTurn-Chatbot

## Model Details

### Model Description

This model is a fine-tuned version of **[aitfindonesia/Bakti-8B-Base](https://huggingface.co/aitfindonesia/Bakti-8B-Base)** designed specifically for **multi-turn conversational capabilities** in Indonesian. It was trained with the **Unsloth** library for faster, more memory-efficient training, using LoRA (Low-Rank Adaptation). The model is optimized to retain context across multiple turns of conversation, making it suitable for interview simulations, customer support, and general-purpose Indonesian assistants.

- **Developed by:** DTP Fine Tuning Team
- **Model type:** Causal language model (fine-tuned Qwen2/3 architecture)
- **Language(s) (NLP):** Indonesian
- **License:** Apache 2.0
- **Finetuned from model:** aitfindonesia/Bakti-8B-Base

## Uses

### Direct Use

The model is designed for:

- Multi-turn chat interactions in Indonesian.
- Question answering (QA) that requires context from previous turns.
- Roleplay interactions (e.g., interview scenarios).

### Out-of-Scope Use

- The model should not be relied on for factual accuracy without RAG (Retrieval-Augmented Generation), as it can hallucinate.
- Not intended for code generation tasks.

## Training Details

### Training Data

**Dataset:** `dtp-fine-tuning/dtp-multiturn-interview-valid-15k`

- **Split:** Train (90%) / Test (10%)
- **Format:** Multi-turn conversation format
- **Max Length:** 2048 tokens

### Training Procedure

The model was fine-tuned with **Unsloth** on a single NVIDIA A100 (80 GB) GPU using QLoRA: the base weights are loaded in 4-bit (NF4) quantization to reduce memory usage while maintaining performance.

#### Training Hyperparameters

- **Training regime:** QLoRA (4-bit NF4 base weights, FP16 compute)
- **Optimizer:** AdamW 8-bit
- **Learning Rate:** $2 \times 10^{-5}$
- **Scheduler:** Linear with 5% warmup
- **Batch Size:** 8 per device (gradient accumulation: 4, for an effective batch size of 32)
- **Epochs:** 2
- **LoRA Config:**
  - Rank ($r$): 16
  - Alpha ($\alpha$): 32
  - Dropout: 0.05
  - Target Modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`

#### Hardware

- **GPU:** NVIDIA A100 80GB PCIe
- **VRAM Usage:** Peak allocation of approx. 19 GB (about 23% of the 80 GB card), thanks to 4-bit loading.

## Evaluation

### Results

The model demonstrates strong convergence on the multi-turn dataset.

- **Final Train Loss:** $\approx 0.42$
- **Final Eval Loss:** $\approx 0.41$

*Note: On this specific Indonesian dataset, the model reaches lower loss values faster than the standard Qwen3-8B baseline.*

## Environmental Impact

- **Hardware Type:** NVIDIA A100 80GB
- **Compute Region:** asia-east1
- **Carbon Emitted:** 0.31 kg CO₂eq

## Framework Versions

- Unsloth
- PEFT
- Transformers
- TRL
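
## How to Get Started with the Model

A minimal multi-turn inference sketch using `transformers` and `peft` is shown below. The adapter repo id (`dtp-fine-tuning/SFT-Bakti-8B-Base-MultiTurn-Chatbot`) is assumed from this card's title and may differ; the snippet also assumes the tokenizer ships a chat template (if the base tokenizer lacks one, load the tokenizer from the adapter repo instead).

```python
# Hedged sketch: the adapter repo id below is an assumption based on this card's title.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "aitfindonesia/Bakti-8B-Base"
adapter_id = "dtp-fine-tuning/SFT-Bakti-8B-Base-MultiTurn-Chatbot"  # assumed id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA adapter

# Multi-turn chat: re-send the full history each turn so the model retains context.
messages = [
    {"role": "user", "content": "Halo! Bisakah kamu membantu simulasi wawancara kerja?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
reply = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(reply)

# Append the reply before sending the next user turn to preserve conversation state.
messages.append({"role": "assistant", "content": reply})
```

Context retention works by replaying the full `messages` history on every turn, so always append the assistant reply before adding the next user message.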
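
## Reproducing the Training Setup (Sketch)

The snippet below reconstructs the QLoRA configuration listed under Training Hyperparameters using Unsloth and TRL. It is a sketch rather than the team's actual training script: exact argument names vary across TRL/Unsloth versions, and the dataset's split names and column layout (plus any chat-template formatting step) are assumptions.

```python
# Hedged sketch of the QLoRA setup described above, not the original training script.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load the base model with 4-bit NF4 quantized weights to reduce memory (QLoRA).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="aitfindonesia/Bakti-8B-Base",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters matching the card: r=16, alpha=32, dropout=0.05.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Split name and conversation column format are assumptions; the multi-turn data
# may need a formatting step that applies the tokenizer's chat template.
dataset = load_dataset("dtp-fine-tuning/dtp-multiturn-interview-valid-15k")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset["train"],
    args=SFTConfig(
        per_device_train_batch_size=8,
        gradient_accumulation_steps=4,  # effective batch size of 32
        num_train_epochs=2,
        learning_rate=2e-5,
        lr_scheduler_type="linear",
        warmup_ratio=0.05,
        optim="adamw_8bit",
        fp16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```

The 8-bit AdamW optimizer and 4-bit base weights are what keep peak VRAM near the reported 19 GB on a single A100.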