---
base_model: aitfindonesia/Bakti-8B-Base
library_name: peft
license: apache-2.0
pipeline_tag: text-generation
tags:
- base_model:adapter:aitfindonesia/Bakti-8B-Base
- lora
- sft
- transformers
- unsloth
- multi-turn
- chatbot
- indonesian
---

# Model Card for SFT-Bakti-8B-Base-MultiTurn-Chatbot

## Model Details

### Model Description

This model is a fine-tuned version of **[aitfindonesia/Bakti-8B-Base](https://huggingface.co/aitfindonesia/Bakti-8B-Base)** designed specifically for **multi-turn conversational capabilities** in Indonesian. It was trained with the **Unsloth** library for faster, more memory-efficient training, using LoRA (Low-Rank Adaptation). The model is optimized to retain context across multiple turns of conversation, making it suitable for interview simulations, customer support, and general-purpose Indonesian assistants.

- **Developed by:** DTP Fine Tuning Team
- **Model type:** Causal language model (fine-tuned Qwen2/3 architecture)
- **Language(s) (NLP):** Indonesian
- **License:** Apache 2.0
- **Finetuned from model:** aitfindonesia/Bakti-8B-Base

## Uses

### Direct Use

The model is designed for:

- Multi-turn chat interactions in Indonesian.
- Question answering (QA) that requires context from previous turns.
- Roleplay interactions (e.g., interview scenarios).

### Out-of-Scope Use

- The model should not be relied on for factual accuracy without RAG (Retrieval-Augmented Generation), as it can hallucinate.
- Not intended for code generation tasks.

## Training Details

### Training Data

**Dataset:** `dtp-fine-tuning/dtp-multiturn-interview-valid-15k`

- **Split:** Train (90%) / Test (10%)
- **Format:** Multi-turn conversation format
- **Max Length:** 2048 tokens

### Training Procedure

The model was fine-tuned with **Unsloth** on a single NVIDIA A100 (80 GB) GPU using QLoRA: the base weights are loaded in 4-bit (NF4) quantization to reduce memory usage while maintaining performance.

#### Training Hyperparameters

- **Training regime:** QLoRA (4-bit NF4 base weights, FP16 compute)
- **Optimizer:** AdamW 8-bit
- **Learning Rate:** $2 \times 10^{-5}$
- **Scheduler:** Linear with 5% warmup
- **Batch Size:** 8 per device (gradient accumulation: 4, for an effective batch size of 32)
- **Epochs:** 2
- **LoRA Config:**
  - Rank ($r$): 16
  - Alpha ($\alpha$): 32
  - Dropout: 0.05
  - Target Modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`

#### Hardware

- **GPU:** NVIDIA A100 80GB PCIe
- **VRAM Usage:** Peak allocation of approx. 19 GB (about 23% of the 80 GB card), thanks to 4-bit loading.

## Evaluation

### Results

The model demonstrates strong convergence on the multi-turn dataset.

- **Final Train Loss:** $\approx 0.42$
- **Final Eval Loss:** $\approx 0.41$

*Note: On this specific Indonesian dataset, the model reaches lower loss values faster than the standard Qwen3-8B baseline.*

## Environmental Impact

- **Hardware Type:** NVIDIA A100 80GB
- **Compute Region:** asia-east1
- **Carbon Emitted:** 0.31 kg CO₂eq

## Framework Versions

- Unsloth
- PEFT
- Transformers
- TRL
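
## How to Get Started with the Model

A minimal multi-turn inference sketch using `transformers` and `peft` is shown below. The adapter repo id (`dtp-fine-tuning/SFT-Bakti-8B-Base-MultiTurn-Chatbot`) is assumed from this card's title and may differ; the snippet also assumes the tokenizer ships a chat template (if the base tokenizer lacks one, load the tokenizer from the adapter repo instead).

```python
# Hedged sketch: the adapter repo id below is an assumption based on this card's title.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "aitfindonesia/Bakti-8B-Base"
adapter_id = "dtp-fine-tuning/SFT-Bakti-8B-Base-MultiTurn-Chatbot"  # assumed id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA adapter

# Multi-turn chat: re-send the full history each turn so the model retains context.
messages = [
    {"role": "user", "content": "Halo! Bisakah kamu membantu simulasi wawancara kerja?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
reply = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(reply)

# Append the reply before sending the next user turn to preserve conversation state.
messages.append({"role": "assistant", "content": reply})
```

Context retention works by replaying the full `messages` history on every turn, so always append the assistant reply before adding the next user message.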
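
## Reproducing the Training Setup (Sketch)

The snippet below reconstructs the QLoRA configuration listed under Training Hyperparameters using Unsloth and TRL. It is a sketch rather than the team's actual training script: exact argument names vary across TRL/Unsloth versions, and the dataset's split names and column layout (plus any chat-template formatting step) are assumptions.

```python
# Hedged sketch of the QLoRA setup described above, not the original training script.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load the base model with 4-bit NF4 quantized weights to reduce memory (QLoRA).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="aitfindonesia/Bakti-8B-Base",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters matching the card: r=16, alpha=32, dropout=0.05.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Split name and conversation column format are assumptions; the multi-turn data
# may need a formatting step that applies the tokenizer's chat template.
dataset = load_dataset("dtp-fine-tuning/dtp-multiturn-interview-valid-15k")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset["train"],
    args=SFTConfig(
        per_device_train_batch_size=8,
        gradient_accumulation_steps=4,  # effective batch size of 32
        num_train_epochs=2,
        learning_rate=2e-5,
        lr_scheduler_type="linear",
        warmup_ratio=0.05,
        optim="adamw_8bit",
        fp16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```

The 8-bit AdamW optimizer and 4-bit base weights are what keep peak VRAM near the reported 19 GB on a single A100.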