---
base_model: aitfindonesia/Bakti-8B-Base
library_name: peft
license: apache-2.0
pipeline_tag: text-generation
tags:
- base_model:adapter:aitfindonesia/Bakti-8B-Base
- lora
- sft
- transformers
- unsloth
- multi-turn
- chatbot
- indonesian
---

# Model Card for SFT-Bakti-8B-Base-MultiTurn-Chatbot

## Model Details

### Model Description

This model is a fine-tuned version of **[aitfindonesia/Bakti-8B-Base](https://huggingface.co/aitfindonesia/Bakti-8B-Base)**, adapted specifically for **multi-turn conversation** in Indonesian. It was trained with the **Unsloth** library for faster, more memory-efficient training, using LoRA (Low-Rank Adaptation).

The model is optimized to retain context across multiple conversation turns, making it suitable for interview simulations, customer support, and general-purpose Indonesian assistants.

- **Developed by:** DTP Fine Tuning Team
- **Model type:** Causal Language Model (Fine-tuned Qwen2/3 architecture)
- **Language(s) (NLP):** Indonesian
- **License:** Apache 2.0
- **Finetuned from model:** aitfindonesia/Bakti-8B-Base

## Uses

### Direct Use

The model is designed for the following uses (a minimal inference sketch follows the list):
- Multi-turn chat interactions in Indonesian.
- Question Answering (QA) requiring context from previous turns.
- Roleplay interactions (e.g., interview scenarios).
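
The sketch below loads the base model, applies the LoRA adapter, and runs one multi-turn exchange. The `adapter_id` is hypothetical (substitute this repo's actual ID), and it assumes the tokenizer ships a chat template:

```python
# Sketch: multi-turn chat inference with the LoRA adapter applied to the
# base model. The adapter_id is hypothetical; swap in this repo's actual ID.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "aitfindonesia/Bakti-8B-Base"
adapter_id = "dtp-fine-tuning/SFT-Bakti-8B-Base-MultiTurn-Chatbot"  # hypothetical

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)

# Keep earlier turns in `messages` so the model can draw on prior context
# (here: an Indonesian job-interview roleplay request).
messages = [
    {"role": "user", "content": "Halo, bisakah kamu membantu simulasi wawancara kerja?"},
    {"role": "assistant", "content": "Tentu! Posisi apa yang ingin Anda simulasikan?"},
    {"role": "user", "content": "Posisi data analyst. Mulai dari pertanyaan pertama."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```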

### Out-of-Scope Use

- The model should not be relied on for factual accuracy without Retrieval-Augmented Generation (RAG); hallucinations are possible.
- Not intended for code generation tasks.

## Training Details

### Training Data

**Dataset:** `dtp-fine-tuning/dtp-multiturn-interview-valid-15k`
- **Split:** Train (90%) / Test (10%)
- **Format:** Multi-turn conversation format.
- **Max Length:** 2048 tokens
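
The dataset ID above comes from this card; a loading sketch follows, though the split seed is an assumption since the original script is not published:

```python
# Sketch: load the dataset and reproduce the 90/10 train/test split
# (the seed value is an assumption).
from datasets import load_dataset

dataset = load_dataset("dtp-fine-tuning/dtp-multiturn-interview-valid-15k", split="train")
splits = dataset.train_test_split(test_size=0.10, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
```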

### Training Procedure

The model was fine-tuned with **Unsloth** on a single NVIDIA A100 (80 GB) GPU using QLoRA: the base weights are loaded in 4-bit NF4 quantization to cut memory usage while preserving performance.
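
A minimal Unsloth loading sketch consistent with this setup (argument values other than the model ID and sequence length are assumptions):

```python
# Sketch: 4-bit (NF4) base-model loading with Unsloth for QLoRA.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="aitfindonesia/Bakti-8B-Base",
    max_seq_length=2048,   # matches the dataset max length above
    load_in_4bit=True,     # NF4 quantization via bitsandbytes
    dtype=None,            # auto-detect compute precision
)
```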

#### Training Hyperparameters

- **Training regime:** QLoRA (NF4 4-bit quantized base weights with FP16 mixed-precision training)
- **Optimizer:** AdamW 8-bit
- **Learning Rate:** $2 \times 10^{-5}$
- **Scheduler:** Linear with 5% warmup
- **Batch Size:** 8 per device (Gradient Accumulation: 4)
- **Epochs:** 2
- **LoRA Config** (mapped to code in the sketch after this list):
    - Rank ($r$): 16
    - Alpha ($\alpha$): 32
    - Dropout: 0.05
    - Target Modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
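
These settings map onto Unsloth and TRL roughly as in the sketch below. It continues from the loading sketches above; `output_dir` and the exact TRL argument names (which vary across versions) are assumptions, not the published training script:

```python
# Sketch: LoRA attachment and trainer settings matching the hyperparameters
# above. Seed, packing, and logging details are assumed.
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer

model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    args=SFTConfig(
        per_device_train_batch_size=8,
        gradient_accumulation_steps=4,   # effective batch size 32
        learning_rate=2e-5,
        lr_scheduler_type="linear",
        warmup_ratio=0.05,               # 5% warmup
        num_train_epochs=2,
        optim="adamw_bnb_8bit",          # AdamW 8-bit via bitsandbytes
        fp16=True,
        max_seq_length=2048,
        output_dir="outputs",
    ),
)
trainer.train()
```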

#### Hardware
- **GPU:** NVIDIA A100 80GB PCIe
- **VRAM Usage:** peak allocation ≈ 19 GB (≈ 23% of 80 GB), thanks to 4-bit loading.
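
For reference, a peak-VRAM figure like the one above can be read back after training with PyTorch's CUDA memory stats:

```python
# Sketch: report peak reserved VRAM as an absolute figure and as a share
# of the 80 GB card.
import torch

peak_gb = torch.cuda.max_memory_reserved() / 1024**3
print(f"Peak reserved VRAM: {peak_gb:.1f} GB ({peak_gb / 80:.0%} of 80 GB)")
```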

## Evaluation

### Results

The model demonstrates strong convergence on the multi-turn dataset.
- **Final Train Loss:** $\approx 0.42$
- **Final Eval Loss:** $\approx 0.41$
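
For intuition: assuming the reported eval loss is the mean token-level cross-entropy in nats, it corresponds to a perplexity of $e^{0.41} \approx 1.51$.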

*Note: On this specific Indonesian dataset, the model outperforms the standard Qwen3-8B baseline, reaching lower loss in fewer training steps.*

## Environmental Impact

- **Hardware Type:** NVIDIA A100 80GB
- **Compute Region:** asia-east1
- **Carbon Emitted:** 0.31 kg CO₂eq

## Framework Versions

- Unsloth
- PEFT
- Transformers
- TRL