---
base_model:
- aitfindonesia/Bakti-8B-Base
- aitfindonesia/KominfoUB-8B-Base
library_name: peft
license: apache-2.0
pipeline_tag: text-generation
tags:
- base_model:adapter:aitfindonesia/Bakti-8B-Base
- lora
- sft
- transformers
- unsloth
- multi-turn
- chatbot
- indonesian
datasets:
- dtp-fine-tuning/dtp-multiturn-interview-valid-15k
language:
- id
---

# Model Card for SFT-Bakti-8B-Base-MultiTurn-Chatbot

## Model Details

### Model Description

This model is a fine-tuned version of **[aitfindonesia/Bakti-8B-Base](https://huggingface.co/aitfindonesia/Bakti-8B-Base)**, built specifically for **multi-turn conversation** in Indonesian. It was trained with the **Unsloth** library for faster, more memory-efficient training, using LoRA (Low-Rank Adaptation).

The model is optimized to retain context across multiple conversation turns, making it suitable for interview simulations, customer support, and general-purpose Indonesian assistants.

- **Developed by:** DTP Fine Tuning Team
- **Model type:** Causal language model (fine-tuned Qwen2/3 architecture)
- **Language(s) (NLP):** Indonesian
- **License:** Apache 2.0
- **Finetuned from model:** aitfindonesia/Bakti-8B-Base

## Uses

### Direct Use

The model is designed for the following (a usage sketch follows the list):
- Multi-turn chat interactions in Indonesian.
- Question answering (QA) that requires context from previous turns.
- Roleplay interactions (e.g., interview scenarios).

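A minimal inference sketch with Transformers and PEFT is shown below. The adapter repository id is a placeholder assumption (this card does not state the published path), and the tokenizer is assumed to ship with a chat template; adjust both to match the actual release.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "aitfindonesia/Bakti-8B-Base"
# Hypothetical adapter repo id -- replace with the actual published path.
adapter_id = "dtp-fine-tuning/SFT-Bakti-8B-Base-MultiTurn-Chatbot"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)

# Multi-turn usage: keep earlier turns in the message list so the model can
# resolve references to previous questions and answers.
messages = [
    {"role": "user", "content": "Halo! Tolong wawancarai saya untuk posisi data analyst."},
    {"role": "assistant", "content": "Baik. Pertama, ceritakan pengalaman Anda mengolah data."},
    {"role": "user", "content": "Saya pernah membangun dashboard penjualan. Pertanyaan berikutnya?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
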
### Out-of-Scope Use

- Do not rely on the model for factual accuracy without Retrieval-Augmented Generation (RAG); like any LLM, it can hallucinate.
- Not intended for code generation tasks.

## Training Details

### Training Data

**Dataset:** `dtp-fine-tuning/dtp-multiturn-interview-valid-15k`
- **Split:** Train (90%) / Test (10%) (see the loading sketch below)
- **Format:** Multi-turn conversation format.
- **Max length:** 2048 tokens

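The split can be reproduced roughly as follows with the `datasets` library; the 90/10 proportions come from this card, while the seed is an assumption.

```python
from datasets import load_dataset

# Load the multi-turn interview dataset named above.
dataset = load_dataset("dtp-fine-tuning/dtp-multiturn-interview-valid-15k", split="train")

# 90/10 train/test split as described in this card; the seed is an assumption.
splits = dataset.train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
```
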
### Training Procedure

The model was fine-tuned with **Unsloth** on a single NVIDIA A100 (80 GB) GPU using QLoRA: the base weights are loaded in 4-bit NF4 quantization to reduce memory usage while preserving quality.

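A minimal sketch of the 4-bit loading step, assuming Unsloth's standard `FastLanguageModel` API; the exact arguments used for this run are not published here.

```python
from unsloth import FastLanguageModel

# Load the base model in 4-bit (NF4 via bitsandbytes) for QLoRA fine-tuning.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="aitfindonesia/Bakti-8B-Base",
    max_seq_length=2048,  # matches the dataset's max length above
    load_in_4bit=True,    # keeps peak VRAM low (~19 GB on the A100, see Hardware)
)
```
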
#### Training Hyperparameters

The run used the following settings; a configuration sketch follows the list.

- **Training regime:** QLoRA (4-bit quantization with FP16 precision)
- **Optimizer:** AdamW 8-bit
- **Learning rate:** $$2 \times 10^{-5}$$
- **Scheduler:** Linear with 5% warmup
- **Batch size:** 8 per device (gradient accumulation: 4)
- **Epochs:** 2
- **LoRA config:**
  - Rank ($$r$$): 16
  - Alpha ($$\alpha$$): 32
  - Dropout: 0.05
  - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`

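Continuing from the loading sketch above (`model`, `tokenizer`) and the dataset split (`train_ds`, `eval_ds`), the hyperparameters map onto Unsloth's `get_peft_model` and TRL's trainer roughly as follows. This is a sketch: `SFTConfig` is the newer TRL API (older releases take `TrainingArguments`), argument names vary across TRL versions, and any chat-template preprocessing of the conversations is omitted.

```python
from trl import SFTTrainer, SFTConfig
from unsloth import FastLanguageModel

# Attach LoRA adapters with the configuration listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    args=SFTConfig(
        per_device_train_batch_size=8,
        gradient_accumulation_steps=4,  # effective batch size of 32
        learning_rate=2e-5,
        lr_scheduler_type="linear",
        warmup_ratio=0.05,
        num_train_epochs=2,
        optim="adamw_8bit",
        fp16=True,
        max_seq_length=2048,
        output_dir="outputs",
    ),
)
trainer.train()
```
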
#### Hardware

- **GPU:** NVIDIA A100 80GB PCIe
- **VRAM usage:** Peak allocation of approx. 19 GB (23% utilization), thanks to 4-bit loading.

## Evaluation

### Results

The model demonstrates strong convergence on the multi-turn dataset:
- **Final train loss:** $$\approx 0.42$$
- **Final eval loss:** $$\approx 0.41$$

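Interpreting the loss as mean next-token cross-entropy in nats (the standard SFT objective; an assumption, since the exact metric is not stated here), the final eval loss corresponds to a held-out perplexity of about $$e^{0.41} \approx 1.51$$.
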
*Note: The model outperforms the standard Qwen3-8B baseline on this specific Indonesian dataset, reaching lower loss values in fewer training steps.*

## Environmental Impact

- **Hardware Type:** NVIDIA A100 80GB
- **Compute Region:** asia-east1
- **Carbon Emitted:** 0.31 kg CO₂eq

## Frameworks

- Unsloth
- PEFT
- Transformers
- TRL