---
base_model: aitfindonesia/Bakti-8B-Base
library_name: peft
license: apache-2.0
pipeline_tag: text-generation
tags:
- base_model:adapter:aitfindonesia/Bakti-8B-Base
- lora
- sft
- transformers
- unsloth
- multi-turn
- chatbot
- indonesian
---

# Model Card for SFT-Bakti-8B-Base-MultiTurn-Chatbot

## Model Details

### Model Description

This model is a LoRA (Low-Rank Adaptation) fine-tune of **[aitfindonesia/Bakti-8B-Base](https://huggingface.co/aitfindonesia/Bakti-8B-Base)**, built specifically for **multi-turn conversation** in Indonesian. It was trained with the **Unsloth** library for faster, more memory-efficient fine-tuning.

The model is optimized to retain context across multiple turns of conversation, making it suitable for interview simulations, customer support, and general-purpose Indonesian assistants.

- **Developed by:** DTP Fine Tuning Team
- **Model type:** Causal Language Model (fine-tuned Qwen2/3 architecture)
- **Language(s) (NLP):** Indonesian
- **License:** Apache 2.0
- **Finetuned from model:** aitfindonesia/Bakti-8B-Base
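
The LoRA adapter loads on top of the base model with `transformers` and `peft`. Below is a minimal inference sketch; the adapter repository id is an assumption inferred from this card's title and may need to be replaced with the actual repo name.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "aitfindonesia/Bakti-8B-Base"
# ASSUMPTION: adapter repo id guessed from this card's title -- replace with the real one.
ADAPTER_ID = "dtp-fine-tuning/SFT-Bakti-8B-Base-MultiTurn-Chatbot"

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_ID, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, ADAPTER_ID)  # attach the LoRA weights
model.eval()
```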

## Uses

### Direct Use

The model is designed for:

- Multi-turn chat interactions in Indonesian.
- Question Answering (QA) requiring context from previous turns.
- Roleplay interactions (e.g., interview scenarios).
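
A minimal multi-turn generation sketch covering these use cases, reusing the `model` and `tokenizer` objects loaded above. It assumes the tokenizer ships a chat template; the Indonesian example turns are purely illustrative.

```python
# Multi-turn chat: earlier turns are passed back in so the model keeps context.
messages = [
    {"role": "user", "content": "Halo, bisakah kamu berperan sebagai pewawancara kerja?"},
    {"role": "assistant", "content": "Tentu. Posisi apa yang sedang Anda lamar?"},
    {"role": "user", "content": "Posisi data analyst. Silakan mulai dari pertanyaan pertama."},
]

# ASSUMPTION: the tokenizer provides a chat template; otherwise format the prompt manually.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids=input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```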

### Out-of-Scope Use

- The model should not be relied on for factual accuracy without Retrieval-Augmented Generation (RAG), as it can hallucinate.
- Not intended for code generation tasks.

## Training Details

### Training Data

**Dataset:** `dtp-fine-tuning/dtp-multiturn-interview-valid-15k`

- **Split:** Train (90%) / Test (10%)
- **Format:** Multi-turn conversation format.
- **Max Length:** 2048 tokens
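
A sketch of how the split and formatting can be reproduced with the `datasets` library. The 90/10 split matches the card; the split seed, the `conversations` column name, and the per-message schema are assumptions, and `tokenizer` refers to the Unsloth-loaded tokenizer shown in the training sketch further down.

```python
from datasets import load_dataset

# Load the multi-turn interview dataset and create a 90/10 train/test split.
raw = load_dataset("dtp-fine-tuning/dtp-multiturn-interview-valid-15k", split="train")
splits = raw.train_test_split(test_size=0.1, seed=42)  # seed is an assumption
train_ds, eval_ds = splits["train"], splits["test"]

def to_text(example):
    # ASSUMPTION: each row stores a "conversations" list of {"role", "content"} dicts.
    return {
        "text": tokenizer.apply_chat_template(
            example["conversations"], tokenize=False, add_generation_prompt=False
        )
    }

train_ds = train_ds.map(to_text)
eval_ds = eval_ds.map(to_text)
```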

### Training Procedure

The model was fine-tuned with **Unsloth** on a single NVIDIA A100 (80 GB) GPU using QLoRA: the base weights are loaded in 4-bit (NF4) quantization to reduce memory usage, while the LoRA adapters are trained in higher precision to preserve quality. A reconstruction of the setup is sketched after the hyperparameter list below.

#### Training Hyperparameters

- **Training regime:** QLoRA (4-bit quantization with FP16 precision)
- **Optimizer:** AdamW 8-bit
- **Learning Rate:** $2 \times 10^{-5}$
- **Scheduler:** Linear with 5% warmup
- **Batch Size:** 8 per device (Gradient Accumulation: 4)
- **Epochs:** 2
- **LoRA Config:**
  - Rank ($r$): 16
  - Alpha ($\alpha$): 32
  - Dropout: 0.05
  - Target Modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
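
For readers who want to reproduce the run, the setup above can be expressed with the Unsloth and TRL APIs roughly as follows. This is a sketch, not the original training script: the output directory, logging/eval settings, and the exact TRL keyword names (which vary between versions) are assumptions.

```python
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer

# Load the base model in 4-bit (QLoRA-style) with Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="aitfindonesia/Bakti-8B-Base",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters using the configuration listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_ds,  # "text" field prepared as in the Training Data sketch
    eval_dataset=eval_ds,
    args=SFTConfig(
        dataset_text_field="text",
        max_seq_length=2048,
        per_device_train_batch_size=8,
        gradient_accumulation_steps=4,
        num_train_epochs=2,
        learning_rate=2e-5,
        lr_scheduler_type="linear",
        warmup_ratio=0.05,
        optim="adamw_8bit",
        fp16=True,
        output_dir="outputs",  # ASSUMPTION: not specified in the card
    ),
)
trainer.train()
```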

#### Hardware

- **GPU:** NVIDIA A100 80GB PCIe
- **VRAM Usage:** peak allocation of approx. 19 GB (about 23% of the 80 GB card), thanks to 4-bit loading.

## Evaluation

### Results

The model demonstrates strong convergence on the multi-turn dataset.

- **Final Train Loss:** $\approx 0.42$
- **Final Eval Loss:** $\approx 0.41$

*Note: On this Indonesian dataset, the model converged to lower loss values in fewer steps than the standard Qwen3-8B baseline.*

## Environmental Impact

- **Hardware Type:** NVIDIA A100 80GB
- **Compute Region:** asia-east1
- **Carbon Emitted:** 0.31

## Framework Versions

- Unsloth
- PEFT
- Transformers
- TRL