---
library_name: peft
license: apache-2.0
base_model: Qwen/Qwen3-32B
tags:
- axolotl
- base_model:adapter:Qwen/Qwen3-32B
- lora
- transformers
pipeline_tag: text-generation
model-index:
- name: outputs/qwen32b-thai
  results: []
---
<details><summary>See axolotl config</summary>

axolotl version: `0.13.0.dev0`

```yaml
adapter: lora
base_model: Qwen/Qwen3-32B
bf16: true
flash_attention: true
gradient_checkpointing: true
datasets:
  - path: /workspace/data/wangchan_fixed
    type: alpaca
    split: train
val_set_size: 0
sequence_len: 2048
train_on_inputs: false
micro_batch_size: 4
gradient_accumulation_steps: 8
optimizer: adamw_torch
learning_rate: 1.0e-4
lr_scheduler: cosine
warmup_ratio: 0.03
weight_decay: 0.01
max_grad_norm: 1.0
num_epochs: 2
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - down_proj
  - up_proj
output_dir: ./outputs/qwen32b-thai
logging_steps: 10
save_steps: 300
```

</details><br>
# Qwen3-32B Thai LoRA
This model is a fine-tuned version of [Qwen/Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B) on the [WangchanThaiInstruct](https://huggingface.co/datasets/airesearch/WangchanThaiInstruct) dataset for improved Thai language instruction-following capabilities.
## Model Description
This LoRA adapter enhances Qwen3-32B's ability to understand and respond to Thai language instructions across various domains including finance, general knowledge, creative writing, and classification tasks.
- Base Model: Qwen/Qwen3-32B
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Language: Thai (th)
- Training Loss: 0.85 → 0.55
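
The snippet below is a minimal inference sketch using 🤗 Transformers and PEFT. The adapter repository id is a placeholder (replace it with the actual repo id or a local adapter path), and the prompt follows the standard Alpaca template used during training; adjust both to your setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-32B"
adapter_id = "your-username/qwen3-32b-thai-lora"  # placeholder: actual adapter repo or local path

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

# Alpaca-style prompt, matching the (instruction, input, output) training format.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nอธิบายความแตกต่างระหว่างหุ้นและพันธบัตร\n\n### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

To deploy without the PEFT wrapper, the adapter can be folded into the base weights with `PeftModel.merge_and_unload()` and saved as a standalone checkpoint.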
## Intended Uses & Limitations

### Intended Uses
- Thai language question answering
- Thai instruction following
- Thai content generation
- Financial domain queries in Thai
### Limitations
- Performance may vary on domains not covered in the training data
- Inherits limitations of the base Qwen3-32B model
- Primarily optimized for Thai; multilingual performance may differ from base model
## Training and Evaluation Data

### Dataset
- Name: WangchanThaiInstruct
- Training Samples: ~29,000 (after filtering sequences > 2048 tokens)
- Format: Alpaca-style (instruction, input, output)
- Domains: Finance, General Knowledge, Creative Writing, Classification, Open QA, Closed QA
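
The ~29,000-sample count comes from dropping examples whose rendered text exceeds `sequence_len: 2048`. The exact preprocessing script is not included in this card (training read a locally prepared copy at `/workspace/data/wangchan_fixed`); the sketch below shows one way such a filter could be applied, assuming the Hugging Face dataset id from the citation and Alpaca-style `instruction`/`input`/`output` columns.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

MAX_LEN = 2048  # matches sequence_len in the axolotl config
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-32B")

# Assumption: the raw data exposes Alpaca-style columns (instruction / input / output).
dataset = load_dataset("airesearch/WangchanThaiInstruct", split="train")

def fits_in_context(example):
    # Roughly concatenate the fields as the Alpaca template would render them,
    # then keep only examples that fit within the training sequence length.
    text = "\n".join(example.get(k) or "" for k in ("instruction", "input", "output"))
    return len(tokenizer(text).input_ids) <= MAX_LEN

filtered = dataset.filter(fits_in_context)
print(f"Kept {len(filtered)} of {len(dataset)} examples")
```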
## Training Procedure

### Hardware
- GPU: 1x NVIDIA H200 SXM (141GB VRAM)
- Training Time: ~10 hours
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 43
- training_steps: 1444
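
For reference, the derived values above follow directly from the config: the effective batch size is the micro batch size times the gradient accumulation steps (times the number of GPUs, here 1), and the warmup step count is the warmup ratio applied to the total training steps.

```python
# Derived training quantities for this single-GPU run (values from the config above).
micro_batch_size = 4
gradient_accumulation_steps = 8
num_gpus = 1
total_steps = 1444
warmup_ratio = 0.03

effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus
warmup_steps = round(warmup_ratio * total_steps)

print(effective_batch_size)  # 32 -> matches total_train_batch_size
print(warmup_steps)          # 43 -> matches lr_scheduler_warmup_steps
```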
### Training Results
| Step | Loss |
|---|---|
| 10 | 0.85 |
| 20 | 0.78 |
| 1068 | 0.55 |
| 1444 (final) | ~0.50 |
### Framework versions
- PEFT 0.17.1
- Transformers 4.57.3
- PyTorch 2.7.1+cu126
- Datasets 4.3.0
- Tokenizers 0.22.1
## Citation
If you use this model, please cite the original dataset and base model:
```bibtex
@misc{wangchanthaiinstruct,
  title={WangchanThaiInstruct},
  author={AIResearch.in.th},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/datasets/airesearch/WangchanThaiInstruct}
}

@misc{qwen3,
  title={Qwen3 Technical Report},
  author={Qwen Team},
  year={2025},
  eprint={2505.09388},
  archivePrefix={arXiv}
}
```