
โ˜๏ธ SiamGPT-32B: A Robust Spokesperson for Linguistically Stable Thai Generation

🇹🇭 SiamGPT-32B is a fine-tuned variant of Qwen3-32B, optimized for high-fidelity Thai language generation, instruction following, and multi-turn dialogue stability. Developed by SIAM.AI, the model addresses a critical weakness of open-weight models when processing Thai: "multilingual interference", i.e. unwanted code-switching into other languages.

โ„น๏ธ Model Description

  • Model type: 32B-parameter, decoder-only instruct model based on the Qwen3 architecture.
  • Primary language(s): Thai 🇹🇭 and English 🇬🇧
  • Context length: 8,192 tokens
  • License: Apache 2.0

๐Ÿ† Achieves the highest overall score among open-weights models in the 30B-32B class on the SEA-HELM Thai benchmarks.

🚀 Key Features

  • 🗣️ Reduced code-switching: Specifically trained to suppress the injection of Chinese, Hindi, or English tokens into Thai sentences, making the output suitable for production, user-facing applications.
  • 🧠 Agentic focus: Designed as a final-response synthesizer for multi-agent systems, prioritizing strict formatting constraints and reasoning over open-ended creative writing.
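A simple post-generation guard can surface residual code-switching by checking the script composition of a reply. The snippet below is an illustrative sketch, not part of the model or its tooling: it counts characters per script and flags replies whose non-Thai, non-ASCII share exceeds a threshold.

```python
# Illustrative guard for code-switching: flag generated Thai text that
# contains an unexpectedly high share of characters from other scripts.
# This helper is hypothetical and not part of the SiamGPT release.

def script_shares(text: str) -> dict:
    """Return the fraction of Thai, ASCII, and other characters in `text`."""
    counts = {"thai": 0, "ascii": 0, "other": 0}
    letters = [ch for ch in text if not ch.isspace()]
    for ch in letters:
        if "\u0e00" <= ch <= "\u0e7f":   # Thai Unicode block
            counts["thai"] += 1
        elif ord(ch) < 128:              # ASCII: Latin letters, digits, punctuation
            counts["ascii"] += 1
        else:                            # anything else, e.g. CJK or Devanagari
            counts["other"] += 1
    total = max(len(letters), 1)
    return {k: v / total for k, v in counts.items()}

def looks_code_switched(text: str, max_other: float = 0.05) -> bool:
    """Flag text whose non-Thai, non-ASCII share exceeds `max_other`."""
    return script_shares(text)["other"] > max_other
```

In a serving pipeline, a flagged reply could be regenerated or routed to a fallback; the 5% threshold is an arbitrary example value.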

🎮 Try the Demo

Experience SiamGPT-32B in action! We use this model as the final responder in our SiamGPT Agentic Tourism platform. See for yourself how it handles complex Thai queries and multi-agent synthesis:

Chat with SiamGPT

🥇 SEA-HELM Leaderboard (Thai)

Full comparison against regional models (Typhoon 2.5 & OpenThaiGPT):

| Metric | SiamGPT-32B (ours) | Typhoon 2.5 | OpenThaiGPT R1 |
|---|---|---|---|
| Total Overall | 63.59 | 60.44 | 55.28 |
| Instruction Following | 83.00 | 79.00 | 54.00 |
| Multi-turn | 75.81 | 76.16 | 59.69 |
| NLU (Understanding) | 67.95 | 65.56 | 59.89 |
| NLG (Generation) | 42.06 | 56.70 | 54.31 |
| NLR (Reasoning) | 68.59 | 55.54 | 65.38 |
| Safety | 44.19 | 29.68 | 41.42 |

📈 Stability & Control vs. Base Model

Full ablation analysis against the Qwen3-32B baseline:

| Metric | Qwen3-32B | SiamGPT-32B | Improvement |
|---|---|---|---|
| Stability (Code-Switching) | 87.70 | 90.40 | +2.70 |
| Instruction Following (IF-Eval) | 75.47 | 83.00 | +7.53 |
| Multi-Turn Dialogue (MT-Bench) | 57.94 | 75.81 | +17.87 |
| Thai Exam (Knowledge) | 61.40 | 63.00 | +1.60 |
| NLU (Natural Language Understanding) | 59.80 | 67.95 | +8.15 |

💻 Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "siamaids/SiamGPT-32B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype="auto"
)

messages = [
    {"role": "system", "content": "You are SiamGPT, a helpful Thai AI assistant."},
    # "Please recommend some tourist attractions in Chiang Mai."
    {"role": "user", "content": "ช่วยแนะนำสถานที่เที่ยวในเชียงใหม่ให้หน่อย"}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated reply remains.
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

โšก๏ธ Deploy via vLLM

  • vllm >= 0.8.5
pip install vllm
vllm serve siamaids/SiamGPT-32B --max-model-len 8192 --reasoning-parser qwen3 --gpu-memory-utilization 0.95
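Once running, the server exposes an OpenAI-compatible API (at http://localhost:8000/v1 by default, per vLLM's documented defaults). The sketch below, using only the Python standard library, builds a chat-completions payload and parses a reply; the commented send step assumes the server from the command above is up.

```python
import json
import urllib.request

# Build an OpenAI-compatible chat-completions payload for the served model.
payload = {
    "model": "siamaids/SiamGPT-32B",
    "messages": [
        {"role": "system", "content": "You are SiamGPT, a helpful Thai AI assistant."},
        {"role": "user", "content": "สวัสดีครับ"},  # "Hello"
    ],
    "max_tokens": 512,
}

def extract_reply(response_json: dict) -> str:
    """Pull the assistant message out of a chat-completions response."""
    return response_json["choices"][0]["message"]["content"]

# To actually send the request (requires the vLLM server above to be running):
# req = urllib.request.Request(
#     "http://localhost:8000/v1/chat/completions",
#     data=json.dumps(payload).encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(extract_reply(json.load(resp)))
```

Any OpenAI-compatible client (e.g. the official `openai` Python package pointed at the local base URL) works the same way.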

💥 Deploy via SGLang

Requires sglang >= 0.4.6.post1:

```shell
pip install sglang
python -m sglang.launch_server --model-path siamaids/SiamGPT-32B --reasoning-parser qwen3
```

โš ๏ธ Limitations & Risks

  • "Translationese" Artifacts: Due to the heavy reliance on translated instruction data, the model may occasionally exhibit phrasing that, while grammatically correct, lacks the stylistic naturalness of native Thai speakers.
  • Looping in Creative Writing: The model is optimized for agentic, instruction-following tasks.When used for open-ended creative writing without strong context anchoring, it may exhibit repetition or looping.
  • Factuality: Like all LLMs, SiamGPT can hallucinate.It is recommended for use as a response synthesizer in RAG (Retrieval-Augmented Generation) pipelines where ground truth is provided in the context.
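In a RAG setup, retrieved passages can simply be packed into the system message so the model grounds its answer in them. The helper below is an illustrative sketch of that pattern; the instruction wording and message layout are our own example, not an official prompt shipped with the model.

```python
# Illustrative sketch: assemble a grounded prompt for SiamGPT from
# retrieved passages. The instruction wording is an example only.

def build_rag_messages(question: str, passages: list[str]) -> list[dict]:
    """Pack retrieved passages into the system message so the model
    answers only from the provided context."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    system = (
        "You are SiamGPT, a helpful Thai AI assistant. "
        "Answer using only the context below; if the context does not "
        "contain the answer, say you do not know.\n\n"
        f"Context:\n{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]
```

The returned list can be passed straight to `tokenizer.apply_chat_template(...)` as in the usage example above.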

📚 Citation

If you find SiamGPT useful for your work, please cite:

```bibtex
@misc{pairatsuppawat2025siamgptqualityfirstfinetuningstable,
      title={SiamGPT: Quality-First Fine-Tuning for Stable Thai Text Generation},
      author={Thittipat Pairatsuppawat and Abhibhu Tachaapornchai and Paweekorn Kusolsomboon and Chutikan Chaiwong and Thodsaporn Chay-intr and Kobkrit Viriyayudhakorn and Nongnuch Ketui and Aslan B. Wong},
      year={2025},
      eprint={2512.19455},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.19455},
}
```

๐Ÿค Follow & Support Us

๐Ÿ™ Acknowledgements

We would like to express our deepest gratitude to everyone who uses SiamGPT, provides feedback, or engages with our community. Your voices and insights are invaluable to us.

Every interaction helps us grow and improve as a Thai AI company, driving our mission to build a larger, stronger, and more capable AI ecosystem for Thailand.

Thank you for being part of this journey! 🇹🇭 ✨
