ChindaMT-4B
ChindaMT-4B is an open-weight Thai-English machine translation model fine-tuned from Qwen/Qwen3.5-4B. It supports plain translation and instruction-following translation with auxiliary rules in the prompt.
- Task: Thai-English machine translation with instruction-following
- Base model: Qwen3.5-4B
- Parameter count: 4B
- License: Apache-2.0
Prompting
Plain translation. The same template works in both directions; swap the language line and the source tag:

```
Translate English to Thai.

EN: The weather is nice today.
```

```
Translate Thai to English.

TH: วันนี้อากาศดีมาก
```
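The plain template can be assembled with a small helper. This is an illustrative sketch; the function name and direction codes are not part of the model's API, and the blank line between the language line and the source line follows the prompt used in the Inference section below:

```python
def build_prompt(direction: str, text: str) -> str:
    """Assemble the plain translation prompt (helper name is illustrative)."""
    if direction == "en-th":
        return f"Translate English to Thai.\n\nEN: {text}"
    if direction == "th-en":
        return f"Translate Thai to English.\n\nTH: {text}"
    raise ValueError(f"unsupported direction: {direction}")

print(build_prompt("en-th", "The weather is nice today."))
```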
With instruction following. Add a Rules: block between the language line and the source line. Rules are free-form text:

```
Translate English to Thai.

Rules:
- Return only the translated text
- Use a clear, professional tone in Thai
- Keep all numerals in Arabic digits

EN: <source text>
```
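Building the instruction-following prompt programmatically is a matter of joining the three sections. A minimal sketch, with a hypothetical helper name; the exact blank-line placement between sections is an assumption based on the plain-translation template:

```python
def build_prompt_with_rules(src_lang: str, tgt_lang: str, tag: str,
                            rules: list[str], text: str) -> str:
    # Hypothetical helper: joins the language line, a Rules: block, and
    # the tagged source line into one prompt string.
    rules_block = "Rules:\n" + "\n".join(f"- {r}" for r in rules)
    return f"Translate {src_lang} to {tgt_lang}.\n\n{rules_block}\n\n{tag}: {text}"

prompt = build_prompt_with_rules(
    "English", "Thai", "EN",
    ["Return only the translated text", "Keep all numerals in Arabic digits"],
    "The invoice total is 1,250 USD.",
)
print(prompt)
```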
Inference
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "iapp/ChindaMT-4B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Wrap the translation prompt in the chat template before tokenizing.
prompt = "Translate English to Thai.\n\nEN: The weather is nice today."
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

out = model.generate(
    **inputs, max_new_tokens=1024, temperature=0.01, top_p=0.7, top_k=20,
    repetition_penalty=1.05, do_sample=True,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
Evaluation datasets
The evaluation suites used during development will be released soon:
- iapp/ChindaMT-CoreEval: 5-domain primary evaluation
- iapp/ChindaMT-BroadEval: 10-domain generalization check
Limitations
- Supports Thai-English translation only; other language pairs are unsupported.
- Behavior on out-of-domain or paragraph-length inputs is not comprehensively characterized.
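One pragmatic mitigation for paragraph-length input is to split it into sentences and translate each chunk with the plain template. A minimal sketch for English-side splitting; the naive regex is an assumption, and Thai text would need a dedicated segmenter (e.g. from PyThaiNLP), since Thai lacks consistent sentence-final punctuation:

```python
import re

def split_sentences(paragraph: str) -> list[str]:
    # Naive splitter: break after ., !, or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]

chunks = split_sentences("The weather is nice today. Shall we go out?")
# Each chunk can then be wrapped in the plain-translation template above.
print(chunks)
```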
iApp AI Research