license_name: openmdw
license_link: LICENSE
datasets:
- fka/awesome
metrics:
- accuracy
- character
pipeline_tag: text-classification
Introduction
We are excited to introduce Seed-X, a powerful series of open-source multilingual translation language models, including an instruction model, a reinforcement learning model, and a reward model. The series pushes the boundaries of translation quality at the 7-billion-parameter scale. We developed Seed-X as an accessible, off-the-shelf tool to support the community in advancing translation research and applications:
- Exceptional translation capabilities: Seed-X exhibits state-of-the-art translation capabilities, on par with or outperforming ultra-large models like Gemini-2.5, Claude-3.5, and GPT-4, as validated by human evaluations and automatic metrics.
- Deployment and inference-friendly: With a compact 7B parameter count and the Mistral architecture, Seed-X offers outstanding translation performance in a lightweight and efficient package, ideal for deployment and inference.
- Broad domain coverage: Seed-X excels on a highly challenging translation test set spanning diverse domains, including the internet, science and technology, office dialogues, e-commerce, biomedicine, finance, law, literature, and entertainment.

This repo contains the Seed-X-Instruct model, with the following features:
- Type: Causal language model
- Training Stage: Pretraining & Post-training
- Support: Multilingual translation among 28 languages
| Languages | Abbr. | Languages | Abbr. | Languages | Abbr. | Languages | Abbr. |
|---|---|---|---|---|---|---|---|
| Arabic | ar | French | fr | Malay | ms | Russian | ru |
| Czech | cs | Croatian | hr | Norwegian Bokmål | nb | Swedish | sv |
| Danish | da | Hungarian | hu | Dutch | nl | Thai | th |
| German | de | Indonesian | id | Norwegian | no | Turkish | tr |
| English | en | Italian | it | Polish | pl | Ukrainian | uk |
| Spanish | es | Japanese | ja | Portuguese | pt | Vietnamese | vi |
| Finnish | fi | Korean | ko | Romanian | ro | Chinese | zh |
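
When wiring the model into a pipeline, it can help to validate language codes before building a request. The following is a minimal sketch that copies the abbreviations from the table above into a lookup; `SUPPORTED_LANGUAGES` and `language_name` are hypothetical names introduced here for illustration and are not part of the Seed-X release.

```python
# Language codes copied from the table above (28 languages).
SUPPORTED_LANGUAGES = {
    "ar": "Arabic", "cs": "Czech", "da": "Danish", "de": "German",
    "en": "English", "es": "Spanish", "fi": "Finnish", "fr": "French",
    "hr": "Croatian", "hu": "Hungarian", "id": "Indonesian", "it": "Italian",
    "ja": "Japanese", "ko": "Korean", "ms": "Malay", "nb": "Norwegian Bokmål",
    "nl": "Dutch", "no": "Norwegian", "pl": "Polish", "pt": "Portuguese",
    "ro": "Romanian", "ru": "Russian", "sv": "Swedish", "th": "Thai",
    "tr": "Turkish", "uk": "Ukrainian", "vi": "Vietnamese", "zh": "Chinese",
}


def language_name(code: str) -> str:
    """Hypothetical helper: map a code such as 'zh' to its language name.

    Raises ValueError for codes that are not listed in the table.
    """
    key = code.strip().lower()
    if key not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {code!r}")
    return SUPPORTED_LANGUAGES[key]
```

Rejecting unknown codes early avoids sending the model a request for a language pair it was not trained on.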
Model Downloads
| Model Name | Description | Download |
|---|---|---|
| 👉 Seed-X-Instruct | Instruction-tuned for alignment with user intent. | 🤗 Model |
| Seed-X-PPO | RL trained to boost translation capabilities. | 🤗 Model |
| Seed-X-RM | Reward model to evaluate the quality of translation. | 🤗 Model |
Quickstart
Here is a simple example demonstrating how to load the model and perform translation using vLLM:
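```python
from vllm import LLM, SamplingParams

# Placeholder: set this to the local directory or Hub ID of Seed-X-Instruct.
model_path = "path/to/Seed-X-Instruct"

model = LLM(model=model_path,
            max_num_seqs=512,
            tensor_parallel_size=8,  # number of GPUs to shard across; lower it to match your hardware
            enable_prefix_caching=True,
            gpu_memory_utilization=0.95)

# The trailing <zh> tag marks the target language (Chinese).
messages = [
    "Translate the following English sentence into Chinese:\nMay the force be with you <zh>",  # without CoT
    "Translate the following English sentence into Chinese and explain it in detail:\nMay the force be with you <zh>"  # with CoT
]

# Assumed decoding settings: greedy decoding with a 512-token cap.
decoding_params = SamplingParams(temperature=0, max_tokens=512)

results = model.generate(messages, decoding_params)
responses = [res.outputs[0].text.strip() for res in results]
print(responses)
```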
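
The two example prompts share one layout: an instruction line, a newline, the source text, and a trailing target-language tag such as `<zh>`. A minimal sketch of a helper that assembles prompts in this layout is shown below. `build_prompt` is a hypothetical function, not part of Seed-X or vLLM, and it assumes the instruction names the target language in words in addition to the trailing tag.

```python
def build_prompt(text: str, source: str, target: str, target_code: str,
                 explain: bool = False) -> str:
    """Hypothetical helper that mirrors the prompt layout of the Quickstart.

    source / target are language names (e.g. "English", "Chinese");
    target_code is the abbreviation used in the trailing tag (e.g. "zh").
    Set explain=True to request the chain-of-thought (CoT) variant.
    """
    instruction = f"Translate the following {source} sentence into {target}"
    if explain:
        instruction += " and explain it in detail"
    # Instruction line, newline, source text, then the target-language tag.
    return f"{instruction}:\n{text} <{target_code}>"


# Build both variants for the same sentence.
prompts = [
    build_prompt("May the force be with you", "English", "Chinese", "zh"),
    build_prompt("May the force be with you", "English", "Chinese", "zh", explain=True),
]
```

The resulting list of strings can be passed to `generate` in place of a hand-written `messages` list.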
Evaluation
We evaluated Seed-X on a diverse set