---
license_name: openmdw
license_link: LICENSE
datasets:
- fka/awesome
metrics:
- accuracy
- character
pipeline_tag: text-classification
---
## Introduction
We are excited to introduce **Seed-X**, a series of open-source multilingual translation language models comprising an instruction model, a reinforcement learning model, and a reward model. The series pushes the boundaries of translation capability at the 7-billion-parameter scale.

We develop Seed-X as an accessible, off-the-shelf tool to support the community in advancing translation research and applications:
* **Exceptional translation capabilities**: Seed-X exhibits state-of-the-art translation capabilities, on par with or outperforming ultra-large models like Gemini-2.5, Claude-3.5, and GPT-4, as validated by human evaluations and automatic metrics.
* **Deployment- and inference-friendly**: With a compact 7B parameter count and the Mistral architecture, Seed-X offers outstanding translation performance in a lightweight and efficient package, ideal for deployment and inference.
* **Broad domain coverage**: Seed-X excels on a highly challenging translation test set spanning diverse domains, including the internet, science and technology, office dialogues, e-commerce, biomedicine, finance, law, literature, and entertainment.

This repo contains the **Seed-X-Instruct** model, with the following features:
* Type: Causal language model
* Training Stage: Pretraining & Post-training
* Support: Multilingual translation among 28 languages

| Languages | Abbr. | Languages | Abbr. | Languages | Abbr. | Languages | Abbr. |
| ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- |
| Arabic | ar | French | fr | Malay | ms | Russian | ru |
| Czech | cs | Croatian | hr | Norwegian Bokmål | nb | Swedish | sv |
| Danish | da | Hungarian | hu | Dutch | nl | Thai | th |
| German | de | Indonesian | id | Norwegian | no | Turkish | tr |
| English | en | Italian | it | Polish | pl | Ukrainian | uk |
| Spanish | es | Japanese | ja | Portuguese | pt | Vietnamese | vi |
| Finnish | fi | Korean | ko | Romanian | ro | Chinese | zh |
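
When scripting batch translation, it can help to validate language codes before building requests. The following is a minimal sketch that copies the abbreviations from the table above into a lookup; the names `SUPPORTED_LANGUAGES` and `language_name` are hypothetical helpers for illustration and are not part of any Seed-X package.

```python
# Language codes copied from the table above (28 entries).
# SUPPORTED_LANGUAGES and language_name are illustrative names, not a Seed-X API.
SUPPORTED_LANGUAGES = {
    "ar": "Arabic", "cs": "Czech", "da": "Danish", "de": "German",
    "en": "English", "es": "Spanish", "fi": "Finnish", "fr": "French",
    "hr": "Croatian", "hu": "Hungarian", "id": "Indonesian", "it": "Italian",
    "ja": "Japanese", "ko": "Korean", "ms": "Malay", "nb": "Norwegian Bokmål",
    "nl": "Dutch", "no": "Norwegian", "pl": "Polish", "pt": "Portuguese",
    "ro": "Romanian", "ru": "Russian", "sv": "Swedish", "th": "Thai",
    "tr": "Turkish", "uk": "Ukrainian", "vi": "Vietnamese", "zh": "Chinese",
}


def language_name(code: str) -> str:
    """Return the language name for a code, raising ValueError if unsupported."""
    try:
        return SUPPORTED_LANGUAGES[code.lower()]
    except KeyError:
        raise ValueError(f"Unsupported language code: {code!r}") from None
```

Rejecting unknown codes early avoids sending the model a target it was not trained on.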
## Model Downloads
| Model Name | Description | Download |
| ----------- | ----------- | ----------- |
| 👉 **Seed-X-Instruct** | Instruction-tuned for alignment with user intent. | 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-Instruct-7B) |
| Seed-X-PPO | RL trained to boost translation capabilities. | 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-PPO-7B) |
| Seed-X-RM | Reward model to evaluate the quality of translation. | 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-RM-7B) |
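
One way to fetch the weights ahead of time is `snapshot_download` from the `huggingface_hub` package. This is a sketch under assumptions: the repo IDs are read off the download links in the table above, while `REPO_IDS` and `download_model` are hypothetical names introduced here for illustration.

```python
from typing import Optional

# Repo IDs taken from the download links in the table above.
# REPO_IDS and download_model are illustrative names, not part of Seed-X.
REPO_IDS = {
    "Seed-X-Instruct": "ByteDance-Seed/Seed-X-Instruct-7B",
    "Seed-X-PPO": "ByteDance-Seed/Seed-X-PPO-7B",
    "Seed-X-RM": "ByteDance-Seed/Seed-X-RM-7B",
}


def download_model(name: str, local_dir: Optional[str] = None) -> str:
    """Download a Seed-X checkpoint and return the local directory path."""
    # Imported lazily so the mapping above is usable without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_IDS[name], local_dir=local_dir)
```

For example, `download_model("Seed-X-Instruct")` returns a path that can then be passed to the inference engine as the model location.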
## Quickstart
Here is a simple example demonstrating how to load the model and perform translation using `vllm`:
```python
from vllm import LLM, SamplingParams

# Hub ID of this repo's model; a local path to downloaded weights also works.
model_path = "ByteDance-Seed/Seed-X-Instruct-7B"

model = LLM(model=model_path,
            max_num_seqs=512,
            tensor_parallel_size=8,  # number of GPUs to shard across; match your hardware
            enable_prefix_caching=True,
            gpu_memory_utilization=0.95)

messages = [
    "Translate the following English sentence into Chinese:\nMay the force be with you <zh>",  # without CoT
    "Translate the following English sentence into Chinese and explain it in detail:\nMay the force be with you <zh>"  # with CoT
]

# Decoding settings were not defined in the original snippet; greedy decoding
# with a 512-token cap is an assumed, illustrative choice.
decoding_params = SamplingParams(temperature=0, max_tokens=512)

results = model.generate(messages, decoding_params)
responses = [res.outputs[0].text.strip() for res in results]

print(responses)
```
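
The two prompts above share a pattern: an instruction line, the source text on a new line, and a trailing target-language tag such as `<zh>`. The sketch below builds such prompts for other language pairs. It assumes the pattern generalizes beyond the English-to-Chinese example and that the instruction names the target language ("into Chinese"); `build_prompt` is a hypothetical helper, not part of vLLM or Seed-X.

```python
def build_prompt(text: str, source_language: str, target_language: str,
                 target_code: str, with_cot: bool = False) -> str:
    """Build a translation prompt following the Quickstart pattern.

    Assumption: the English-to-Chinese template carries over to other
    language pairs unchanged. with_cot appends the explanation request.
    """
    instruction = (
        f"Translate the following {source_language} sentence "
        f"into {target_language}"
    )
    if with_cot:
        instruction += " and explain it in detail"
    # The source text goes on its own line, followed by the target tag.
    return f"{instruction}:\n{text} <{target_code}>"
```

For instance, `build_prompt("May the force be with you", "English", "German", "de")` yields a prompt ending in `<de>`; whether other pairs respond equally well to this exact wording should be checked against the model's outputs.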
## Evaluation
We evaluated Seed-X on a diverse set