---
license_name: openmdw
license_link: LICENSE
datasets:
- fka/awesome
metrics:
- accuracy
- character
pipeline_tag: translation
---

## Introduction
We are excited to introduce **Seed-X**, a powerful series of open-source multilingual translation language models, including an instruction model, a reinforcement-learning model, and a reward model. The series pushes the boundaries of translation capability within 7 billion parameters.
We developed Seed-X as an accessible, off-the-shelf tool to support the community in advancing translation research and applications:
* **Exceptional translation capabilities**: Seed-X exhibits state-of-the-art translation capabilities, on par with or outperforming ultra-large models like Gemini-2.5, Claude-3.5, and GPT-4, as validated by human evaluations and automatic metrics.
* **Deployment- and inference-friendly**: With a compact 7B parameter count and a Mistral architecture, Seed-X offers outstanding translation performance in a lightweight, efficient package that is ideal for deployment and inference.
* **Broad domain coverage**: Seed-X excels on a highly challenging translation test set spanning diverse domains, including the internet, science and technology, office dialogues, e-commerce, biomedicine, finance, law, literature, and entertainment.
![performance](imgs/model_comparsion.png)

This repo contains the **Seed-X-Instruct** model, with the following features:
* Type: Causal language models
* Training Stage: Pretraining & Post-training
* Support: Multilingual translation among 28 languages

| Languages  | Abbr. | Languages  | Abbr. | Languages  | Abbr. | Languages  | Abbr. |
| ----------- | ----------- |-----------|-----------|-----------|-----------| -----------|-----------|
| Arabic  | ar | French     | fr | Malay            | ms | Russian    | ru |
| Czech   | cs | Croatian   | hr | Norwegian Bokmål | nb | Swedish    | sv |
| Danish  | da | Hungarian  | hu | Dutch            | nl | Thai       | th |
| German  | de | Indonesian | id | Norwegian        | no | Turkish    | tr |
| English | en | Italian    | it | Polish           | pl | Ukrainian  | uk |
| Spanish | es | Japanese   | ja | Portuguese       | pt | Vietnamese | vi |
| Finnish | fi | Korean     | ko | Romanian         | ro | Chinese    | zh |
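As the Quickstart example shows, the target language is selected by appending its abbreviation from the table above as a tag such as `<zh>` at the end of the prompt. The helper below is a hypothetical sketch (`build_prompt` is not part of any official Seed-X API); its template mirrors the prompt format used in the Quickstart section.

```python
# Hypothetical helper (not part of the official Seed-X API): builds a
# translation prompt ending in the target-language tag, e.g. "<zh>",
# following the prompt format shown in the Quickstart section.
SUPPORTED_LANGS = {
    "ar", "fr", "ms", "ru", "cs", "hr", "nb", "sv",
    "da", "hu", "nl", "th", "de", "id", "no", "tr",
    "en", "it", "pl", "uk", "es", "ja", "pt", "vi",
    "fi", "ko", "ro", "zh",
}

def build_prompt(text: str, target: str) -> str:
    """Return a Seed-X-style translation prompt for one of the 28 languages."""
    if target not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported target language: {target}")
    return f"Translate the following English sentence:\n{text} <{target}>"

print(build_prompt("May the force be with you", "zh"))
```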

## Model Downloads
| Model Name  | Description | Download |
| ----------- | ----------- |-----------|
| 👉 **Seed-X-Instruct**  | Instruction-tuned for alignment with user intent. |🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-Instruct-7B)|
| Seed-X-PPO | RL trained to boost translation capabilities.     | 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-PPO-7B)|
|Seed-X-RM | Reward model to evaluate the quality of translation.|  🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-RM-7B)| 

## Quickstart
Here is a simple example demonstrating how to load the model and perform translation using `vllm`:
```python
from vllm import LLM, SamplingParams

model_path = "ByteDance-Seed/Seed-X-Instruct-7B"  # or a local checkpoint path

model = LLM(model=model_path,
            max_num_seqs=512,
            tensor_parallel_size=8,
            enable_prefix_caching=True,
            gpu_memory_utilization=0.95)

messages = [
    "Translate the following English sentence:\nMay the force be with you <zh>",  # without CoT
    "Translate the following English sentence and explain it in detail:\nMay the force be with you <zh>",  # with CoT
]

# Decoding settings are illustrative; tune them for your use case.
decoding_params = SamplingParams(temperature=0, max_tokens=512, skip_special_tokens=True)

results = model.generate(messages, decoding_params)
responses = [res.outputs[0].text.strip() for res in results]

print(responses)
```
## Evaluation
We evaluated Seed-X on a diverse set