Tags: Translation · Transformers · Safetensors · qwen3 · text-generation · text-generation-inference

Update metadata (library_name, pipeline_tag, language) and add paper abstract

#1 by nielsr (HF Staff) · opened

Files changed (1): README.md (+16 −8)
README.md CHANGED

@@ -1,4 +1,6 @@
 ---
+base_model:
+- Qwen/Qwen3-0.6B-Base
 language:
 - en
 - zh
@@ -60,10 +62,9 @@ language:
 - ur
 - uz
 - yue
-base_model:
-- Qwen/Qwen3-0.6B-Base
 license: apache-2.0
-pipeline_tag: translation
+pipeline_tag: text-generation
+library_name: transformers
 ---
 
 ## LMT
@@ -71,7 +72,12 @@ pipeline_tag: translation
 - Github: [LMT](https://github.com/NiuTrans/LMT)
 
 **LMT-60** is a suite of **Chinese-English-centric** MMT models trained on **90B tokens** of mixed monolingual and bilingual data, covering **60 languages across 234 translation directions** and achieving **SOTA performance** among models with similar language coverage.
-We release both the CPT and SFT versions of LMT-60 in four sizes (0.6B/1.7B/4B/8B). All checkpoints are available:
+We release both the CPT and SFT versions of LMT-60 in four sizes (0.6B/1.7B/4B/8B).
+
+## Abstract
+Large language models have significantly advanced Multilingual Machine Translation (MMT), yet broad language coverage, consistent translation quality, and English-centric bias remain open challenges. To address these challenges, we introduce **LMT**, a suite of **L**arge-scale **M**ultilingual **T**ranslation models centered on both Chinese and English, covering 60 languages and 234 translation directions. During development, we identify a previously overlooked phenomenon of **directional degeneration**, in which symmetric multi-way fine-tuning data overemphasize reverse directions (X $\to$ En/Zh), leading to excessive many-to-one mappings and degraded translation quality. We propose **Strategic Downsampling**, a simple yet effective method to mitigate this degeneration. In addition, we design **Parallel Multilingual Prompting (PMP)**, which leverages typologically related auxiliary languages to enhance cross-lingual transfer. Through rigorous data curation and refined adaptation strategies, LMT achieves SOTA performance among models of comparable language coverage, with our 4B model (LMT-60-4B) surpassing the much larger Aya-101-13B and NLLB-54B by a substantial margin. We release LMT in four sizes (0.6B/1.7B/4B/8B) to catalyze future research and provide strong baselines for inclusive, scalable, and high-quality MMT.
+
+All checkpoints are available:
 | Models | Model Link |
 |:------------|:------------|
 | LMT-60-0.6B-Base | [NiuTrans/LMT-60-0.6B-Base](https://huggingface.co/NiuTrans/LMT-60-0.6B-Base) |
@@ -95,7 +101,9 @@ model_name = "NiuTrans/LMT-60-8B"
 tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side='left')
 model = AutoModelForCausalLM.from_pretrained(model_name)
 
-prompt = "Translate the following text from English into Chinese.\nEnglish: The concept came from China where plum blossoms were the flower of choice.\nChinese: "
+prompt = """Translate the following text from English into Chinese.
+English: The concept came from China where plum blossoms were the flower of choice.
+Chinese: """
 messages = [{"role": "user", "content": prompt}]
 text = tokenizer.apply_chat_template(
     messages,
@@ -105,7 +113,7 @@ text = tokenizer.apply_chat_template(
 model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
 
 generated_ids = model.generate(**model_inputs, max_new_tokens=512, num_beams=5, do_sample=False)
-output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
+output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
 
 outputs = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
 
@@ -125,12 +133,12 @@ print("response:", outputs)
 If you find our paper useful for your research, please kindly cite our paper:
 ```bash
 @misc{luoyf2025lmt,
-title={Beyond English: Toward Inclusive and Scalable Multilingual Machine Translation with LLMs},
+title={Beyond English: Toward Inclusive and Scalable Multilingual Machine Translation with LLMs},
 author={Yingfeng Luo, Ziqiang Xu, Yuxuan Ouyang, Murun Yang, Dingyang Lin, Kaiyan Chang, Tong Zheng, Bei Li, Peinan Feng, Quan Du, Tong Xiao, Jingbo Zhu},
 year={2025},
 eprint={2511.07003},
 archivePrefix={arXiv},
 primaryClass={cs.CL},
-url={https://arxiv.org/abs/2511.07003},
+url={https://arxiv.org/abs/2511.07003},
 }
 ```
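A note on the decoding step the diff touches: `model.generate()` returns the prompt token ids followed by the newly generated ids, so the README slices off the prompt length before decoding. A minimal sketch of that slicing, with plain Python lists standing in for tensors (the token ids below are made up for illustration, not real Qwen vocabulary ids):

```python
# generate() output begins with the prompt ids; keep only what comes after them.
input_ids = [151644, 872, 198, 105]             # chat-formatted prompt tokens (illustrative)
generated = input_ids + [108386, 3837, 151645]  # hypothetical generate() output: prompt + new tokens
output_ids = generated[len(input_ids):]         # same slice as in the README snippet
print(output_ids)  # -> [108386, 3837, 151645]
```

The same slice works on a real `generated_ids[0]` tensor, since PyTorch tensors support the identical `[start:]` indexing.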