luoyingfeng committed · Commit a1b7c20 · verified · 1 Parent(s): e19e0dd

Update README.md

Files changed (1)
  1. README.md +59 -1
README.md CHANGED
@@ -13,4 +13,62 @@ metrics:
  base_model:
  - meta-llama/Meta-Llama-3-8B
  pipeline_tag: translation
- ---
+ ---
+
+ # LaMaTE
+
+ - **GitHub:** https://github.com/NiuTrans/LaMaTE/
+ - **Paper:** https://arxiv.org/abs/2503.06594
+
+ ## Model Description
+
+ LaMaTE is a high-performance, efficient translation model built on Llama-3-8B.
+ It uses a large language model (LLM) as the machine translation (MT) encoder, paired with a lightweight decoder.
+ An adapter bridges the LLM's representations to the decoder, and a two-stage training strategy further improves both translation quality and efficiency.
+
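+ To make that design concrete, here is a minimal, hypothetical sketch of the encoder-adapter-decoder flow. The module names, dimensions, and adapter layout are illustrative assumptions, not the repository's actual `LlamaCrossAttentionEncDec` implementation:
+
+ ```python
+ # Illustrative sketch only -- dimensions, layer counts, and adapter design
+ # are assumptions, not LaMaTE's actual implementation.
+ import torch.nn as nn
+
+ class LaMaTESketch(nn.Module):
+     def __init__(self, d_llm=4096, d_dec=1024, n_layers=8, n_heads=16):
+         super().__init__()
+         # Adapter bridging the LLM encoder's representation space to the decoder's
+         self.adapter = nn.Sequential(
+             nn.Linear(d_llm, d_dec), nn.GELU(), nn.Linear(d_dec, d_dec)
+         )
+         layer = nn.TransformerDecoderLayer(d_model=d_dec, nhead=n_heads, batch_first=True)
+         self.decoder = nn.TransformerDecoder(layer, num_layers=n_layers)
+
+     def forward(self, llm_hidden_states, tgt_embeds):
+         # llm_hidden_states: [batch, src_len, d_llm], e.g. Llama-3-8B's last hidden states
+         memory = self.adapter(llm_hidden_states)
+         # The lightweight decoder cross-attends to the adapted source states
+         return self.decoder(tgt_embeds, memory)
+ ```
+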
+ **Key Features of LaMaTE**
+ - Enhanced Efficiency: offers 2.4× to 6.5× faster decoding.
+ - Reduced Memory Usage: cuts KV cache memory consumption by 75% (see the back-of-envelope sketch below).
+ - Competitive Performance: robust results across diverse translation tasks.
+
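+ As a back-of-envelope illustration of where KV cache savings of this magnitude can come from: cache size scales linearly with the number of decoding layers, so swapping Llama-3-8B's 32 self-attention layers for a much smaller decoder shrinks it proportionally. The lightweight-decoder configuration below is an assumption chosen for illustration, not LaMaTE's actual one:
+
+ ```python
+ # KV cache size ~= 2 (K and V) * layers * kv_heads * head_dim * seq_len * batch * bytes
+ def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch=1, dtype_bytes=2):
+     return 2 * layers * kv_heads * head_dim * seq_len * batch * dtype_bytes
+
+ # Llama-3-8B: 32 layers, 8 KV heads (GQA), head_dim 128
+ llama3_8b = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=4096)
+ # Hypothetical lightweight decoder: 8 layers at the same KV width (illustrative only)
+ light_dec = kv_cache_bytes(layers=8, kv_heads=8, head_dim=128, seq_len=4096)
+ print(f"reduction: {1 - light_dec / llama3_8b:.0%}")  # -> 75% under these assumptions
+ ```
+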
+ ## Quick Start
+
+ For more detailed usage, please refer to the [GitHub repository](https://github.com/NiuTrans/LaMaTE).
+
+ **Note:** our implementation is developed with transformers v4.39.2 (`pip install transformers==4.39.2`);
+ we recommend this version for best compatibility.
+
+ To deploy LaMaTE, load it with the `from_pretrained()` method and translate with `generate()`:
+
+ ```python
+ from modeling_llama_seq2seq import LlamaCrossAttentionEncDec
+ from transformers import AutoTokenizer, AutoConfig
+
+ model_name_or_path = "path/to/LaMaTE"  # local checkpoint dir or Hugging Face repo id
+ tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
+ config = AutoConfig.from_pretrained(model_name_or_path, trust_remote_code=True)
+ model = LlamaCrossAttentionEncDec.from_pretrained(model_name_or_path, config=config)
+
+ prompt = "Translate the following text from English into Chinese.\nEnglish: The harder you work at it, the more progress you will make.\nChinese: "
+ input_ids = tokenizer(prompt, return_tensors="pt")
+ outputs_tokenized = model.generate(
+     **input_ids,
+     num_beams=5,
+     do_sample=False
+ )
+ outputs = tokenizer.batch_decode(outputs_tokenized, skip_special_tokens=True)
+ print(outputs)
+ ```
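+
+ For batched translation, the same objects should work with padded inputs. This is a hypothetical usage sketch reusing `tokenizer` and `model` from above; the pad-token fallback is an assumption, since Llama tokenizers often ship without one:
+
+ ```python
+ # Hypothetical batched usage, reusing `tokenizer` and `model` from above
+ prompts = [
+     "Translate the following text from English into Chinese.\nEnglish: Hello.\nChinese: ",
+     "Translate the following text from English into Chinese.\nEnglish: Thank you.\nChinese: ",
+ ]
+ if tokenizer.pad_token is None:  # assumption: the checkpoint may not define one
+     tokenizer.pad_token = tokenizer.eos_token
+ batch = tokenizer(prompts, return_tensors="pt", padding=True)
+ outputs = model.generate(**batch, num_beams=5, do_sample=False)
+ print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
+ ```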
+
+ ## Citation
+
+ ```bibtex
+ @misc{luoyf2025lamate,
+       title={Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation},
+       author={Yingfeng Luo and Tong Zheng and Yongyu Mu and Bei Li and Qinghong Zhang and Yongqi Gao and Ziqiang Xu and Peinan Feng and Xiaoqian Liu and Tong Xiao and Jingbo Zhu},
+       year={2025},
+       eprint={2503.06594},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL}
+ }
+ ```