---
license: mit
datasets:
- NiuTrans/ComMT
language:
- en
- zh
- de
- cs
metrics:
- bleu
- comet
base_model:
- meta-llama/Meta-Llama-3-8B
pipeline_tag: translation
---

# LaMaTE

- **GitHub:** https://github.com/NiuTrans/LaMaTE/
- **Paper:** https://arxiv.org/abs/2503.06594

## Model Description

LaMaTE is a high-performance, efficient translation model built on Llama-3-8B. It uses a large language model (LLM) as the machine translation (MT) encoder, paired with a lightweight decoder. An adapter bridges the LLM representations to the decoder, and a two-stage training strategy is used to enhance both performance and efficiency.
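
The data flow is a straightforward encoder-adapter-decoder stack. The sketch below is a minimal illustration in plain PyTorch; the module names, layer counts, and sizes (`Adapter`, `LightweightDecoder`, an 8-layer decoder, etc.) are invented for illustration and are not the released implementation:

```python
# Minimal sketch of the LaMaTE-style data flow: LLM encoder -> adapter ->
# lightweight decoder. All module names, layer counts, and sizes below are
# illustrative assumptions, not the released implementation.
import torch
import torch.nn as nn

LLM_DIM, DEC_DIM, VOCAB, HEADS = 4096, 1024, 32000, 8  # illustrative sizes

class Adapter(nn.Module):
    """Bridges LLM hidden states into the decoder's representation space."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(LLM_DIM, DEC_DIM), nn.GELU(), nn.Linear(DEC_DIM, DEC_DIM)
        )

    def forward(self, enc_hidden):
        return self.proj(enc_hidden)

class LightweightDecoder(nn.Module):
    """A small Transformer decoder that cross-attends to the adapted states."""
    def __init__(self, num_layers=8):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DEC_DIM)
        layer = nn.TransformerDecoderLayer(d_model=DEC_DIM, nhead=HEADS,
                                           batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=num_layers)
        self.lm_head = nn.Linear(DEC_DIM, VOCAB)

    def forward(self, tgt_ids, memory):
        tgt = self.embed(tgt_ids)
        # Causal mask so each target position only attends to earlier positions
        t = tgt_ids.size(1)
        mask = torch.triu(torch.full((t, t), float("-inf")), diagonal=1)
        return self.lm_head(self.decoder(tgt, memory, tgt_mask=mask))

def forward_pass(llm_encoder, adapter, decoder, src_ids, tgt_ids):
    enc_hidden = llm_encoder(src_ids)  # [B, S, LLM_DIM], encoded once by the LLM
    memory = adapter(enc_hidden)       # [B, S, DEC_DIM], bridged to decoder space
    return decoder(tgt_ids, memory)    # [B, T, VOCAB] logits from the small decoder

# Smoke test with a stand-in "LLM" that just embeds tokens:
dummy_llm = nn.Embedding(VOCAB, LLM_DIM)
src = torch.randint(0, VOCAB, (2, 16))
tgt = torch.randint(0, VOCAB, (2, 12))
print(forward_pass(dummy_llm, Adapter(), LightweightDecoder(), src, tgt).shape)
# torch.Size([2, 12, 32000])
```

The efficiency gains below follow from this split: the expensive LLM encodes the source once, while autoregressive decoding runs only through the small decoder.
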
**Key Features of LaMaTE**

- Enhanced Efficiency: Offers 2.4× to 6.5× faster decoding speeds.
- Reduced Memory Usage: Reduces KV cache memory consumption by 75% (illustrated below).
- Competitive Performance: Exhibits robust performance across diverse translation tasks.
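
The memory figure is easy to see with back-of-the-envelope arithmetic: during autoregressive decoding, keys and values are cached per layer, so caching a quarter as many layers stores a quarter of the cache. The sketch below uses the real Llama-3-8B attention geometry (32 layers, 8 grouped-query KV heads, head dimension 128), but the 8-layer lightweight decoder is a hypothetical stand-in, not LaMaTE's actual configuration:

```python
# Back-of-the-envelope KV-cache sizing to illustrate the reported 75% reduction.
# The 8-layer decoder with matching head geometry is an illustrative assumption.
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch=1, dtype_bytes=2):
    # K and V tensors per layer, each of shape [batch, kv_heads, seq_len, head_dim]
    return 2 * layers * kv_heads * head_dim * seq_len * batch * dtype_bytes

# Decoder-only Llama-3-8B baseline: 32 layers, 8 KV heads (GQA), head_dim 128
baseline = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=2048)

# Hypothetical 8-layer lightweight decoder caching the same per-layer geometry
lamate = kv_cache_bytes(layers=8, kv_heads=8, head_dim=128, seq_len=2048)

print(f"{baseline / 2**20:.0f} MiB -> {lamate / 2**20:.0f} MiB "
      f"({1 - lamate / baseline:.0%} reduction)")  # 256 MiB -> 64 MiB (75% reduction)
```
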
## Quick Start

For more detailed usage, please refer to the [GitHub repository](https://github.com/NiuTrans/LaMaTE).

**Note:** Our implementation is developed with transformers v4.39.2. We recommend installing this version (`pip install transformers==4.39.2`) for best compatibility.

To deploy LaMaTE, load the model with the `from_pretrained()` method and translate with `generate()`:

```python
from modeling_llama_seq2seq import LlamaCrossAttentionEncDec
from transformers import AutoTokenizer, AutoConfig

# Path to this model on the Hugging Face Hub or to a local checkpoint
model_name_or_path = "NiuTrans/LaMaTE"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
config = AutoConfig.from_pretrained(model_name_or_path, trust_remote_code=True)
model = LlamaCrossAttentionEncDec.from_pretrained(model_name_or_path, config=config)

prompt = "Translate the following text from English into Chinese.\nEnglish: The harder you work at it, the more progress you will make.\nChinese: "
input_ids = tokenizer(prompt, return_tensors="pt")
outputs_tokenized = model.generate(
    **input_ids,
    num_beams=5,
    do_sample=False,
)
outputs = tokenizer.batch_decode(outputs_tokenized, skip_special_tokens=True)
print(outputs)
```
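
The example decodes with beam search (`num_beams=5`, `do_sample=False`), the standard deterministic setup for MT; the usual Hugging Face `generate()` arguments, such as `max_new_tokens` to bound output length, should also apply here.
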

## Citation

```bibtex
@misc{luoyf2025lamate,
      title={Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation},
      author={Yingfeng Luo and Tong Zheng and Yongyu Mu and Bei Li and Qinghong Zhang and Yongqi Gao and Ziqiang Xu and Peinan Feng and Xiaoqian Liu and Tong Xiao and Jingbo Zhu},
      year={2025},
      eprint={2503.06594},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```