---
license: mit
datasets:
- NiuTrans/ComMT
language:
- en
- zh
- de
- cs
metrics:
- bleu
- comet
base_model:
- meta-llama/Meta-Llama-3-8B
pipeline_tag: translation
---

# LaMaTE

- **GitHub:** https://github.com/NiuTrans/LaMaTE/
- **Paper:** https://arxiv.org/abs/2503.06594

## Model Description

LaMaTE is a high-performance, efficient translation model built on Llama-3-8B. It uses a large language model (LLM) as the machine translation (MT) encoder, paired with a lightweight decoder. An adapter bridges the LLM representations to the decoder, and a two-stage training strategy is used to enhance both performance and efficiency.
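
The data flow is a straightforward encoder-adapter-decoder stack. The sketch below is a minimal illustration in plain PyTorch; the module names, layer counts, and sizes (`Adapter`, `LightweightDecoder`, an 8-layer decoder, etc.) are invented for illustration and are not the released implementation:

```python
# Minimal sketch of the LaMaTE-style data flow: LLM encoder -> adapter ->
# lightweight decoder. All module names, layer counts, and sizes below are
# illustrative assumptions, not the released implementation.
import torch
import torch.nn as nn

LLM_DIM, DEC_DIM, VOCAB, HEADS = 4096, 1024, 32000, 8  # illustrative sizes

class Adapter(nn.Module):
    """Bridges LLM hidden states into the decoder's representation space."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(LLM_DIM, DEC_DIM), nn.GELU(), nn.Linear(DEC_DIM, DEC_DIM)
        )

    def forward(self, enc_hidden):
        return self.proj(enc_hidden)

class LightweightDecoder(nn.Module):
    """A small Transformer decoder that cross-attends to the adapted states."""
    def __init__(self, num_layers=8):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DEC_DIM)
        layer = nn.TransformerDecoderLayer(d_model=DEC_DIM, nhead=HEADS,
                                           batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=num_layers)
        self.lm_head = nn.Linear(DEC_DIM, VOCAB)

    def forward(self, tgt_ids, memory):
        tgt = self.embed(tgt_ids)
        # Causal mask so each target position only attends to earlier positions
        t = tgt_ids.size(1)
        mask = torch.triu(torch.full((t, t), float("-inf")), diagonal=1)
        return self.lm_head(self.decoder(tgt, memory, tgt_mask=mask))

def forward_pass(llm_encoder, adapter, decoder, src_ids, tgt_ids):
    enc_hidden = llm_encoder(src_ids)  # [B, S, LLM_DIM], encoded once by the LLM
    memory = adapter(enc_hidden)       # [B, S, DEC_DIM], bridged to decoder space
    return decoder(tgt_ids, memory)    # [B, T, VOCAB] logits from the small decoder

# Smoke test with a stand-in "LLM" that just embeds tokens:
dummy_llm = nn.Embedding(VOCAB, LLM_DIM)
src = torch.randint(0, VOCAB, (2, 16))
tgt = torch.randint(0, VOCAB, (2, 12))
print(forward_pass(dummy_llm, Adapter(), LightweightDecoder(), src, tgt).shape)
# torch.Size([2, 12, 32000])
```

The efficiency gains below follow from this split: the expensive LLM encodes the source once, while autoregressive decoding runs only through the small decoder.
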
**Key Features of LaMaTE**

- Enhanced Efficiency: Offers 2.4× to 6.5× faster decoding speeds.
- Reduced Memory Usage: Reduces KV cache memory consumption by 75% (illustrated below).
- Competitive Performance: Exhibits robust performance across diverse translation tasks.
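
The memory figure is easy to see with back-of-the-envelope arithmetic: during autoregressive decoding, keys and values are cached per layer, so caching a quarter as many layers stores a quarter of the cache. The sketch below uses the real Llama-3-8B attention geometry (32 layers, 8 grouped-query KV heads, head dimension 128), but the 8-layer lightweight decoder is a hypothetical stand-in, not LaMaTE's actual configuration:

```python
# Back-of-the-envelope KV-cache sizing to illustrate the reported 75% reduction.
# The 8-layer decoder with matching head geometry is an illustrative assumption.
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch=1, dtype_bytes=2):
    # K and V tensors per layer, each of shape [batch, kv_heads, seq_len, head_dim]
    return 2 * layers * kv_heads * head_dim * seq_len * batch * dtype_bytes

# Decoder-only Llama-3-8B baseline: 32 layers, 8 KV heads (GQA), head_dim 128
baseline = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=2048)

# Hypothetical 8-layer lightweight decoder caching the same per-layer geometry
lamate = kv_cache_bytes(layers=8, kv_heads=8, head_dim=128, seq_len=2048)

print(f"{baseline / 2**20:.0f} MiB -> {lamate / 2**20:.0f} MiB "
      f"({1 - lamate / baseline:.0%} reduction)")  # 256 MiB -> 64 MiB (75% reduction)
```
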
## Quick Start

For more detailed usage, please refer to the [GitHub repository](https://github.com/NiuTrans/LaMaTE).

**Note:** Our implementation is developed with transformers v4.39.2. We recommend installing this version (`pip install transformers==4.39.2`) for best compatibility.

To deploy LaMaTE, load the model with the `from_pretrained()` method and translate with `generate()`:

```python
from modeling_llama_seq2seq import LlamaCrossAttentionEncDec
from transformers import AutoTokenizer, AutoConfig

# Path to this model on the Hugging Face Hub or to a local checkpoint
model_name_or_path = "NiuTrans/LaMaTE"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
config = AutoConfig.from_pretrained(model_name_or_path, trust_remote_code=True)
model = LlamaCrossAttentionEncDec.from_pretrained(model_name_or_path, config=config)

prompt = "Translate the following text from English into Chinese.\nEnglish: The harder you work at it, the more progress you will make.\nChinese: "
input_ids = tokenizer(prompt, return_tensors="pt")
outputs_tokenized = model.generate(
    **input_ids,
    num_beams=5,
    do_sample=False,
)
outputs = tokenizer.batch_decode(outputs_tokenized, skip_special_tokens=True)
print(outputs)
```
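
The example decodes with beam search (`num_beams=5`, `do_sample=False`), the standard deterministic setup for MT; the usual Hugging Face `generate()` arguments, such as `max_new_tokens` to bound output length, should also apply here.
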

## Citation

```bibtex
@misc{luoyf2025lamate,
      title={Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation},
      author={Yingfeng Luo and Tong Zheng and Yongyu Mu and Bei Li and Qinghong Zhang and Yongqi Gao and Ziqiang Xu and Peinan Feng and Xiaoqian Liu and Tong Xiao and Jingbo Zhu},
      year={2025},
      eprint={2503.06594},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```