
Training & Inference

Data Source

The dataset includes a small training set (10k), a large training set (100k), a validation set (500), and a test set (200), all in .jsonl format.

  • Download Link: Baidu Netdisk
  • Note: You can preprocess the data with preprocess.py, or directly use the preprocessed data in "data/processed_nltk_100k". A minimal sketch of reading the raw .jsonl files follows this list.
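
Each line of a .jsonl file is a single JSON object. Below is a minimal loading sketch, assuming hypothetical `src`/`tgt` field names; the actual keys in the released files may differ.

```python
import json

def load_jsonl(path):
    """Read one JSON object per line. "src"/"tgt" are assumed field names,
    not necessarily the keys used in the released files."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                example = json.loads(line)
                pairs.append((example["src"], example["tgt"]))
    return pairs

# Illustrative usage (the file name is hypothetical):
# pairs = load_jsonl("data/train_small.jsonl")
```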

Download the checkpoints and their respective config.yaml files, and put them under the directory "runs/train".

Preprocess the data

python preprocess.py -c config.yaml
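
The name of the preprocessed directory ("data/processed_nltk_100k") suggests NLTK word tokenization. The sketch below illustrates that kind of tokenization as an assumption about what preprocess.py does, not its actual code:

```python
import nltk
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)  # tokenizer models required by word_tokenize

sentence = "Attention is all you need."
tokens = [t.lower() for t in word_tokenize(sentence)]
print(tokens)  # ['attention', 'is', 'all', 'you', 'need', '.']
```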

Evaluation

Evaluate the models using greedy decoding or beam search. Performance is measured with BLEU-4.
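
For reference, BLEU-4 is the geometric mean of 1- to 4-gram precisions with a brevity penalty. A minimal sketch using NLTK's corpus_bleu (the evaluation scripts may use a different BLEU implementation or smoothing):

```python
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# Toy example: one reference per hypothesis, already tokenized.
references = [[["the", "cat", "sat", "on", "the", "mat"]]]
hypotheses = [["the", "cat", "is", "on", "the", "mat"]]

bleu4 = corpus_bleu(
    references,
    hypotheses,
    weights=(0.25, 0.25, 0.25, 0.25),                 # uniform 1- to 4-gram weights
    smoothing_function=SmoothingFunction().method1,   # avoids zero n-gram counts
)
print(f"BLEU-4: {bleu4:.4f}")
```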

Evaluate Transformer

python evaluate_transformer.py -c runs/train/transformer/MHA/config.yaml --ckpt runs/train/transformer/MHA/best_model.pt --save_path runs/evaluate --eval_method beam  

Evaluate RNN

python evaluate_rnn.py -c runs/train/rnn/config.yaml --ckpt runs/train/rnn/best_model.pt  --save_path runs/evaluate --eval_method beam  
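
For intuition on the --eval_method options: greedy search keeps only the single most probable token at each step, while beam search keeps the top-k partial hypotheses. A model-agnostic sketch of beam search (not the decoder implemented in the evaluation scripts):

```python
def beam_search(step_log_probs, bos, eos, beam_size=5, max_len=50):
    """step_log_probs(prefix) -> {next_token: log_prob}; a generic sketch only."""
    beams = [([bos], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == eos:
                candidates.append((seq, score))  # keep finished hypotheses as-is
                continue
            for tok, lp in step_log_probs(seq).items():
                candidates.append((seq + [tok], score + lp))
        # retain only the top-k hypotheses by cumulative log-probability
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        if all(seq[-1] == eos for seq, _ in beams):
            break
    return beams[0][0]
```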

Training

Training Transformer (MHA, MQA, GQA)

python train_transformer.py -c runs/train/transformer/MHA/config.yaml
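
The three attention variants differ only in the number of key/value heads: MHA has one K/V head per query head, MQA shares a single K/V head across all query heads, and GQA shares each K/V head within a group of query heads. A minimal PyTorch sketch of the projections (illustrative only, not the code in this repository):

```python
import torch.nn as nn

class GroupedQueryAttention(nn.Module):
    """num_kv_heads == num_heads -> MHA, == 1 -> MQA, in between -> GQA.
    Illustrative sketch, not the repository's implementation."""
    def __init__(self, d_model, num_heads, num_kv_heads):
        super().__init__()
        assert num_heads % num_kv_heads == 0
        self.h, self.kv, self.d = num_heads, num_kv_heads, d_model // num_heads
        self.q_proj = nn.Linear(d_model, num_heads * self.d)
        self.k_proj = nn.Linear(d_model, num_kv_heads * self.d)
        self.v_proj = nn.Linear(d_model, num_kv_heads * self.d)
        self.o_proj = nn.Linear(num_heads * self.d, d_model)

    def forward(self, x):
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.h, self.d).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.kv, self.d).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.kv, self.d).transpose(1, 2)
        # repeat K/V so each group of query heads attends to its shared K/V head
        k = k.repeat_interleave(self.h // self.kv, dim=1)
        v = v.repeat_interleave(self.h // self.kv, dim=1)
        att = (q @ k.transpose(-2, -1)) / self.d ** 0.5
        out = att.softmax(dim=-1) @ v
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))
```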

Training RNN (dot, general, concat)

python train_rnn.py -c runs/train/rnn/config.yaml
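
The RNN attention variants correspond to the Luong-style alignment functions dot, general, and concat listed in Table 2. A hedged sketch of the three score functions (parameter names are assumptions, not the repository's code):

```python
import torch
import torch.nn as nn

class LuongScore(nn.Module):
    """score(h_t, h_s) for dot, general (multiplicative), and concat (additive)."""
    def __init__(self, hidden, method="dot"):
        super().__init__()
        self.method = method
        if method == "general":
            self.W = nn.Linear(hidden, hidden, bias=False)
        elif method == "concat":
            self.W = nn.Linear(2 * hidden, hidden, bias=False)
            self.v = nn.Linear(hidden, 1, bias=False)

    def forward(self, h_t, h_s):
        # h_t: (batch, hidden) decoder state; h_s: (batch, src_len, hidden) encoder states
        if self.method == "dot":        # score = h_s . h_t
            return torch.bmm(h_s, h_t.unsqueeze(2)).squeeze(2)
        if self.method == "general":    # score = (W h_s) . h_t
            return torch.bmm(self.W(h_s), h_t.unsqueeze(2)).squeeze(2)
        # concat: score = v . tanh(W [h_t; h_s])
        h_t_exp = h_t.unsqueeze(1).expand(-1, h_s.size(1), -1)
        return self.v(torch.tanh(self.W(torch.cat([h_t_exp, h_s], dim=-1)))).squeeze(2)
```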

Main Results

Table 1: Performance of Transformer Variants.

| Model Variant | Decoding Strategy | BLEU-4 |
|---|---|---|
| Transformer (MHA) | Greedy Search | 13.61 |
| Transformer (MHA) | Beam Search | 14.56 |
| Transformer (MQA) | Greedy Search | 11.00 |
| Transformer (MQA) | Beam Search | 12.10 |
| Transformer (GQA) | Greedy Search | 9.57 |
| Transformer (GQA) | Beam Search | 10.80 |

Table 2: Performance of RNN Variants

| Alignment Function | Decoding Strategy | BLEU-4 |
|---|---|---|
| Dot Product (dot) | Greedy Search | 8.95 |
| Dot Product (dot) | Beam Search | 9.44 |
| Multiplicative (general) | Greedy Search | 9.20 |
| Multiplicative (general) | Beam Search | 9.88 |
| Additive (concat) | Greedy Search | 10.44 |
| Additive (concat) | Beam Search | 10.09 |