
CN_EN_Translation_Model

	## Training & Inference

	### Data Source

	The dataset consists of a small training set (10k), a large training set (100k), a validation set (500), and a test set (200), all in `.jsonl` format.

	* Download Link: [Baidu Netdisk](https://pan.baidu.com/s/1TuaGjNvTESt9ZdEQy1BogA?pwd=u9i2)
	* Note: You can preprocess the data with `preprocess.py` or use the already preprocessed data in `data/processed_nltk_100k` directly.
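Before preprocessing, it can help to look at a few raw records. The following is a minimal sketch of a `.jsonl` reader, assuming the usual convention of one JSON object per line. The field names inside each record are not documented here, so the sketch does not assume any.

```python
import json
from pathlib import Path
from typing import Iterator


def read_jsonl(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a .jsonl file."""
    with Path(path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                yield json.loads(line)
            except json.JSONDecodeError as err:
                # Report the offending line so a broken download is easy to spot.
                raise ValueError(f"{path}:{line_no}: invalid JSON ({err.msg})") from err
```

For example, `next(read_jsonl(path))` returns the first record of whichever file `path` points to, which is a quick way to see the keys that the data actually uses.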

	Download the checkpoints and their respective `config.yaml` files, and place them under the `runs/train` directory.

	* Download Link: https://huggingface.co/soughtlin/CN_EN_Translation_Model
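After downloading, a quick check can confirm that the files landed where the commands in this README expect them. This is a sketch only: the four relative paths are taken from the evaluation commands in this README, and the helper name `missing_files` is hypothetical. Other variants (for example MQA or GQA) may use sibling directories that are not listed here.

```python
from pathlib import Path
from typing import List

# Relative paths under runs/train, as used by the evaluation commands in this README.
EXPECTED_FILES = [
    "transformer/MHA/config.yaml",
    "transformer/MHA/best_model.pt",
    "rnn/config.yaml",
    "rnn/best_model.pt",
]


def missing_files(root: str = "runs/train") -> List[str]:
    """Return the expected files that are not present under root."""
    base = Path(root)
    return [rel for rel in EXPECTED_FILES if not (base / rel).is_file()]
```

Calling `missing_files()` from the repository root returns an empty list when every listed checkpoint and config file is in place.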

	Preprocess the data:

	```bash
	python preprocess.py -c config.yaml
	```

	### Evaluation

	Evaluate the model with greedy decoding or beam search. Performance is measured by BLEU-4.
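To illustrate how the two decoding strategies differ, here is a toy sketch that is independent of the repository's code. It assumes a hypothetical `step` function that maps a token prefix to next-token probabilities, and it ranks hypotheses by raw log-probability without length normalization, which the actual evaluation scripts may or may not apply.

```python
import math
from typing import Callable, Dict, List, Tuple

EOS = "</s>"
Hypothesis = Tuple[Tuple[str, ...], float]  # (tokens, log-probability)
StepFn = Callable[[Tuple[str, ...]], Dict[str, float]]


def greedy_decode(step: StepFn, max_len: int = 10) -> Hypothesis:
    """Pick the single most probable token at every step."""
    prefix: Tuple[str, ...] = ()
    score = 0.0
    for _ in range(max_len):
        probs = step(prefix)
        token = max(probs, key=lambda t: probs[t])
        score += math.log(probs[token])
        prefix += (token,)
        if token == EOS:
            break
    return prefix, score


def beam_search(step: StepFn, beam_size: int = 2, max_len: int = 10) -> Hypothesis:
    """Keep the beam_size best partial hypotheses at every step."""
    beams: List[Hypothesis] = [((), 0.0)]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        candidates: List[Hypothesis] = []
        for prefix, score in beams:
            for token, p in step(prefix).items():
                candidates.append((prefix + (token,), score + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:beam_size]:
            # Hypotheses that emitted EOS are set aside; the rest keep growing.
            (finished if prefix[-1] == EOS else beams).append((prefix, score))
        if not beams:
            break
    return max(finished + beams, key=lambda c: c[1])


def toy_step(prefix: Tuple[str, ...]) -> Dict[str, float]:
    """Hypothetical model: 'a' looks best first, but 'b z' is the better sequence."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.5, "y": 0.5},
        ("b",): {"z": 0.9, "w": 0.1},
    }
    return table.get(prefix, {EOS: 1.0})
```

On this toy model, `greedy_decode(toy_step)` commits to `a` and ends with total probability 0.30, while `beam_search(toy_step, beam_size=2)` keeps `b` alive and finds `b z` with probability 0.36.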

	Evaluate the Transformer:

	```bash
	python evaluate_transformer.py -c runs/train/transformer/MHA/config.yaml --ckpt runs/train/transformer/MHA/best_model.pt --save_path runs/evaluate --eval_method beam
	```

	Evaluate the RNN:

	```bash
	python evaluate_rnn.py -c runs/train/rnn/config.yaml --ckpt runs/train/rnn/best_model.pt --save_path runs/evaluate --eval_method beam
	```
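For reference, BLEU-4 combines clipped 1-gram to 4-gram precisions with a brevity penalty. The sketch below is a plain-Python corpus-level version, assuming one reference per hypothesis, pre-tokenized input, and no smoothing. The evaluation scripts may tokenize or smooth differently, so scores from this sketch are not guaranteed to match the tables in this README.

```python
import math
from collections import Counter
from typing import List, Sequence


def ngrams(tokens: Sequence[str], n: int) -> Counter:
    """Count all n-grams of length n in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu4(hyps: List[List[str]], refs: List[List[str]]) -> float:
    """Corpus-level BLEU-4 (0 to 100), one reference per hypothesis, no smoothing."""
    matches = [0] * 4
    totals = [0] * 4
    hyp_len = ref_len = 0
    for hyp, ref in zip(hyps, refs):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, 5):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clipping: an n-gram is credited at most as often as it occurs in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU = 0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / 4
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)
```

Because the score is computed over the whole corpus, n-gram counts are summed across all sentences before the precisions are taken, rather than averaging per-sentence scores.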

	### Training

	Train the Transformer (MHA, MQA, or GQA):

	```bash
	python train_transformer.py -c runs/train/transformer/MHA/config.yaml
	```
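The three Transformer variants differ in how many key/value heads the query heads share: multi-head attention (MHA) gives every query head its own key/value head, multi-query attention (MQA) shares a single key/value head across all query heads, and grouped-query attention (GQA) sits in between. The sketch below shows only this head mapping. The function names and the head counts in the usage note are hypothetical, and the values actually used are set in each variant's `config.yaml`.

```python
from typing import List


def kv_head_for_query(query_head: int, n_heads: int, n_kv_heads: int) -> int:
    """Index of the key/value head that a given query head attends with."""
    if n_heads % n_kv_heads != 0:
        raise ValueError("n_heads must be divisible by n_kv_heads")
    group_size = n_heads // n_kv_heads  # query heads per key/value head
    return query_head // group_size


def head_mapping(n_heads: int, n_kv_heads: int) -> List[int]:
    """Key/value head index for every query head."""
    return [kv_head_for_query(q, n_heads, n_kv_heads) for q in range(n_heads)]
```

With 8 query heads, `head_mapping(8, 8)` is the MHA case (each query head has its own key/value head), `head_mapping(8, 1)` is MQA (all zeros), and `head_mapping(8, 2)` is a GQA setting in which the first four query heads share key/value head 0 and the last four share head 1.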

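	Train the RNN (dot, general, or concat attention):

	```bash
	# Assumed script and config names, mirroring evaluate_rnn.py and runs/train/rnn/config.yaml above
	python train_rnn.py -c runs/train/rnn/config.yaml
	```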
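The RNN variants reported in Table 2 differ in their alignment (attention score) function. The sketch below assumes the Luong-style definitions that the names dot, general, and concat usually refer to, where `h_t` is the decoder state and `h_s` is one encoder state. The parameters `W` and `v` stand in for learned weights, and the repository's implementation may differ in detail.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def matvec(m: Matrix, v: Vector) -> Vector:
    return [dot(row, v) for row in m]


def score_dot(h_t: Vector, h_s: Vector) -> float:
    """dot: h_t . h_s"""
    return dot(h_t, h_s)


def score_general(h_t: Vector, h_s: Vector, W: Matrix) -> float:
    """general (multiplicative): h_t . (W h_s)"""
    return dot(h_t, matvec(W, h_s))


def score_concat(h_t: Vector, h_s: Vector, W: Matrix, v: Vector) -> float:
    """concat (additive): v . tanh(W [h_t; h_s])"""
    hidden = [math.tanh(x) for x in matvec(W, h_t + h_s)]
    return dot(v, hidden)


def softmax(scores: Vector) -> Vector:
    """Turn alignment scores over source positions into attention weights."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

In all three cases the scores over the source positions are normalized with a softmax, and the resulting weights are used to average the encoder states into a context vector.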


	### Main Results

	Table 1: Performance of Transformer Variants.

	| Model Variant | Decoding Strategy | BLEU Score |
	| --------------------- | ----------------- | ---------- |
	| Transformer (MHA) | Greedy Search | 13.61 |
	| | Beam Search | 14.56 |
	| Transformer (MQA) | Greedy Search | 11.00 |
	| | Beam Search | 12.10 |
	| Transformer (GQA) | Greedy Search | 9.57 |
	| | Beam Search | 10.80 |


	Table 2: Performance of RNN Variants.

	| Alignment Function | Decoding Strategy | BLEU Score |
	| ------------------------ | ----------------- | ---------- |
	| Dot Product (dot) | Greedy Search | 8.95 |
	| | Beam Search | 9.44 |
	| Multiplicative (general) | Greedy Search | 9.20 |
	| | Beam Search | 9.88 |
	| Additive (concat) | Greedy Search | 10.44 |
	| | Beam Search | 10.09 |
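
To compare the two decoding strategies across both tables, the small sketch below computes the BLEU difference between beam search and greedy search. The scores are copied from Tables 1 and 2, and the helper name `beam_gain` is hypothetical.

```python
from typing import Dict, Tuple

# (greedy BLEU, beam BLEU), copied from Tables 1 and 2.
RESULTS: Dict[str, Tuple[float, float]] = {
    "Transformer (MHA)": (13.61, 14.56),
    "Transformer (MQA)": (11.00, 12.10),
    "Transformer (GQA)": (9.57, 10.80),
    "RNN dot": (8.95, 9.44),
    "RNN general": (9.20, 9.88),
    "RNN concat": (10.44, 10.09),
}


def beam_gain(model: str) -> float:
    """BLEU gained by switching from greedy to beam search (negative = loss)."""
    greedy, beam = RESULTS[model]
    return round(beam - greedy, 2)


if __name__ == "__main__":
    for name in RESULTS:
        print(f"{name:20s} {beam_gain(name):+.2f}")
```

Beam search improves BLEU in every configuration except the additive (concat) RNN, where it is 0.35 points below greedy search.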