## Training & Inference

### Data Source

The dataset includes Small Training (100k), Large Training (10k), Validation (500), and Test (200) sets in `.jsonl` format.

  * **Download Link:** [Baidu Netdisk](https://pan.baidu.com/s/1TuaGjNvTESt9ZdEQy1BogA?pwd=u9i2)
  * **Note:** Preprocess the raw data with `preprocess.py`, or use the already processed data in `data/processed_nltk_100k` directly.
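
Because the sets are `.jsonl` files, each line holds one JSON record. A minimal sketch for inspecting a few records before preprocessing is given here; the helper name `read_jsonl` is illustrative and not part of this repository, and no field names are assumed:

```python
import json
from pathlib import Path


def read_jsonl(path, limit=None):
    """Yield one parsed JSON record per non-empty line of a .jsonl file."""
    count = 0
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            # Stop once the requested number of records has been produced.
            if limit is not None and count >= limit:
                break
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            yield json.loads(line)
            count += 1
```

For example, `list(read_jsonl("path/to/file.jsonl", limit=3))` returns the first three records, which shows the keys the files actually use.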

Download the checkpoints and their respective `config.yaml` files, and place them under the `runs/train` directory.

  * **Download Link:** https://huggingface.co/soughtlin/CN_EN_Translation_Model

Preprocess the data:

```bash
python preprocess.py -c config.yaml
```

### Evaluation

Evaluate the model using **greedy decoding** or **beam search**. Performance is measured using **BLEU-4**.
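
For reference, BLEU-4 combines clipped 1- to 4-gram precisions with a brevity penalty. The following is a minimal corpus-level sketch with one reference per hypothesis and no smoothing. It is not the scoring code used by the evaluation scripts, whose tokenization and smoothing may differ, and the function names are illustrative:

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu4(hypotheses, references):
    """Corpus-level BLEU-4 on a 0-100 scale (single reference, no smoothing)."""
    matched = [0] * 4
    total = [0] * 4
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, 5):
            hyp_counts = ngram_counts(hyp, n)
            ref_counts = ngram_counts(ref, n)
            # Counter intersection clips each n-gram count at its reference count.
            matched[n - 1] += sum((hyp_counts & ref_counts).values())
            total[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matched) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / 4
    # The brevity penalty only applies when the output is not longer than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)
```

Both arguments are lists of token lists, for example `corpus_bleu4([["the", "cat", "sat", "down"]], [["the", "cat", "sat", "down"]])`.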

Evaluate the Transformer:

```bash
python evaluate_transformer.py -c runs/train/transformer/MHA/config.yaml --ckpt runs/train/transformer/MHA/best_model.pt --save_path runs/evaluate --eval_method beam
```

Evaluate the RNN:

```bash
python evaluate_rnn.py -c runs/train/rnn/config.yaml --ckpt runs/train/rnn/best_model.pt --save_path runs/evaluate --eval_method beam
```
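
To illustrate how the two decoding strategies differ, here is a minimal sketch on a hypothetical toy model. `toy_next_token_probs` and its probabilities are invented for illustration, and the beam search shown uses no length normalization, which may not match the evaluation scripts:

```python
import math


def toy_next_token_probs(prefix):
    """Hypothetical toy model: next-token probabilities for a given prefix."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.55, "y": 0.45},
        ("b",): {"x": 0.9, "y": 0.1},
    }
    return table.get(prefix, {"<eos>": 1.0})


def greedy_decode(next_probs, max_len=10):
    """Pick the single most probable token at every step."""
    seq, score = (), 0.0
    for _ in range(max_len):
        token, prob = max(next_probs(seq).items(), key=lambda kv: kv[1])
        seq, score = seq + (token,), score + math.log(prob)
        if token == "<eos>":
            break
    return seq, score


def beam_decode(next_probs, beam_size=2, max_len=10):
    """Keep the beam_size best partial hypotheses by total log-probability."""
    beams = [((), 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == "<eos>":
                candidates.append((seq, score))  # finished hypotheses carry over
                continue
            for token, prob in next_probs(seq).items():
                candidates.append((seq + (token,), score + math.log(prob)))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        if all(seq[-1] == "<eos>" for seq, _ in beams):
            break
    return beams[0]


greedy_seq, greedy_score = greedy_decode(toy_next_token_probs)
beam_seq, beam_score = beam_decode(toy_next_token_probs)
print(greedy_seq, round(math.exp(greedy_score), 2))
print(beam_seq, round(math.exp(beam_score), 2))
```

In this toy case greedy decoding commits to `a` because it is the most probable first token, while beam search keeps `b` alive and ends with a more probable full sequence.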

### Training

Train the Transformer (MHA, MQA, or GQA attention):

```bash
python train_transformer.py -c runs/train/transformer/MHA/config.yaml
```
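
The three attention variants differ in how many key/value heads the query heads share: multi-head attention (MHA) gives each query head its own key/value head, multi-query attention (MQA) shares one across all query heads, and grouped-query attention (GQA) shares one per group. A minimal sketch of that mapping follows; the head counts and function names are illustrative and are not taken from this repository's configs:

```python
def kv_head_for_query_head(query_head, num_query_heads, num_kv_heads):
    """Return the index of the key/value head used by a given query head."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("num_query_heads must be divisible by num_kv_heads")
    group_size = num_query_heads // num_kv_heads
    # Consecutive query heads are assigned to the same key/value head.
    return query_head // group_size


def kv_mapping(num_query_heads, num_kv_heads):
    """List the key/value head index for every query head."""
    return [kv_head_for_query_head(h, num_query_heads, num_kv_heads)
            for h in range(num_query_heads)]


# Illustrative setting with 8 query heads.
print("MHA:", kv_mapping(8, 8))  # one K/V head per query head
print("GQA:", kv_mapping(8, 2))  # two groups of four query heads
print("MQA:", kv_mapping(8, 1))  # a single shared K/V head
```

Fewer key/value heads mean fewer key/value projection parameters and a smaller cache at inference time.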

Train the RNN (dot, general, or concat alignment function):

```bash
python train_rnn.py -c runs/train/rnn/config.yaml
```


### Main Results

**Table 1: Performance of Transformer Variants**

| Model Variant         | Decoding Strategy | BLEU Score |
| --------------------- | ----------------- | ---------- |
| Transformer (MHA)     | Greedy Search     | 13.61      |
|                       | Beam Search       | **14.56**  |
| Transformer (MQA)     | Greedy Search     | 11.00      |
|                       | Beam Search       | 12.10      |
| Transformer (GQA)     | Greedy Search     | 9.57       |
|                       | Beam Search       | 10.80      |


**Table 2: Performance of RNN Variants**

| Alignment Function       | Decoding Strategy | BLEU Score |
| ------------------------ | ----------------- | ---------- |
| Dot Product (dot)        | Greedy Search     | 8.95       |
|                          | Beam Search       | 9.44       |
| Multiplicative (general) | Greedy Search     | 9.20       |
|                          | Beam Search       | 9.88       |
| Additive (concat)        | Greedy Search     | **10.44**  |
|                          | Beam Search       | 10.09      |
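
The alignment functions named in Table 2 score how well a decoder state `h_t` matches an encoder state `h_s`. A minimal sketch of the three standard forms in plain Python follows; the parameter names `W_a` and `v_a` are assumptions, and the repository's implementation may differ in details such as bias terms or separate projection matrices:

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, vector) for row in matrix]


def score_dot(h_t, h_s):
    """Dot product: h_t . h_s"""
    return dot(h_t, h_s)


def score_general(h_t, h_s, W_a):
    """Multiplicative (general): h_t . (W_a h_s)"""
    return dot(h_t, matvec(W_a, h_s))


def score_concat(h_t, h_s, W_a, v_a):
    """Additive (concat): v_a . tanh(W_a [h_t; h_s])"""
    hidden = [math.tanh(x) for x in matvec(W_a, h_t + h_s)]
    return dot(v_a, hidden)


def softmax(scores):
    """Turn alignment scores into attention weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: one decoder state scored against two encoder states.
h_t = [1.0, 0.0]
encoder_states = [[1.0, 0.0], [0.0, 1.0]]
weights = softmax([score_dot(h_t, h_s) for h_s in encoder_states])
print(weights)
```

The dot variant has no parameters, the general variant adds one matrix, and the concat variant adds a matrix and a vector.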