soughtlin/CN_EN_Translation_Model

soughtlin committed 26 days ago

Commit 3d9ab29 · verified · 1 Parent(s): a0994a7

Create README.md

Files changed (1)

README.md +77 -0

README.md ADDED

	@@ -0,0 +1,77 @@

+## Training & Inference
+### Data Source
+The dataset includes Small Training (10k), Large Training (100k), Validation (500), and Test (200) sets in `.jsonl` format.
+  * **Download Link:** [Baidu Netdisk](https://pan.baidu.com/s/1TuaGjNvTESt9ZdEQy1BogA?pwd=u9i2).
+  * **Note:** If resources are limited, you may use 10k samples from the Small Training Set, though using the Large Training Set is encouraged.
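
A minimal sketch of reading one of the `.jsonl` splits, assuming each line holds a single JSON object. The helper name `read_jsonl` and the `limit` parameter are illustrative and not taken from the repository; the README does not document the record keys, so inspect a record before relying on any field name.

```python
import json
from pathlib import Path


def read_jsonl(path, limit=None):
    """Read up to `limit` records from a JSON Lines file (one JSON object per line)."""
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            records.append(json.loads(line))
            if limit is not None and len(records) >= limit:
                break  # stop early, e.g. to take a 10k subset
    return records
```

Passing `limit=10_000` is one way to take a fixed-size subset when resources are limited.
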
+Download the checkpoints and their respective `config.yaml` files, and place them under the `runs/train` directory.
+  * **Download Link:** https://huggingface.co/soughtlin/CN_EN_Translation_Model
+Preprocess the data:
+```bash
+python preprocess.py -c config.yaml
+```
+### Evaluation
+Evaluate the model using **greedy decoding** or **beam search**. Performance is measured using **BLEU-4**.
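
To illustrate how the two decoding strategies differ, here is a toy sketch that is not the repository's decoder. `beam_search`, `toy_step`, and the `TOY` distribution are hypothetical names made up for this example; greedy decoding is treated as the special case of beam width 1, and no length normalization is applied.

```python
import math


def beam_search(step_fn, beam_size, max_len, bos="<s>", eos="</s>"):
    """Return (tokens, log_prob) of the best sequence found.

    step_fn(prefix) returns a dict mapping each candidate next token to its
    log-probability. beam_size=1 reduces to greedy decoding.
    """
    beams = [([bos], 0.0)]  # (tokens, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for tok, logp in step_fn(tokens).items():
                candidates.append((tokens + [tok], score + logp))
        # Keep only the best `beam_size` expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == eos:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    # Fall back to unfinished hypotheses if nothing reached EOS.
    return max(finished or beams, key=lambda c: c[1])


# Toy next-token probabilities; any prefix not listed ends the sentence.
TOY = {
    ("<s>",): {"a": 0.6, "b": 0.4},
    ("<s>", "a"): {"x": 0.5, "y": 0.5},
    ("<s>", "b"): {"z": 0.9, "w": 0.1},
}


def toy_step(prefix):
    probs = TOY.get(tuple(prefix), {"</s>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


greedy_tokens, greedy_score = beam_search(toy_step, beam_size=1, max_len=5)
beam_tokens, beam_score = beam_search(toy_step, beam_size=2, max_len=5)
```

In this toy setup greedy decoding commits to `a` (probability 0.6) and ends with total probability 0.30, while a beam of width 2 keeps `b` alive and finds `b z` with probability 0.36.
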
+Evaluate the Transformer:
+```bash
+python evaluate_transformer.py -c runs/train/transformer/MHA/config.yaml --ckpt runs/train/transformer/MHA/best_model.pt --save_path runs/evaluate --eval_method beam
+```
+Evaluate the RNN:
+```bash
+python evaluate_rnn.py -c runs/train/rnn/config.yaml --ckpt runs/train/rnn/best_model.pt --save_path runs/evaluate --eval_method beam
+```
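
For reference, BLEU-4 combines clipped 1- to 4-gram precisions with a brevity penalty. The sketch below is a plain corpus-level version with one reference per hypothesis and no smoothing. It is an assumption about the metric's general form, not the scoring code used by the evaluation scripts, whose tokenization and smoothing are not documented here; `corpus_bleu4` is a hypothetical name.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu4(hypotheses, references):
    """Corpus-level BLEU-4 (0 to 100) for parallel lists of token lists."""
    matches = [0] * 4
    totals = [0] * 4
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, 5):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clipping: an n-gram is credited at most as often as it occurs in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU 0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / 4
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)
```

Scores from this sketch are only comparable to the tables in this README if tokenization and smoothing match, which should be checked against the evaluation scripts.
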
+### Training
+Train a Transformer (MHA, MQA, or GQA):
+```bash
+python train_transformer.py -c runs/train/transformer/MHA/config.yaml
+```
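
The three Transformer variants differ in how many key/value heads the query heads share: multi-head attention (MHA) gives every query head its own key/value head, multi-query attention (MQA) shares one across all query heads, and grouped-query attention (GQA) shares one per group. The sketch below only shows that head mapping. The function names and the head counts (8 query heads, 2 groups) are illustrative assumptions, not values from the repository's configs.

```python
def kv_head_index(query_head, num_query_heads, num_kv_heads):
    """Map a query head to the key/value head it attends with."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("num_query_heads must be divisible by num_kv_heads")
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


def kv_mapping(num_query_heads, num_kv_heads):
    """List the key/value head used by each query head."""
    return [kv_head_index(h, num_query_heads, num_kv_heads)
            for h in range(num_query_heads)]


print(kv_mapping(8, 8))  # MHA: [0, 1, 2, 3, 4, 5, 6, 7]
print(kv_mapping(8, 1))  # MQA: [0, 0, 0, 0, 0, 0, 0, 0]
print(kv_mapping(8, 2))  # GQA: [0, 0, 0, 0, 1, 1, 1, 1]
```
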
+Train an RNN (dot, general, or concat attention):
+```bash
+# Assumes the RNN training script is named train_rnn.py, by analogy with evaluate_rnn.py.
+python train_rnn.py -c runs/train/rnn/config.yaml
+```
+### Main Results
+**Table 1: Performance of Transformer Variants.**
+| Model Variant         | Decoding Strategy | BLEU Score |
+| --------------------- | ----------------- | ---------- |
+| **Transformer (MHA)** | Greedy Search     | 13.61      |
+|                       | Beam Search       | **14.56**  |
+| Transformer (MQA)     | Greedy Search     | 11.00      |
+|                       | Beam Search       | 12.10      |
+| Transformer (GQA)     | Greedy Search     | 9.57       |
+|                       | Beam Search       | 10.80      |
+**Table 2: Performance of RNN Variants.**
+| Alignment Function       | Decoding Strategy | BLEU Score |
+| ------------------------ | ----------------- | ---------- |
+| **Dot Product (dot)**    | Greedy Search     | 8.95       |
+|                          | Beam Search       | 9.44       |
+| Multiplicative (general) | Greedy Search     | 9.20       |
+|                          | Beam Search       | 9.88       |
+| Additive (concat)        | Greedy Search     | 10.44      |
+|                          | Beam Search       | 10.09      |
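
The alignment function names in Table 2 match the usual Luong-style attention scores. Assuming that is what they refer to, the three scores between a decoder state `h_t` and an encoder state `h_s` can be sketched in plain Python as follows; the repository's actual implementation, parameter shapes, and names may differ.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, x) for row in W]


def score_dot(h_t, h_s):
    # dot: h_t^T h_s
    return dot(h_t, h_s)


def score_general(h_t, h_s, W):
    # general (multiplicative): h_t^T W h_s
    return dot(h_t, matvec(W, h_s))


def score_concat(h_t, h_s, W, v):
    # concat (additive): v^T tanh(W [h_t; h_s])
    hidden = [math.tanh(x) for x in matvec(W, h_t + h_s)]
    return dot(v, hidden)
```

The `dot` variant has no learned parameters, `general` adds a weight matrix `W`, and `concat` adds both `W` and a vector `v`.
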