ubaada/original-transformer


ubaada committed on Nov 11, 2024

Commit 09ea48b · verified · 1 Parent(s): 84d0908

Update README.md

Files changed (1)

README.md +13 -2

README.md CHANGED

@@ -30,14 +30,25 @@ tokenizer.decode(output[0], skip_special_tokens=True, clean_up_tokenization_spac
 |----------------------|-------------------------------------------------------------------------------------------------|
 | Dataset              | WMT14-de-en                                                                                     |
 | Translation Pairs    | 4.5M (83M tokens total)                                                                         |
-| Epochs               | 25                                                                                              |
+| Epochs               | 24                                                                                              |
 | Batch Size           | 16                                                                                              |
 | Accumulation Batch   | 8                                                                                               |
 | Effective Batch Size | 128 (16 * 8)                                                                                    |
 | Training Script      | [train.py](https://github.com/ubaada/scratch-transformer/blob/main/train.py)             |
 | Optimiser            | Adam (learning rate = 0.0001)                                                                   |
 | Loss Type            | Cross Entropy |
-| Final Test Loss      | 1.9 |
+| Final Test Loss      | 1.87 |
 | GPU.                 | RTX 4070 (12GB) |
+<p align="center" style="width:500px;">
+  <img src="https://cdn-uploads.huggingface.co/production/uploads/62a7d1e152aa8695f9209345/0p4eEHiYFaeaibjk_Rf1y.png" />
+</p>
+## Results
+<p align="center" style="width:500px;">
+  <img src="https://cdn-uploads.huggingface.co/production/uploads/62a7d1e152aa8695f9209345/Gip1Ox-M1_z3qdafGGh3-.png" />
+</p>
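
Note on the "Effective Batch Size | 128 (16 * 8)" row: the table implies gradient accumulation, where gradients from 8 micro-batches of 16 are combined before one optimiser step. The sketch below is not taken from `train.py`; it is a toy, pure-Python illustration under stated assumptions (a one-parameter model `w * x`, a mean-squared-error loss instead of cross entropy, made-up data, and hypothetical helper names `mse_grad` and `accumulated_grad`). It only shows why averaging 8 equal-sized micro-batch gradients matches the gradient of a single batch of 128 when the loss is a mean.

```python
# Toy illustration of gradient accumulation (assumptions: 1-parameter model,
# MSE loss, synthetic data). Not the author's training code.

MICRO_BATCH_SIZE = 16      # "Batch Size" in the table
ACCUMULATION_STEPS = 8     # "Accumulation Batch" in the table
EFFECTIVE_BATCH_SIZE = MICRO_BATCH_SIZE * ACCUMULATION_STEPS  # 16 * 8


def mse_grad(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, micro_batch_size, accumulation_steps):
    """Accumulate micro-batch gradients, each scaled by 1 / accumulation_steps."""
    total = 0.0
    for step in range(accumulation_steps):
        start = step * micro_batch_size
        end = start + micro_batch_size
        total += mse_grad(w, xs[start:end], ys[start:end]) / accumulation_steps
    return total


# Synthetic data: 128 points on the line y = 3x (made up for this example).
xs = [i / EFFECTIVE_BATCH_SIZE for i in range(EFFECTIVE_BATCH_SIZE)]
ys = [3.0 * x for x in xs]
w = 0.5

full = mse_grad(w, xs, ys)  # gradient over all 128 samples at once
accum = accumulated_grad(w, xs, ys, MICRO_BATCH_SIZE, ACCUMULATION_STEPS)
print(EFFECTIVE_BATCH_SIZE, abs(full - accum) < 1e-9)
```

In a real training loop the optimiser step and gradient reset would run once per 8 micro-batches. Whether `train.py` scales each micro-batch loss by the accumulation count is not stated in this diff.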