Update README.md
README.md
@@ -36,6 +36,16 @@ Computer Games by Mille Mei Zhen Loo & Gert Luzkov.
 
 ## Model architecture
 
+| **Component**                     | **Value**                                |
+|-----------------------------------|------------------------------------------|
+| Context window size               | 256                                      |
+| Transformer layers                | 4                                        |
+| Attention heads                   | 1                                        |
+| Transformer feedforward dimension | 176                                      |
+| Loss function                     | Binary Cross Entropy (BCEWithLogitsLoss) |
+| Optimiser                         | AdamW (learning rate = 10<sup>-4</sup>)  |
+| Scheduler                         | StepLR (gamma = 0.5, step size = 10)     |
+| Batch size                        | 128                                      |
 
 ## Data
 
@@ -44,6 +54,7 @@ The input data used for this model was the [Context_window_256](https://huggingf
 ## Model testing
 
 
+
 ## Recommendations for further usage
 
 
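The hyperparameters in the added table can be collected into a small configuration object, which also makes the scheduler row easier to read: with gamma = 0.5 and step size = 10, the learning rate halves every 10 scheduler steps. The sketch below is only an illustration using the standard library. The names `TrainingConfig` and `step_lr` are hypothetical and do not come from the repository, and it assumes the scheduler is stepped once per epoch, which the README does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Values copied from the README table (class and field names are hypothetical)."""
    context_window: int = 256
    transformer_layers: int = 4
    attention_heads: int = 1
    feedforward_dim: int = 176
    learning_rate: float = 1e-4     # initial AdamW learning rate
    scheduler_gamma: float = 0.5    # StepLR decay factor
    scheduler_step_size: int = 10   # StepLR step size
    batch_size: int = 128


def step_lr(config: TrainingConfig, epoch: int) -> float:
    """Learning rate at a given epoch under a StepLR-style decay.

    Assumption: the scheduler is stepped once per epoch, so the rate is
    multiplied by gamma after every `scheduler_step_size` epochs.
    """
    decays = epoch // config.scheduler_step_size
    return config.learning_rate * config.scheduler_gamma ** decays


if __name__ == "__main__":
    cfg = TrainingConfig()
    for epoch in (0, 9, 10, 20, 30):
        print(f"epoch {epoch:>2}: lr = {step_lr(cfg, epoch):.2e}")
```

In PyTorch these rows would typically correspond to `torch.optim.AdamW`, `torch.optim.lr_scheduler.StepLR` and `torch.nn.BCEWithLogitsLoss`, though the exact training code for this model is not shown in the diff.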