Update README.md
README.md
@@ -62,7 +62,7 @@ A small language model (24.5M parameters) trained on the TinyStories dataset tha
 | **Grammar Score** | 8+/10 | **8.8-10/10** (with post-processing) | ✅ Exceeded |
 | **Perplexity** | <20 | **15.7** | ✅ Excellent |
 | **Articles per Story** | ~10 | **9 average** | ✅ Optimal |
-| **Training Time** | <48h | **~
+| **Training Time** | <48h | **~6 hours** (RTX 5090) | ✅ Met |
 
 **Overall Grade:** A (95/100) - Production Ready
 
@@ -84,7 +84,7 @@ python train_custom_tokenizer.py \
 --max_samples 100000
 ```
 
-### 2. Train Model (
+### 2. Train Model (6 hours on RTX 5090)
 ```bash
 # Clean old cache
 rm -rf ./data/cache/*