Commit · 488b707
1 Parent(s): b954a57
Update README.md
README.md CHANGED
@@ -221,19 +221,16 @@ It was pretrained on mC4 and then finetuned on xP3, P3 or xP3mt.
 ## Speeds, Sizes, Times
 
 // TODO @adarob: Maybe we can push tensorboard on this repo as well
-Training logs:
 
--
-
-
+- Training logs:
+
+- Checkpoint size: 51.7GB (Bf16 weights)
 
 - Number of epochs: 1
 
 
 ## Environmental Impact
 
-// TODO @adarob: Is it possible for you to share some information about the impact of where you trained it?
-
 The evaluation supercomputer, [Jean Zay](http://www.idris.fr/eng/jean-zay/), uses mostly nuclear energy. The heat generated by it is reused for heating campus housing.
 
 </details>