Delete README.MD
README.MD
DELETED
@@ -1,17 +0,0 @@
----
-license: apache-2.0
-datasets:
-- allenai/dolma
----
-# Training run to compare Mixture-of-Depths, Bitnet
-[Wandb Report](https://api.wandb.ai/links/tulasiram/pw76q41i)
-
-
-
-#### 4 Models trained for 100k steps on Dolma
-- OLMo-50M - 50M parameter model
-- OLMo-50M-bitlinear - 50M parameter bitnet model
-- OLMo-50M-mod - 50M parameter mixture-of-depths model
-- OLMo-50M-mod-bitlinear - 50M parameter mixture-of-depths bitnet model
-
-Repo has zip files which include training states and other files for each model. I am not the author of the mixture-of-depths implementation, it can be found [here](https://github.com/thepowerfuldeez/OLMo)