| datasets: | |
| - roneneldan/TinyStories | |
| metrics: | |
| - babylm | |
| Base model: RoBERTa | |
| Configs: | |
| Vocab size: 10,000 | |
| Hidden size: 512 | |
| Max position embeddings: 512 | |
| Number of layers: 2 | |
| Number of heads: 4 | |
| Window size: 256 | |
| Intermediate size: 1024 | |
| Results: | |
| - Task: glue | |
| Score: 57.89 | |
| Confidence Interval: [57.28, 58.48] | |
| - Task: blimp | |
| Score: 58.26 | |
| Confidence Interval: [57.71, 58.75] | |
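
The configuration listed above can be written down as code. The following is a minimal sketch, not the authors' training setup: `TinyRobertaHyperparams` and its methods are hypothetical names introduced here for illustration. It assumes the listed values map onto the usual Hugging Face `RobertaConfig` keyword arguments. "Window size" has no standard `RobertaConfig` counterpart, so it is kept separate, and its exact role in this model is not stated in the card.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TinyRobertaHyperparams:
    """Hypothetical container for the hyperparameters listed in this card."""

    vocab_size: int = 10_000
    hidden_size: int = 512
    max_position_embeddings: int = 512
    num_hidden_layers: int = 2
    num_attention_heads: int = 4
    intermediate_size: int = 1024
    # Listed in the card; not a standard RobertaConfig argument (assumption:
    # it is consumed by custom model or data code not shown here).
    window_size: int = 256

    def head_dim(self) -> int:
        """Per-head dimension; hidden size must divide evenly across heads."""
        if self.hidden_size % self.num_attention_heads != 0:
            raise ValueError("hidden_size must be divisible by num_attention_heads")
        return self.hidden_size // self.num_attention_heads

    def roberta_config_kwargs(self) -> dict:
        """Keyword arguments that RobertaConfig understands (window_size dropped)."""
        kwargs = asdict(self)
        kwargs.pop("window_size")
        return kwargs


params = TinyRobertaHyperparams()
config_kwargs = params.roberta_config_kwargs()

# Optional: build the actual config object if transformers is installed.
try:
    from transformers import RobertaConfig

    config = RobertaConfig(**config_kwargs)
except ImportError:
    config = None
```

With 4 heads over a hidden size of 512, `params.head_dim()` gives a per-head dimension of 128. Keeping `window_size` out of `config_kwargs` makes it explicit that this value would have to be handled by code outside the stock configuration class.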