---
license: apache-2.0
language:
- en
base_model:
- bsu-slim/electra-tiny
pipeline_tag: text-classification
library_name: transformers
---
A pretrained [ELECTRA-Tiny](https://huggingface.co/bsu-slim/electra-tiny/tree/main) model, trained on the pretraining [data](https://osf.io/5mk3x)
from the [2024 BabyLM Challenge](https://babylm.github.io/index.html). It is intended for text classification
on the [Web of Science Dataset WOS-46985](https://data.mendeley.com/datasets/9rw3vkcfy4/6), but this checkpoint is not yet fine-tuned
for that task. It was also evaluated on BLiMP using the [2024 BabyLM evaluation pipeline](https://github.com/babylm/evaluation-pipeline-2024).
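As a starting point, the checkpoint can be loaded for fine-tuning with `transformers`. This is a minimal sketch, not a tested recipe: the repo id below points at the base model (swap in this model's own repo id), and `num_labels=7` is an assumption based on the seven top-level WOS categories. Since the checkpoint is not yet fine-tuned, the classification head is randomly initialized.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Base checkpoint from this card; replace with this model's own repo id.
repo = "bsu-slim/electra-tiny"
tokenizer = AutoTokenizer.from_pretrained(repo)

# num_labels=7 is an assumption (seven top-level WOS categories);
# the classification head starts with random weights.
model = AutoModelForSequenceClassification.from_pretrained(repo, num_labels=7)

inputs = tokenizer(
    "A study of graphene-based supercapacitors.",
    truncation=True, max_length=128, return_tensors="pt",
)
logits = model(**inputs).logits  # one row of 7 untrained class scores
```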
# Training
Pretrained with the pipeline defined in this [repository](https://github.com/bakirgrbic/bblm).
## Hyperparameters
- Epochs: 10
- Batch size: 8
- Learning rate: 1e-4
- Optimizer: AdamW
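The hyperparameters above wire into a standard PyTorch loop roughly as follows. This is a schematic sketch only: `nn.Linear` and the squared-output loss are stand-ins for the actual ELECTRA-Tiny model and pretraining objective.

```python
import torch
from torch import nn

# Stand-in for the ELECTRA-Tiny model; the real loop comes from the
# pretraining pipeline in the linked repository.
model = nn.Linear(16, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # Learning rate: 1e-4

data = torch.randn(8, 16)             # one batch of size 8
for epoch in range(10):               # Epochs: 10
    optimizer.zero_grad()
    loss = model(data).pow(2).mean()  # placeholder loss
    loss.backward()
    optimizer.step()
```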
## Resources Used
- Compute: AWS Sagemaker ml.g4dn.xlarge
- Time: About 70 hours (roughly 3 days)
# Evaluation
## Web of Science (WOS)
Fine-tuned and evaluated with the WOS pipeline defined in this [repository](https://github.com/bakirgrbic/bblm).
### Results
- 76% accuracy on the test set after the final fine-tuning epoch.
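For clarity, test-set accuracy here follows the usual definition: take the argmax over the classifier logits for each example and compare it with the gold label. A minimal, self-contained sketch with hypothetical toy logits:

```python
# Accuracy = fraction of examples whose argmax prediction matches the label.
def accuracy(logits, labels):
    predictions = [row.index(max(row)) for row in logits]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# Toy example: predictions are [1, 0, 1, 0], so 3 of 4 match.
score = accuracy(
    [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]],
    [1, 0, 1, 1],
)
```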
### Hyperparameters
- Epochs: 3
- Batch size: 64
- Learning rate: 2e-5
- Optimizer: AdamW
- Max Length: 128
- Parameter Freezing: None
### Resources Used
- Compute: AWS Sagemaker ml.g4dn.xlarge
- Time: About 5 minutes
## BLiMP
### Results
- blimp_supplement accuracy: 49.79%
- blimp_filtered accuracy: 50.65%
- See [blimp_results](./blimp_results) for a detailed breakdown on subtasks.
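For context, BLiMP scores a model on minimal pairs: an item counts as correct when the model assigns a higher (pseudo-log-likelihood) score to the grammatical sentence than to its ungrammatical counterpart, and accuracy is the fraction of correct items. A minimal sketch of that aggregation, assuming per-sentence scores have already been computed by the evaluation pipeline:

```python
# pairs: (score_good, score_bad) per minimal pair, e.g. pseudo-log-likelihoods.
def blimp_accuracy(pairs):
    correct = sum(good > bad for good, bad in pairs)
    return correct / len(pairs)

# Toy example: the first pair is scored correctly, the second is not.
result = blimp_accuracy([(-1.2, -3.4), (-2.0, -1.5)])
```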
### Hyperparameters
- Epochs: 1
- Evaluation script modified for masked LMs
### Resources Used
- Compute: arm64 macOS
- Time: About 1 hour