A pretrained ELECTRA-Tiny model, trained on data from the 2024 BabyLM Challenge. It was used for text classification on the Web of Science dataset (WOS-46985), although this model is not currently fine-tuned for that task. It was also evaluated on BLiMP using the 2024 BabyLM evaluation pipeline.
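As a usage sketch (assuming the standard `transformers` Auto classes; this is not an official snippet from the repository), the pretrained encoder can be loaded and run like this:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Model id as listed on this card; hidden-state sizes come from the
# ELECTRA-Tiny configuration inherited from bsu-slim/electra-tiny.
tokenizer = AutoTokenizer.from_pretrained("bakirgrbic/electra-tiny")
model = AutoModel.from_pretrained("bakirgrbic/electra-tiny")

inputs = tokenizer("The children are playing outside.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
```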
## Training

Uses the pretraining pipeline defined in this repository.

### Hyperparameters
- Epochs: 10
- Batch size: 8
- Learning rate: 1e-4
- Optimizer: AdamW
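These settings translate to an optimizer setup like the one below; the loop is a hypothetical sketch with a stand-in module and random data, not the actual pretraining script:

```python
import torch
from torch.optim import AdamW

EPOCHS, BATCH_SIZE, LR = 10, 8, 1e-4  # hyperparameters from the list above

model = torch.nn.Linear(16, 2)               # stand-in for the ELECTRA-Tiny model
optimizer = AdamW(model.parameters(), lr=LR)
loss_fn = torch.nn.CrossEntropyLoss()

# Random tensors standing in for tokenized BabyLM batches.
batches = [(torch.randn(BATCH_SIZE, 16), torch.randint(0, 2, (BATCH_SIZE,)))
           for _ in range(4)]

for epoch in range(EPOCHS):
    for features, labels in batches:
        optimizer.zero_grad()
        loss = loss_fn(model(features), labels)
        loss.backward()
        optimizer.step()
```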
### Resources Used
- Compute: AWS Sagemaker ml.g4dn.xlarge
- Time: About 70 hours (roughly 3 days)
## Evaluation

### Web of Science (WOS)

Uses the WOS pipeline defined in this repository.

#### Results
- 76% accuracy on the test set after the final epoch.
#### Hyperparameters
- Epochs: 3
- Batch size: 64
- Learning rate: 2e-5
- Optimizer: AdamW
- Max Length: 128
- Parameter Freezing: None
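A hedged sketch of how these settings map onto a `transformers` fine-tuning setup. The 7-way label space (WOS-46985's parent domains) and the example text are assumptions; the repository's WOS pipeline is the authoritative version:

```python
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "bakirgrbic/electra-tiny"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# num_labels=7 assumes classification over WOS's seven parent domains.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=7)

batch = tokenizer(["An abstract about convolutional neural networks ..."],
                  max_length=128, truncation=True, padding="max_length",
                  return_tensors="pt")
labels = torch.tensor([0])

optimizer = AdamW(model.parameters(), lr=2e-5)  # no parameters frozen
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
```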
#### Resources Used
- Compute: AWS Sagemaker ml.g4dn.xlarge
- Time: About 5 minutes
### BLiMP

#### Results
- blimp_supplement accuracy: 49.79%
- blimp_filtered accuracy: 50.65%
- See blimp_results for a detailed breakdown of subtasks.
#### Hyperparameters
- Epochs: 1
- Script modified for masked LMs
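A masked-LM evaluation script typically scores each BLiMP sentence by pseudo-log-likelihood: mask one position at a time, sum the log-probabilities the model assigns to the original tokens, and count the minimal pair as correct when the grammatical sentence scores higher. A minimal, model-agnostic sketch (the `logprob_fn` callback is a stand-in for a real masked LM, not the card's actual script):

```python
import math

def pseudo_log_likelihood(token_ids, mask_id, logprob_fn):
    """Mask each position in turn and sum the log-probability that the
    model (via logprob_fn) assigns to the original token at that position."""
    total = 0.0
    for i, original in enumerate(token_ids):
        masked = list(token_ids)
        masked[i] = mask_id
        total += logprob_fn(masked, i, original)
    return total

# Stand-in scorer: a uniform distribution over a 10-token vocabulary.
uniform = lambda masked, position, token: math.log(1.0 / 10)

good = pseudo_log_likelihood([3, 1, 4], mask_id=0, logprob_fn=uniform)
bad = pseudo_log_likelihood([3, 1, 4, 1], mask_id=0, logprob_fn=uniform)
correct = good > bad  # a BLiMP pair counts as correct when `good` wins
```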
#### Resources Used

- Compute: arm64 macOS
- Time: About 1 hour
## Model tree for bakirgrbic/electra-tiny

Base model: bsu-slim/electra-tiny