---
license: apache-2.0
language:
- en
base_model:
- bsu-slim/electra-tiny
pipeline_tag: text-classification
library_name: transformers
---

A pretrained [ELECTRA-Tiny](https://huggingface.co/bsu-slim/electra-tiny/tree/main) model, pretrained on [data](https://osf.io/5mk3x) from the [2024 BabyLM Challenge](https://babylm.github.io/index.html). Intended for text classification on the [Web of Science dataset WOS-46985](https://data.mendeley.com/datasets/9rw3vkcfy4/6), though this checkpoint is not yet fine-tuned for that task. Also evaluated on BLiMP using the [2024 BabyLM evaluation pipeline](https://github.com/babylm/evaluation-pipeline-2024).

# Training

Pretrained with the pipeline defined in this [repository](https://github.com/bakirgrbic/bblm).

## Hyperparameters

- Epochs: 10
- Batch size: 8
- Learning rate: 1e-4
- Optimizer: AdamW

## Resources Used

- Compute: AWS SageMaker ml.g4dn.xlarge
- Time: about 70 hours (roughly 3 days)

# Evaluation

## Web of Science (WOS)

Fine-tuned and evaluated with the WOS pipeline defined in the same [repository](https://github.com/bakirgrbic/bblm).

### Results

- 76% accuracy on the test set after the final epoch.

### Hyperparameters

- Epochs: 3
- Batch size: 64
- Learning rate: 2e-5
- Optimizer: AdamW
- Max length: 128
- Parameter freezing: none

### Resources Used

- Compute: AWS SageMaker ml.g4dn.xlarge
- Time: about 5 minutes

## BLiMP

### Results

- blimp_supplement accuracy: 49.79%
- blimp_filtered accuracy: 50.65%
- See [blimp_results](./blimp_results) for a detailed breakdown by subtask.

### Hyperparameters

- Epochs: 1
- Evaluation script modified for masked language models

### Resources Used

- Compute: arm64 macOS
- Time: about 1 hour
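# Usage

A minimal sketch of loading this checkpoint for WOS-style classification. Two assumptions are made here: the base-model id `bsu-slim/electra-tiny` is used as a stand-in for this model's own id, and `num_labels=7` assumes classification over WOS-46985's seven parent categories. Since this checkpoint is not yet fine-tuned, the classification head is randomly initialized and its predictions are not meaningful until fine-tuning.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumption: base-model id used as a placeholder for this model's own id.
MODEL_ID = "bsu-slim/electra-tiny"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# Assumption: 7 labels, one per WOS-46985 parent category. The head is
# randomly initialized here and must be fine-tuned before real use.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=7)
model.eval()

text = "Graph neural networks for molecular property prediction."
# max_length=128 matches the WOS fine-tuning hyperparameters above.
inputs = tokenizer(text, truncation=True, max_length=128, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

print(logits.shape)  # one row of scores, one column per label
```

Fine-tuning on WOS with the hyperparameters listed above (AdamW, lr 2e-5, batch size 64, 3 epochs) can then be done with a standard `Trainer` loop or the pipeline in the linked repository.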