---
license: apache-2.0
language:
- en
base_model:
- bsu-slim/electra-tiny
pipeline_tag: text-classification
library_name: transformers
---

A pretrained [ELECTRA-Tiny](https://huggingface.co/bsu-slim/electra-tiny/tree/main) model, trained on the pretraining [data](https://osf.io/5mk3x) from the [2024 BabyLM Challenge](https://babylm.github.io/index.html). It is intended for text classification on the [Web of Science dataset WOS-46985](https://data.mendeley.com/datasets/9rw3vkcfy4/6), but is not currently fine-tuned for that task. It has also been evaluated on BLiMP using the [2024 BabyLM evaluation pipeline](https://github.com/babylm/evaluation-pipeline-2024).

# Training

Pretraining used the pipeline defined in this [repository](https://github.com/bakirgrbic/bblm).

## Hyperparameters

- Epochs: 10
- Batch size: 8
- Learning rate: 1e-4
- Optimizer: AdamW
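
As a hedged sketch, the hyperparameters above wired into an AdamW optimizer; the `torch.nn.Linear` module is a stand-in for the actual ELECTRA-Tiny model, and the loop body (which in the real run computes the ELECTRA pretraining loss) is elided:

```python
import torch

# Stand-in module; the real run trains ELECTRA-Tiny via the repo above.
model = torch.nn.Linear(128, 128)

EPOCHS = 10
BATCH_SIZE = 8
LEARNING_RATE = 1e-4

optimizer = torch.optim.AdamW(model.parameters(), lr=LEARNING_RATE)

for epoch in range(EPOCHS):
    # for batch in dataloader:  # batches of BATCH_SIZE pretraining examples
    #     loss = ...            # ELECTRA pretraining objective
    #     loss.backward(); optimizer.step(); optimizer.zero_grad()
    pass
```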

## Resources Used

- Compute: AWS SageMaker ml.g4dn.xlarge
- Time: about 70 hours (roughly 3 days)

# Evaluation

## Web of Science (WOS)

Evaluation used the WOS pipeline defined in this [repository](https://github.com/bakirgrbic/bblm).

### Results

- 76% accuracy on the test set after the final epoch.

### Hyperparameters

- Epochs: 3
- Batch size: 64
- Learning rate: 2e-5
- Optimizer: AdamW
- Max Length: 128
- Parameter Freezing: None
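
A minimal, hedged sketch of wiring these settings up with the Transformers API. The `ElectraConfig` sizes and the label count are illustrative assumptions (WOS-46985 groups papers into 7 top-level domains), and the model here is randomly initialized rather than loaded from the pretrained checkpoint:

```python
import torch
from transformers import ElectraConfig, ElectraForSequenceClassification

# Illustrative tiny configuration; the real run would instead load the
# pretrained ELECTRA-Tiny weights with from_pretrained(...).
config = ElectraConfig(
    embedding_size=128,
    hidden_size=128,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=512,
    num_labels=7,  # assumption: the 7 top-level WOS domains
)
model = ElectraForSequenceClassification(config)

MAX_LENGTH = 128  # inputs truncated/padded to the stated max length

# Dummy forward pass to show the classification head's output shape.
input_ids = torch.zeros((1, MAX_LENGTH), dtype=torch.long)
logits = model(input_ids=input_ids).logits  # shape: (batch, num_labels)
```

With no parameter freezing, fine-tuning updates the whole encoder together with the classification head.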

### Resources Used

- Compute: AWS SageMaker ml.g4dn.xlarge
- Time: about 5 minutes

## BLiMP

### Results

- blimp_supplement accuracy: 49.79%
- blimp_filtered accuracy: 50.65%
- See [blimp_results](./blimp_results) for a detailed breakdown of subtasks.

### Hyperparameters

- Epochs: 1
- Evaluation script modified for masked LMs
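
One common scoring scheme for masked LMs on BLiMP-style minimal pairs is pseudo-log-likelihood: each sentence is scored by summing the log-probability of every token with that position masked, and a pair counts as correct when the grammatical sentence outscores the ungrammatical one. A model-free sketch assuming that scheme, with `token_logprob` standing in for a real masked-LM call:

```python
def pseudo_log_likelihood(tokens, token_logprob):
    # Sum of each token's log-probability with that position masked out;
    # token_logprob(tokens, i) is a stand-in for a real masked-LM call.
    return sum(token_logprob(tokens, i) for i in range(len(tokens)))

def minimal_pair_correct(good, bad, token_logprob):
    # A minimal pair counts as correct when the grammatical sentence
    # scores strictly higher than the ungrammatical one.
    return (pseudo_log_likelihood(good, token_logprob)
            > pseudo_log_likelihood(bad, token_logprob))
```

With real model scores, BLiMP accuracy is simply the fraction of minimal pairs for which `minimal_pair_correct` returns true.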

### Resources Used

- Compute: arm64 macOS
- Time: about 1 hour