A pretrained [ELECTRA-Tiny](https://huggingface.co/bsu-slim/electra-tiny/tree/main) model. Pretraining [data](https://osf.io/5mk3x) was from the [2024 BabyLM Challenge](https://babylm.github.io/index.html). It was used personally to perform text classification on the [Web of Science Dataset WOS-46985](https://data.mendeley.com/datasets/9rw3vkcfy4/6), but this model is not currently fine-tuned for that task. It was also evaluated on BLiMP using a pipeline provided by the 2024 BabyLM Challenge.
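
A minimal usage sketch with the transformers library. The checkpoint id below is the base ELECTRA-Tiny repository linked above; substitute this model's own repository id to load the weights described by this card:

```python
# Minimal sketch: load the tokenizer and encoder with transformers.
# "bsu-slim/electra-tiny" is the base checkpoint linked above; swap in
# this model's repository id to use the pretrained weights from this card.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bsu-slim/electra-tiny")
model = AutoModel.from_pretrained("bsu-slim/electra-tiny")

inputs = tokenizer("A sample abstract to encode.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, hidden_size)
```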

# Training
Used the pretraining pipeline as defined in this [repository](https://github.com/bakirgrbic/bblm).

## Hyperparameters
- Epochs: 10
- Batch size: 8
- Learning rate: 1e-4
- Optimizer: AdamW
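
As a rough sketch, these settings map onto a standard PyTorch setup. The actual training loop lives in the linked bblm repository; using `ElectraForPreTraining` as the objective head is an assumption here:

```python
# Hedged sketch of the training configuration listed above; the real
# pretraining loop is defined in the linked bblm repository.
import torch
from transformers import ElectraForPreTraining  # assumed objective head

model = ElectraForPreTraining.from_pretrained("bsu-slim/electra-tiny")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # AdamW, lr 1e-4

EPOCHS = 10     # as listed above
BATCH_SIZE = 8  # as listed above
# for each of EPOCHS passes: iterate batches of BATCH_SIZE, compute the
# ELECTRA loss, then loss.backward(), optimizer.step(), optimizer.zero_grad()
```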

## Resources Used
- Compute: AWS SageMaker ml.g4dn.xlarge
- Time: About 70 hours (roughly 3 days)

# Evaluation

## Web of Science (WOS)
Used the WOS pipeline as defined in this [repository](https://github.com/bakirgrbic/bblm).

### Results
- 76% accuracy on the test set at the last epoch of fine-tuning.

### Hyperparameters
- Epochs: 3
- Batch size: 64
- Learning rate: 2e-5
- Max Length: 128
- Parameter Freezing: None
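
A minimal fine-tuning sketch matching these settings; the real pipeline is in the bblm repository, and `NUM_LABELS = 7` is an assumption (the WOS parent categories) rather than a confirmed detail:

```python
# Hedged sketch of the fine-tuning setup above using the Trainer API;
# the actual WOS pipeline is defined in the linked bblm repository.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

NUM_LABELS = 7  # assumption: WOS parent categories; adjust to the label set used

tokenizer = AutoTokenizer.from_pretrained("bsu-slim/electra-tiny")
model = AutoModelForSequenceClassification.from_pretrained(
    "bsu-slim/electra-tiny", num_labels=NUM_LABELS
)  # no parameters are frozen, matching "Parameter Freezing: None"

args = TrainingArguments(
    output_dir="electra-tiny-wos",
    num_train_epochs=3,
    per_device_train_batch_size=64,
    learning_rate=2e-5,
)

def tokenize(batch):
    # truncate/pad abstracts to the max length listed above
    return tokenizer(batch["text"], truncation=True, max_length=128)

# Trainer(model=model, args=args, train_dataset=..., eval_dataset=...).train()
```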

### Resources Used
- Compute: AWS SageMaker ml.g4dn.xlarge
- Time: About 5 minutes

## BLiMP
Used the BLiMP evaluation from the [2024 BabyLM evaluation pipeline repository](https://github.com/babylm/evaluation-pipeline-2024).

### Results
- blimp_supplement accuracy: 49.79%
- blimp_filtered accuracy: 50.65%
- See [blimp_results](./blimp_results) for a detailed breakdown of subtasks.

### Hyperparameters
- Epochs: 1
- Evaluation script modified for masked LMs

### Resources Used
- Compute: arm64 macOS machine
- Time: About 1 hour