bakirgrbic/electra-tiny-elc

Text Classification

bakirgrbic committed on Jul 16, 2025

Commit e9b4790 · verified · 1 Parent(s): 0ae61fe

v1 done

Files changed (1)

README.md +48 -3

@@ -1,3 +1,48 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+language:
+- en
+base_model:
+- bsu-slim/electra-tiny
+- lgcharpe/ELC_BERT_small_baby_10M
+pipeline_tag: text-classification
+library_name: transformers
+---
+# This model is currently experimental and broken!
+A pretrained [ELECTRA-Tiny](https://huggingface.co/bsu-slim/electra-tiny/tree/main) model modified to implement zero-initialization
+transformer layer weighting, as described in
+[Not all layers are equally as important: Every Layer Counts BERT](https://aclanthology.org/2023.conll-babylm.20.pdf).
+# Training
+Pretrained with the pipeline defined in this [repository](https://github.com/bakirgrbic/bblm).
+## Hyperparameters
+- Epochs: 9
+- Batch size: 8
+- Learning rate: 1e-4
+- Optimizer: AdamW
+## Resources Used
+- Compute: AWS SageMaker ml.g4dn.xlarge
+- Time: About 63 hours
+# Evaluation
+## BLiMP
+Evaluated on BLiMP using the [2024 BabyLM evaluation pipeline repository](https://github.com/babylm/evaluation-pipeline-2024).
+### Results
+- blimp_supplement accuracy: 47.54%
+- blimp_filtered accuracy: 51.79%
+- See [blimp_results](./blimp_results) for a detailed breakdown by subtask.
+### Hyperparameters
+- Epochs: 1
+- Script modified for masked LMs
+### Resources Used
+- Compute: arm64 macOS
+- Time: About 30 minutes
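
For context on the BLiMP numbers reported in the README: BLiMP is built from minimal pairs, and accuracy is the share of pairs in which the model scores the grammatical sentence higher than the ungrammatical one. The sketch below illustrates only that accuracy computation. The function name `blimp_accuracy` and the example scores are hypothetical, and it does not reproduce the evaluation pipeline's code. For a masked LM the per-sentence score is typically a pseudo-log-likelihood, but how the modified script scores sentences is not stated here.

```python
def blimp_accuracy(pair_scores):
    """Return the percentage of minimal pairs where the grammatical sentence wins.

    pair_scores: list of (grammatical_score, ungrammatical_score) tuples,
    for example (pseudo-)log-likelihoods produced by a language model.
    """
    if not pair_scores:
        raise ValueError("need at least one minimal pair")
    # A pair counts as correct only if the grammatical sentence scores strictly higher.
    correct = sum(1 for good, bad in pair_scores if good > bad)
    return 100.0 * correct / len(pair_scores)


# Hypothetical scores for four minimal pairs (not taken from this model).
example_scores = [(-12.3, -15.1), (-9.8, -9.2), (-20.0, -22.5), (-7.4, -7.1)]
accuracy = blimp_accuracy(example_scores)
print(f"accuracy: {accuracy:.2f}%")
```

Because each pair has two candidates, a model that guesses at random lands near 50%. The reported 47.54% and 51.79% are therefore close to chance, which is in line with the README's note that the model is currently experimental and broken.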