How to use bakirgrbic/electra-tiny-elc with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="bakirgrbic/electra-tiny-elc")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bakirgrbic/electra-tiny-elc")
model = AutoModelForMaskedLM.from_pretrained("bakirgrbic/electra-tiny-elc")
```
This model is currently experimental and broken!
A pretrained ELECTRA-Tiny model modified to implement zero-initialization transformer layer weighting, as described in "Not all layers are equally as important: Every Layer Counts BERT".
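To make the layer-weighting idea concrete, here is a minimal sketch in plain Python. It assumes that each layer's input is a learned weighted sum of all earlier outputs (index 0 being the embeddings) and that "zero initialization" means every weight starts at 0 except the one on the immediately preceding output, which starts at 1. Both points are a reading of the paper, not code taken from this model, and the names `init_layer_weights`, `combine`, and `forward` are hypothetical.

```python
def init_layer_weights(num_layers):
    """One weight list per layer; layer n weights outputs 0..n-1 (0 = embeddings).

    Assumed zero initialization: all weights 0.0 except the weight on the
    directly preceding output, which is 1.0.
    """
    weights = []
    for n in range(1, num_layers + 1):
        w = [0.0] * n
        w[-1] = 1.0
        weights.append(w)
    return weights


def combine(prev_outputs, layer_weights):
    """Weighted sum of all previous outputs, computed per hidden dimension."""
    dim = len(prev_outputs[0])
    return [
        sum(w * h[d] for w, h in zip(layer_weights, prev_outputs))
        for d in range(dim)
    ]


def forward(embedding, layers, weights):
    """Run the layers, feeding each one a weighted mix of earlier outputs."""
    outputs = [embedding]
    for layer, w in zip(layers, weights):
        outputs.append(layer(combine(outputs, w)))
    return outputs[-1]


# Toy stand-ins for transformer layers (hypothetical, for illustration only).
toy_layers = [lambda x: [v + 1 for v in x], lambda x: [v * 2 for v in x]]
toy_weights = init_layer_weights(len(toy_layers))
result = forward([1.0, 2.0], toy_layers, toy_weights)
```

Under this initialization the network starts out behaving like an ordinary sequential stack, and training is then free to move weight onto earlier layers.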
Training
Pretrained with the pipeline defined in this repository.
Hyperparameters
- Epochs: 9
- Batch size: 8
- Learning rate: 1e-4
- Optimizer: AdamW
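For reference, the pretraining settings listed above can be captured in a small config object. This is only an illustrative sketch: the class name `PretrainingConfig` and its field names are hypothetical and are not taken from the pretraining repository.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class PretrainingConfig:
    """Hypothetical container for the hyperparameters reported on this card."""
    epochs: int = 9
    batch_size: int = 8
    learning_rate: float = 1e-4
    optimizer: str = "AdamW"


config = PretrainingConfig()
print(asdict(config))
```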
Resources Used
- Compute: AWS SageMaker ml.g4dn.xlarge
- Time: About 63 hours
Evaluation
BLiMP
Evaluated with the BLiMP task from the 2024 BabyLM evaluation pipeline repository.
Results
- blimp_supplement accuracy: 47.54%
- blimp_filtered accuracy: 51.79%
- See blimp_results for a detailed breakdown on subtasks.
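As a reminder of what these accuracies measure, BLiMP presents minimal pairs of one acceptable and one unacceptable sentence, and a pair counts as correct when the model scores the acceptable sentence higher. The sketch below shows that computation only; `minimal_pair_accuracy`, the toy pairs, and the toy scorer are hypothetical and are not part of the BabyLM pipeline.

```python
def minimal_pair_accuracy(pairs, score):
    """Fraction of (good, bad) sentence pairs where score(good) > score(bad).

    pairs: iterable of (acceptable_sentence, unacceptable_sentence)
    score: callable mapping a sentence to a float (e.g. a log-likelihood)
    """
    pairs = list(pairs)
    correct = sum(1 for good, bad in pairs if score(good) > score(bad))
    return correct / len(pairs)


# Toy data and a deliberately naive scorer that simply prefers shorter strings.
toy_pairs = [
    ("the cats sleep", "the cats sleeps"),
    ("she has eaten", "she has ate"),
]


def toy_score(sentence):
    return -len(sentence)


accuracy = minimal_pair_accuracy(toy_pairs, toy_score)
```

With two candidates per pair, a scorer that guesses at random would be expected to land near 50%.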
Hyperparameters
- Epochs: 1
- Script modified for masked LMs
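The card does not say how the script was modified for masked LMs. One common way to score a sentence with a masked LM is a pseudo-log-likelihood: mask each token in turn and sum the log-probabilities the model assigns to the original tokens. The sketch below illustrates that approach as an assumption, not as the change actually made here; `masked_log_prob` stands in for a hypothetical model wrapper.

```python
import math

MASK = "[MASK]"


def pseudo_log_likelihood(tokens, masked_log_prob):
    """Sum log P(token_i | all other tokens), masking one position at a time.

    masked_log_prob(masked_tokens, position, target) -> float is a
    hypothetical wrapper around a masked language model.
    """
    total = 0.0
    for i, target in enumerate(tokens):
        masked = tokens[:i] + [MASK] + tokens[i + 1:]
        total += masked_log_prob(masked, i, target)
    return total


def uniform_log_prob(masked_tokens, position, target, vocab_size=4):
    """Toy stand-in model: every token is equally likely."""
    return -math.log(vocab_size)


score = pseudo_log_likelihood(["a", "b", "c"], uniform_log_prob)
```

A score of this kind can then be compared between the two sentences of a minimal pair.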
Resources Used
- Compute: arm64 macOS
- Time: About 30 minutes
Base Model
- bsu-slim/electra-tiny