An ELECTRA-Tiny model (5.75M parameters) pretrained on data from the 2024 BabyLM Challenge. It was used for text classification on the Web of Science dataset (WOS-46985), though this checkpoint itself is not finetuned for that task. It was also evaluated on BLiMP using the 2024 BabyLM evaluation pipeline.
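
A minimal usage sketch with Hugging Face Transformers, assuming the tokenizer is bundled with the checkpoint (AutoModel returns the bare encoder; pick a task head as needed):

```python
from transformers import AutoModel, AutoTokenizer

# Load the pretrained ELECTRA-Tiny encoder and its tokenizer from the Hub.
tokenizer = AutoTokenizer.from_pretrained("bakirgrbic/electra-tiny")
model = AutoModel.from_pretrained("bakirgrbic/electra-tiny")

inputs = tokenizer("The child read the book.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
```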

Training

The pretraining pipeline is defined in this repository; a minimal sketch of the setup appears after the hyperparameters below.

Hyperparameters

  • Epochs: 10
  • Batch size: 8
  • Learning rate: 1e-4
  • Optimizer: AdamW
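
The sketch below shows how these hyperparameters could be wired up with ElectraForPreTraining. It is an illustration only: the real pipeline (including the generator that proposes replacement tokens) lives in the linked repository, and the model dimensions and tokenizer used here are assumptions.

```python
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, ElectraConfig, ElectraForPreTraining

# Assumed "tiny" dimensions; the exact architecture is defined in the repository.
config = ElectraConfig(
    hidden_size=128,
    num_hidden_layers=4,
    num_attention_heads=4,
    intermediate_size=512,
)
model = ElectraForPreTraining(config)
tokenizer = AutoTokenizer.from_pretrained("google/electra-small-discriminator")

optimizer = AdamW(model.parameters(), lr=1e-4)  # optimizer and lr from the card

# One illustrative step (the real run used batch size 8 for 10 epochs).
# ELECTRA's discriminator is trained to flag replaced tokens
# (labels: 1 = replaced, 0 = original); a real pipeline samples the
# replacements from a small generator, which is omitted here.
batch = tokenizer(["The child read the book."] * 8, return_tensors="pt", padding=True)
labels = torch.zeros_like(batch["input_ids"])
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```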

Resources Used

  • Compute: AWS SageMaker ml.g4dn.xlarge
  • Time: About 70 hours (roughly 3 days)

Evaluation

Web of Science (WOS)

The WOS finetuning pipeline is defined in this repository; a hedged sketch of the setup appears after the hyperparameters below.

Results

  • 76% accuracy on the test set after the final epoch.

Hyperparameters

  • Epochs: 3
  • Batch size: 64
  • Learning rate: 2e-5
  • Optimizer: AdamW
  • Max Length: 128
  • Parameter Freezing: None
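
A hedged sketch of the finetuning setup under these hyperparameters. NUM_LABELS is an assumption (WOS-46985 is commonly used with either 7 parent categories or 134 subcategories); the actual pipeline is in the linked repository.

```python
import torch
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification, AutoTokenizer

NUM_LABELS = 134  # assumption; depends on which WOS label set the pipeline uses

tokenizer = AutoTokenizer.from_pretrained("bakirgrbic/electra-tiny")
# The classification head is freshly initialized on top of the pretrained encoder.
model = AutoModelForSequenceClassification.from_pretrained(
    "bakirgrbic/electra-tiny", num_labels=NUM_LABELS
)
optimizer = AdamW(model.parameters(), lr=2e-5)  # values from the card

# One illustrative step; no parameters are frozen, so the whole model updates.
batch = tokenizer(
    ["Example abstract text."],
    truncation=True,
    padding="max_length",
    max_length=128,  # Max Length from the card
    return_tensors="pt",
)
loss = model(**batch, labels=torch.tensor([0])).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```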

Resources Used

  • Compute: AWS SageMaker ml.g4dn.xlarge
  • Time: About 5 minutes

BLiMP

Results

  • blimp_supplement accuracy: 49.79%
  • blimp_filtered accuracy: 50.65%
  • See blimp_results for a detailed breakdown by subtask.

Hyperparameters

  • Epochs: 1
  • Evaluation script modified for masked language models (see the sketch below)
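
For reference, masked LMs are usually scored on BLiMP with pseudo-log-likelihood (PLL): each token is masked in turn, its log-probability is summed, and a minimal pair counts as correct when the grammatical sentence scores higher. The sketch below illustrates that idea; it is not necessarily the exact logic of the modified script, and it assumes the checkpoint can be loaded with an MLM head.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bakirgrbic/electra-tiny")
model = AutoModelForMaskedLM.from_pretrained("bakirgrbic/electra-tiny")
model.eval()

def pll(sentence: str) -> float:
    """Pseudo-log-likelihood: mask each token in turn and sum its log-prob."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

# A BLiMP-style minimal pair: the grammatical sentence should score higher.
print(pll("The cats meow.") > pll("The cats meows."))
```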

Resources Used

  • Compute: arm64 macOS machine
  • Time: About 1 hour