bakirgrbic committed
Commit f7c6e15 · verified · 1 Parent(s): 052269f
Files changed (1):
  1. README.md +30 -10
README.md CHANGED
@@ -11,29 +11,32 @@ library_name: transformers
 A pretrained [ELECTRA-Tiny](https://huggingface.co/bsu-slim/electra-tiny/tree/main) model. Pretraining [data](https://osf.io/5mk3x)
 was from the [2024 BabyLM Challenge](https://babylm.github.io/index.html). Used personally to perform text classification
 on the [Web of Science Dataset WOS-46985](https://data.mendeley.com/datasets/9rw3vkcfy4/6) but this model is not currently fine-tuned
-for that task.
+for that task. Also evaluated on BLiMP using a pipeline provided by the 2024 BabyLM challenge.
 
 
 # Training
 Used pretraining pipeline as defined in this [repository](https://github.com/bakirgrbic/bblm).
 
 ## Hyperparameters
-- Epochs: 1
+- Epochs: 10
 - Batch size: 8
 - Learning rate: 1e-4
 - Optimizer: AdamW
 
 ## Resources Used
 - Compute: AWS Sagemaker ml.g4dn.xlarge
-- Time: About 7 hours
+- Time: About 70 hours or 3 days
 
-# Evaluation (Web of Science)
-Used wos pipeline as defined in this [repository](https://github.com/bakirgrbic/bblm).
+
+# Evaluation
 
-## Results
-- 64% accuracy on the last epoch of the test set.
+## Web of Science (WOS)
+Used WOS pipeline as defined in this [repository](https://github.com/bakirgrbic/bblm).
 
-## Hyperparameters
+### Results
+- 76% accuracy on the last epoch of the test set.
+
+### Hyperparameters
 - Epochs: 3
 - Batch size: 64
 - Learning rate: 2e-5
@@ -41,6 +44,23 @@ Used wos pipeline as defined in this [repository](https://github.com/bakirgrbic/
 - Max Length: 128
 - Parameter Freezing: None
 
-## Resources Used
+### Resources Used
 - Compute: AWS Sagemaker ml.g4dn.xlarge
-- Time: About 5 minutes
+- Time: About 5 minutes
+
+
+## BLiMP
+Used BLiMP evaluation from the [2024 BabyLM evaluation pipeline repository](https://github.com/babylm/evaluation-pipeline-2024).
+
+### Results
+- blimp_supplement accuracy: 49.79%
+- blimp_filtered accuracy: 50.65%
+- See [blimp_results](./blimp_results) for a detailed breakdown on subtasks.
+
+### Hyperparameters
+- Epochs: 1
+- Script modified for masked LMs
+
+### Resources Used
+- Compute: arm64 MacOS
+- Time: About 1 hour
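One detail worth confirming in this change: the pretraining time grew in step with the epoch count, since the old README reported one epoch taking about 7 hours and the new one reports 10 epochs taking about 70 hours. A minimal sanity-check sketch of the figures in the diff (the per-epoch rate is inferred, not stated in the commit):

```python
# Sanity check: the reported pretraining time scales linearly with epochs.
# Figures taken from the README diff above (ml.g4dn.xlarge instance).
hours_per_epoch = 7   # old README: 1 epoch, ~7 hours
epochs = 10           # new README: 10 epochs
total_hours = hours_per_epoch * epochs
total_days = total_hours / 24

print(total_hours)        # 70
print(round(total_days))  # 3, matching "About 70 hours or 3 days"
```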