MarioBarbeque
/

DistilBERT-DeNiro

MarioBarbeque committed on Nov 24, 2024

Commit d201bb9 · verified · 1 Parent(s): a8d360a

add hyperparameters

Files changed (1)

README.md +5 -1

@@ -106,7 +106,11 @@ The model was trained locally on a single-node with one 16GB Nvidia T4 using
 #### Training Hyperparameters
-- **Training regime:** We use FP32 precision, as follows immediately from the precision inherited for the original "DistilBERT/distilbert-base-uncased" model.
+- **Precision:** We use FP32 precision, inherited from the original "DistilBERT/distilbert-base-uncased" model.
+- **Optimizer:** AdamW
+- **Learning Rate:** We use a linear learning rate scheduler with an initial learning rate of 5e-5
+- **Batch Size:** 32
+- **Number of Training Steps:** 2877 steps over the course of 3 epochs
 ## Evaluation / Metrics
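
As a rough sanity check on the hyperparameters added in this commit, the sketch below works out what a linear schedule starting at 5e-5 looks like over 2877 steps, and what training-set size those numbers imply. It is plain Python and not the author's training code. The commit does not mention warmup, gradient accumulation, or how the last partial batch is handled, so the sketch assumes no warmup, decay to zero at the final step, one optimizer step per batch of 32, and a kept partial batch. The names `linear_lr`, `min_examples`, and `max_examples` are hypothetical.

```python
# Values stated in the commit.
INITIAL_LR = 5e-5
BATCH_SIZE = 32
TOTAL_STEPS = 2877
EPOCHS = 3


def linear_lr(step: int, initial_lr: float = INITIAL_LR,
              total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at `step` under linear decay.

    Assumption (not stated in the commit): no warmup phase, and the
    rate decays linearly to zero at the final step.
    """
    if not 0 <= step <= total_steps:
        raise ValueError(f"step must be in [0, {total_steps}], got {step}")
    return initial_lr * (1 - step / total_steps)


# 2877 steps split evenly over 3 epochs.
steps_per_epoch = TOTAL_STEPS // EPOCHS  # 959

# Assumption: one optimizer step per batch (no gradient accumulation)
# and the last, possibly partial, batch of each epoch is kept.
# Under that assumption the training set size lies in this range.
min_examples = (steps_per_epoch - 1) * BATCH_SIZE + 1
max_examples = steps_per_epoch * BATCH_SIZE

if __name__ == "__main__":
    print(f"steps per epoch: {steps_per_epoch}")
    print(f"implied training examples: {min_examples} to {max_examples}")
    for step in (0, steps_per_epoch, 2 * steps_per_epoch, TOTAL_STEPS):
        print(f"step {step:>4}: lr = {linear_lr(step):.3e}")
```

If the run used warmup or gradient accumulation, both the schedule and the implied dataset size would differ from what this sketch prints.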