Aidan Mannion committed
Commit ee1b6d6 · Parent(s): e0b4b0e
Update README.md
README.md CHANGED
```diff
@@ -63,6 +63,7 @@ Experiments on general-domain data suggest that, given its specialised training
 - linear learning rate schedule with 10,770 warmup steps
 - effective batch size 1500 (15 sequences per batch x 100 gradient accumulation steps)
 - MLM masking probability 0.15
+
 **Training regime:** The model was trained with fp16 non-mixed precision, using the AdamW optimizer with default parameters.
 
 
```
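For readers who want to wire up the same hyperparameters, here is a minimal sketch using the Hugging Face `transformers` Trainer; it is not the authors' training script. The checkpoint name, output directory, and dataset are placeholders not taken from this commit, and since the Trainer's `fp16=True` flag enables *mixed* precision, the "non-mixed" fp16 described in the README is approximated here by casting the model with `.half()`.

```python
# Sketch only: reproduces the hyperparameters from the diff above, with
# placeholder checkpoint/dataset names (the commit does not specify them).
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    TrainingArguments,
    Trainer,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # placeholder checkpoint
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model = model.half()  # approximates fp16 non-mixed precision; fp16=True alone would be mixed-precision AMP

# MLM masking probability 0.15
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

args = TrainingArguments(
    output_dir="mlm-out",               # placeholder
    lr_scheduler_type="linear",         # linear learning rate schedule
    warmup_steps=10_770,                # 10,770 warmup steps
    per_device_train_batch_size=15,     # 15 sequences per batch
    gradient_accumulation_steps=100,    # x 100 -> effective batch size 1500
)

# The Trainer defaults to the AdamW optimizer with default parameters.
# trainer = Trainer(model=model, args=args, data_collator=collator, train_dataset=...)
# trainer.train()
```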