shepherdgroup/NuTCRacker


TruptiG committed on May 20, 2025

Commit 9fd9900 · verified · 1 Parent(s): d83c08e

Update README.md

Model Card: Training hyperparameters

Files changed (1)

README.md +19 -0


@@ -88,6 +88,25 @@ Use the code below to get started with the model.
 ## Training Details
 ### Training Data
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

 ## Training Details
+### Training hyperparameters
+```
+vocab_size=len(tokenizer),
+num_attention_heads=8,
+num_hidden_layers=16,
+hidden_size=512,
+intermediate_size=2048,
+hidden_act='gelu',
+hidden_dropout_prob=0.15,
+relative_attention=True,
+pos_att_type='c2p|p2c',
+max_relative_positions=-1,
+position_biased_input=False,
+attention_probs_dropout_prob=0.15,
+initializer_range=0.02,
+layer_norm_eps=1e-7,
+```
 ### Training Data
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
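
The hyperparameters added in this commit can be sanity-checked with a few lines of plain Python. The sketch below is illustrative only: the names `hparams`, `check_hparams`, and `derived` are hypothetical and do not come from the repository, and `vocab_size` is left out because it depends on a tokenizer the diff does not show. The parameter names resemble those accepted by the DeBERTa configuration classes in Hugging Face `transformers`, but the diff does not name the model class, so treat that as an assumption.

```python
# Values copied from the "Training hyperparameters" block in the diff.
# vocab_size is omitted: it is len(tokenizer), and the tokenizer is not shown.
hparams = {
    "num_attention_heads": 8,
    "num_hidden_layers": 16,
    "hidden_size": 512,
    "intermediate_size": 2048,
    "hidden_act": "gelu",
    "hidden_dropout_prob": 0.15,
    "relative_attention": True,
    "pos_att_type": "c2p|p2c",
    "max_relative_positions": -1,
    "position_biased_input": False,
    "attention_probs_dropout_prob": 0.15,
    "initializer_range": 0.02,
    "layer_norm_eps": 1e-7,
}


def check_hparams(hp):
    """Run basic consistency checks and return a few derived quantities."""
    if hp["hidden_size"] % hp["num_attention_heads"] != 0:
        raise ValueError("hidden_size must be divisible by num_attention_heads")
    for key in ("hidden_dropout_prob", "attention_probs_dropout_prob"):
        if not 0.0 <= hp[key] < 1.0:
            raise ValueError(f"{key} must be in [0, 1)")
    return {
        # Width of each attention head.
        "head_dim": hp["hidden_size"] // hp["num_attention_heads"],
        # Feed-forward expansion factor relative to the hidden size.
        "ffn_ratio": hp["intermediate_size"] / hp["hidden_size"],
        # The '|'-separated string split into its individual attention terms.
        "pos_att_terms": hp["pos_att_type"].split("|"),
    }


derived = check_hparams(hparams)
print(derived)
```

If the model is in fact built on a `transformers` DeBERTa config (an assumption), a dictionary like this could be unpacked into the config constructor together with `vocab_size=len(tokenizer)`.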