tyavika/QAModel_Distilbert_b16_20_3e5

This model is a fine-tuned version of distilbert-base-uncased on an unknown dataset. It achieves the following results at the final training epoch:

  • Train Loss: 0.4271
  • Train End Logits Accuracy: 0.8747
  • Train Start Logits Accuracy: 0.8628
  • Validation Loss: 1.7887
  • Validation End Logits Accuracy: 0.6175
  • Validation Start Logits Accuracy: 0.5786
  • Epoch: 3 (final epoch; epochs are zero-indexed, so training ran for 4 epochs)

Model description

More information needed

Intended uses & limitations

More information needed
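
While detailed usage guidance is still to be added, the checkpoint can presumably be loaded with the standard Transformers question-answering pipeline. A minimal, untested sketch (the question and context strings are placeholders, not from this card):

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hub; framework="tf" because
# this checkpoint was trained with TensorFlow/Keras.
qa = pipeline(
    "question-answering",
    model="tyavika/QAModel_Distilbert_b16_20_3e5",
    framework="tf",
)

# Placeholder inputs, purely for illustration.
result = qa(
    question="What base model was fine-tuned?",
    context="This model is a fine-tuned version of distilbert-base-uncased.",
)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
```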

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: a dynamic LossScaleOptimizer (initial_scale: 32768.0, dynamic_growth_steps: 2000) wrapping Adam with learning_rate: 3e-05, beta_1: 0.9, beta_2: 0.999, epsilon: 1e-07, amsgrad: False, weight_decay: None, clipnorm/global_clipnorm/clipvalue: None, use_ema: False (ema_momentum: 0.99, ema_overwrite_frequency: None), jit_compile: True, is_legacy_optimizer: False; see the sketch after this list
  • training_precision: mixed_float16
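
Since the training script itself is not included in this card, the following is only a minimal sketch of how this serialized optimizer configuration might be reconstructed in TensorFlow/Keras 2.12; treat it as illustrative rather than the exact code used:

```python
import tensorflow as tf
from tensorflow.keras import mixed_precision

# Match training_precision: mixed_float16.
mixed_precision.set_global_policy("mixed_float16")

# Inner Adam optimizer with the values from the config above.
adam = tf.keras.optimizers.Adam(
    learning_rate=3e-5,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-7,
    amsgrad=False,
)

# Dynamic loss-scaling wrapper for mixed-precision training,
# matching initial_scale and dynamic_growth_steps above.
optimizer = mixed_precision.LossScaleOptimizer(
    adam,
    dynamic=True,
    initial_scale=32768.0,
    dynamic_growth_steps=2000,
)
```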

Training results

| Train Loss | Train End Logits Accuracy | Train Start Logits Accuracy | Validation Loss | Validation End Logits Accuracy | Validation Start Logits Accuracy | Epoch |
|------------|---------------------------|-----------------------------|-----------------|--------------------------------|----------------------------------|-------|
| 2.2867     | 0.4392                    | 0.4104                      | 1.5112          | 0.6056                         | 0.5722                           | 0     |
| 1.1957     | 0.6764                    | 0.6470                      | 1.4029          | 0.6343                         | 0.5930                           | 1     |
| 0.7345     | 0.7904                    | 0.7696                      | 1.5582          | 0.6177                         | 0.5801                           | 2     |
| 0.4271     | 0.8747                    | 0.8628                      | 1.7887          | 0.6175                         | 0.5786                           | 3     |

Validation loss is lowest at epoch 1 and rises in later epochs while the training metrics keep improving, which suggests overfitting beyond the first two epochs.

Framework versions

  • Transformers 4.30.2
  • TensorFlow 2.12.0
  • Datasets 2.13.1
  • Tokenizers 0.13.3