Legal_GQA_BERT7

This model was trained from scratch; the training dataset is not specified in the card. It achieves the following results on the evaluation set:

  • Loss: 4.7785

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 16
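With a linear scheduler and no warmup steps listed, the learning rate decays from 2e-05 to 0 over the course of training. The results table below logs 5 optimizer steps per epoch for 16 epochs, i.e. 80 steps total (which, at batch size 32, implies a training set of roughly 160 examples). A minimal sketch of that schedule, assuming zero warmup:

```python
def linear_lr(step, total_steps=80, base_lr=2e-05):
    """Learning rate after `step` completed optimizer steps under a
    linear decay from base_lr to 0 with no warmup (assumed)."""
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / total_steps)

print(linear_lr(0))    # → 2e-05 (start of training)
print(linear_lr(40))   # → 1e-05 (halfway through)
print(linear_lr(80))   # → 0.0  (end of training)
```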

Training results

Training Loss  Epoch  Step  Validation Loss
No log         1.0    5     3.9923
No log         2.0    10    3.5716
No log         3.0    15    3.4393
No log         4.0    20    3.4841
No log         5.0    25    3.5053
No log         6.0    30    3.6398
No log         7.0    35    3.9208
No log         8.0    40    4.0914
No log         9.0    45    4.2757
No log         10.0   50    4.3265
No log         11.0   55    4.4341
No log         12.0   60    4.6095
No log         13.0   65    4.7127
No log         14.0   70    4.7583
No log         15.0   75    4.7729
No log         16.0   80    4.7785
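Note that the validation loss bottoms out at epoch 3 and climbs steadily afterward, a classic overfitting pattern: the reported final loss of 4.7785 is well above the best checkpoint's 3.4393. A minimal sketch of picking the best epoch from the logged values above:

```python
# Validation losses per epoch, copied from the results table above.
val_loss = {
    1: 3.9923, 2: 3.5716, 3: 3.4393, 4: 3.4841,
    5: 3.5053, 6: 3.6398, 7: 3.9208, 8: 4.0914,
    9: 4.2757, 10: 4.3265, 11: 4.4341, 12: 4.6095,
    13: 4.7127, 14: 4.7583, 15: 4.7729, 16: 4.7785,
}

# Epoch with the lowest validation loss.
best_epoch = min(val_loss, key=val_loss.get)
print(best_epoch, val_loss[best_epoch])  # → 3 3.4393
```

In a transformers Trainer run, setting `load_best_model_at_end=True` (with an `eval_loss` metric) or adding an `EarlyStoppingCallback` would keep the epoch-3 checkpoint instead of the final one.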

Framework versions

  • Transformers 4.36.2
  • Pytorch 2.1.2+cu121
  • Datasets 2.14.7
  • Tokenizers 0.15.0
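To reproduce the training environment, the versions above can be pinned directly (a sketch, assuming a fresh virtual environment; the `+cu121` PyTorch build comes from PyTorch's CUDA wheel index, while a plain `torch==2.1.2` installs the default build):

```shell
pip install "transformers==4.36.2" "torch==2.1.2" "datasets==2.14.7" "tokenizers==0.15.0"
```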