GliteTech/DisamBertCrossEncoder-base

	---
	library_name: transformers
	language:
	- en
	license: apache-2.0
	base_model: answerdotai/ModernBERT-base
	tags:
	- generated_from_trainer
	metrics:
	- precision
	- recall
	- f1
	- accuracy
	- matthews_correlation
	model-index:
	- name: DisamBertCrossEncoder-base
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# DisamBertCrossEncoder-base

	This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base). The Trainer did not record the name of the training dataset.
	After the final epoch (epoch 10, step 90500), it achieves the following results on the evaluation set:
	- Loss: 0.3160
	- Precision: 0.6783
	- Recall: 0.5978
	- F1: 0.6355
	- Accuracy: 0.9378
	- Matthews Correlation: 0.6031
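
The reported F1 can be cross-checked against the reported precision and recall. The following is a minimal sketch, assuming F1 here is the usual harmonic mean of the two for the positive class (the card does not state the averaging mode); `f1_score` is a helper defined for illustration only.

```python
# Precision and recall as reported on the evaluation set (final epoch).
precision = 0.6783
recall = 0.5978


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if p + r == 0:
        return 0.0
    return 2 * p * r / (p + r)


f1 = f1_score(precision, recall)
print(f"F1 = {f1:.4f}")  # F1 = 0.6355
```

Accuracy (0.9378) is much higher than F1, which is typical when the positive class is rare. The step-0 row of the training results table (precision 0.0907 at recall 1.0) suggests that roughly 9% of evaluation examples are positive.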

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 1e-05
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 42
	- gradient_accumulation_steps: 5
	- total_train_batch_size: 320
	- optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
	- lr_scheduler_type: cosine
	- num_epochs: 10
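
Two of the values above follow from the others. The sketch below shows how the total train batch size relates to the per-step batch size and gradient accumulation, and what a cosine schedule implies for the learning rate over the run. It rests on two assumptions that the card does not confirm: training used a single device, and there were no warmup steps (none are listed). `cosine_lr` is an illustrative helper, not the Trainer's own scheduler.

```python
import math

# Values from the hyperparameter list above.
train_batch_size = 64
gradient_accumulation_steps = 5
learning_rate = 1e-05
total_steps = 90500  # final step in the training results table

# Assumption: a single device, so no extra multiplier for data parallelism.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 320


def cosine_lr(step: int,
              base_lr: float = learning_rate,
              num_training_steps: int = total_steps,
              num_warmup_steps: int = 0) -> float:
    """Half-cosine decay from base_lr to 0, with optional linear warmup.

    Assumption: zero warmup steps, since the card lists none.
    """
    if step < num_warmup_steps:
        return base_lr * step / max(1, num_warmup_steps)
    progress = (step - num_warmup_steps) / max(1, num_training_steps - num_warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Learning rate at the start, the midpoint (end of epoch 5), and the end of training.
for step in (0, 45250, 90500):
    print(step, cosine_lr(step))
```

Under these assumptions the learning rate starts at 1e-05, is about half that at the end of epoch 5, and reaches zero at the final step.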

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy | Matthews Correlation |
	|:-------------:|:-----:|:-----:|:---------------:|:---------:|:------:|:------:|:--------:|:--------------------:|
	| No log | 0 | 0 | 1123.2456 | 0.0907 | 1.0 | 0.1663 | 0.0909 | 0.0045 |
	| 0.1943 | 1.0 | 9050 | 0.1832 | 0.7346 | 0.2615 | 0.3857 | 0.9245 | 0.4096 |
	| 0.1500 | 2.0 | 18100 | 0.1551 | 0.7019 | 0.4967 | 0.5817 | 0.9352 | 0.5574 |
	| 0.1242 | 3.0 | 27150 | 0.1481 | 0.7381 | 0.5451 | 0.6271 | 0.9412 | 0.6040 |
	| 0.1017 | 4.0 | 36200 | 0.1482 | 0.7413 | 0.5604 | 0.6383 | 0.9424 | 0.6147 |
	| 0.0774 | 5.0 | 45250 | 0.1564 | 0.7179 | 0.6154 | 0.6627 | 0.9432 | 0.6342 |
	| 0.0610 | 6.0 | 54300 | 0.1859 | 0.7579 | 0.5297 | 0.6235 | 0.9420 | 0.6044 |
	| 0.0434 | 7.0 | 63350 | 0.2016 | 0.6754 | 0.6264 | 0.6499 | 0.9388 | 0.6170 |
	| 0.0298 | 8.0 | 72400 | 0.2480 | 0.6520 | 0.6505 | 0.6513 | 0.9368 | 0.6165 |
	| 0.0216 | 9.0 | 81450 | 0.2961 | 0.6819 | 0.5890 | 0.6321 | 0.9378 | 0.6002 |
	| 0.0174 | 10.0 | 90500 | 0.3160 | 0.6783 | 0.5978 | 0.6355 | 0.9378 | 0.6031 |
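
The headline metrics in this card come from the last epoch, which is not the best epoch by any of the tracked metrics. A short sketch that picks out the best epochs from the table above (values copied by hand; the tuple layout is just for illustration):

```python
# (epoch, validation_loss, f1, matthews_correlation) copied from the table above.
rows = [
    (1, 0.1832, 0.3857, 0.4096),
    (2, 0.1551, 0.5817, 0.5574),
    (3, 0.1481, 0.6271, 0.6040),
    (4, 0.1482, 0.6383, 0.6147),
    (5, 0.1564, 0.6627, 0.6342),
    (6, 0.1859, 0.6235, 0.6044),
    (7, 0.2016, 0.6499, 0.6170),
    (8, 0.2480, 0.6513, 0.6165),
    (9, 0.2961, 0.6321, 0.6002),
    (10, 0.3160, 0.6355, 0.6031),
]

best_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
best_f1 = max(rows, key=lambda r: r[2])    # highest F1
best_mcc = max(rows, key=lambda r: r[3])   # highest Matthews correlation

print(f"lowest validation loss: epoch {best_loss[0]} ({best_loss[1]})")  # epoch 3 (0.1481)
print(f"highest F1: epoch {best_f1[0]} ({best_f1[2]})")                  # epoch 5 (0.6627)
print(f"highest MCC: epoch {best_mcc[0]} ({best_mcc[3]})")               # epoch 5 (0.6342)
```

Training loss keeps falling while validation loss rises after epochs 3 to 4, a pattern consistent with overfitting in the later epochs. The card does not say whether the published weights are the final checkpoint or an earlier one.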


	### Framework versions

	- Transformers 5.3.0
	- Pytorch 2.10.0+cu128
	- Datasets 4.5.0
	- Tokenizers 0.22.2
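
To compare a local environment against the versions listed above, something like the following may help. It is a sketch using only the standard library. The distribution names are assumptions (for example, "Pytorch" is assumed to be the `torch` distribution), and `compare_versions` is a hypothetical helper.

```python
from importlib import metadata

# Versions listed in the card; the distribution names are assumptions.
EXPECTED = {
    "transformers": "5.3.0",
    "torch": "2.10.0+cu128",
    "datasets": "4.5.0",
    "tokenizers": "0.22.2",
}


def compare_versions(expected: dict) -> dict:
    """Map each package name to (expected, installed); installed is None if absent."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(EXPECTED).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: expected {wanted}, installed {installed} ({status})")
```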