	---
	library_name: transformers
	license: apache-2.0
	base_model: answerdotai/ModernBERT-base
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	model-index:
	- name: ModernBERT-base_nli
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# ModernBERT-base_nli

	This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on an unknown dataset.
	It achieves the following results on the evaluation set after the final epoch (epoch 20, step 1440):
	- Loss: 3.4416
	- Accuracy: 0.5623
	- Precision Macro: 0.5618
	- Recall Macro: 0.5627
	- F1 Macro: 0.5621
	- F1 Weighted: 0.5617
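The macro and weighted figures above differ only in how per-class scores are averaged: macro gives every class equal weight, while weighted scales each class by its number of true examples. The card does not say how the metrics were computed, so the following is a minimal pure-Python sketch of the standard definitions, not the author's evaluation code. The function name `classification_report` and the three-label toy data are illustrative assumptions.

```python
from collections import Counter


def classification_report(y_true, y_pred):
    """Compute accuracy plus macro/weighted averages from two label lists.

    Hypothetical helper for illustration; not taken from the model card.
    """
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true examples per class
    pairs = list(zip(y_true, y_pred))
    per_class = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = (precision, recall, f1)

    n = len(y_true)
    k = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / n,
        # Macro: unweighted mean over classes.
        "precision_macro": sum(v[0] for v in per_class.values()) / k,
        "recall_macro": sum(v[1] for v in per_class.values()) / k,
        "f1_macro": sum(v[2] for v in per_class.values()) / k,
        # Weighted: mean over classes, weighted by class support.
        "f1_weighted": sum(per_class[l][2] * support[l] for l in labels) / n,
    }


# Toy example with three assumed labels (0, 1, 2) and balanced support.
report = classification_report([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0])
```

When every class has the same support, as in the toy data, macro and weighted F1 coincide. The near-identical F1 Macro (0.5621) and F1 Weighted (0.5617) reported above are consistent with a roughly balanced evaluation set, though the card does not confirm this.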

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 128
	- eval_batch_size: 128
	- seed: 42
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 256
	- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 20
	- mixed_precision_training: Native AMP
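Several of the values above are linked by simple arithmetic: the total train batch size is the per-step batch size times the gradient accumulation steps, and the warmup ratio fixes how many of the 1440 optimizer steps (the final Step value in the training results) are spent ramping up the learning rate. The sketch below reproduces that arithmetic and the usual linear-warmup plus cosine-decay shape in plain Python. It is an approximation for illustration, not the Trainer's own scheduler code, and the function names are hypothetical.

```python
import math

# Values copied from the hyperparameter list and the training results table.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 128
GRAD_ACCUM_STEPS = 2
WARMUP_RATIO = 0.1
TOTAL_STEPS = 1440  # 20 epochs x 72 optimizer steps per epoch


def effective_batch_size(batch_size, accumulation_steps):
    """Examples consumed per optimizer step."""
    return batch_size * accumulation_steps


def cosine_lr_with_warmup(step, total_steps, base_lr, warmup_ratio):
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS)
peak_lr = cosine_lr_with_warmup(144, TOTAL_STEPS, LEARNING_RATE, WARMUP_RATIO)
final_lr = cosine_lr_with_warmup(TOTAL_STEPS, TOTAL_STEPS, LEARNING_RATE, WARMUP_RATIO)
```

Under these assumptions the learning rate peaks at 5e-05 after 144 steps (the end of epoch 2) and decays to zero by step 1440.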

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision Macro | Recall Macro | F1 Macro | F1 Weighted |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|:---------------:|:------------:|:--------:|:-----------:|
	| 2.164 | 1.0 | 72 | 1.0434 | 0.4483 | 0.4472 | 0.4484 | 0.4398 | 0.4395 |
	| 2.0623 | 2.0 | 144 | 0.9968 | 0.4984 | 0.5026 | 0.4994 | 0.4983 | 0.4978 |
	| 1.8507 | 3.0 | 216 | 1.0155 | 0.5016 | 0.5522 | 0.5034 | 0.4808 | 0.4802 |
	| 1.7076 | 4.0 | 288 | 0.9344 | 0.5721 | 0.5902 | 0.5738 | 0.5572 | 0.5563 |
	| 1.4431 | 5.0 | 360 | 0.9258 | 0.5756 | 0.5770 | 0.5768 | 0.5719 | 0.5714 |
	| 1.1592 | 6.0 | 432 | 1.0425 | 0.5738 | 0.5831 | 0.5740 | 0.5693 | 0.5691 |
	| 0.6916 | 7.0 | 504 | 1.2622 | 0.5659 | 0.5711 | 0.5670 | 0.5640 | 0.5636 |
	| 0.3547 | 8.0 | 576 | 1.7560 | 0.5455 | 0.5495 | 0.5452 | 0.5460 | 0.5459 |
	| 0.2534 | 9.0 | 648 | 2.1882 | 0.5494 | 0.5620 | 0.5515 | 0.5409 | 0.5401 |
	| 0.1018 | 10.0 | 720 | 2.3462 | 0.5645 | 0.5641 | 0.5652 | 0.5633 | 0.5630 |
	| 0.0931 | 11.0 | 792 | 2.6256 | 0.5565 | 0.5619 | 0.5582 | 0.5483 | 0.5475 |
	| 0.0504 | 12.0 | 864 | 2.7252 | 0.5552 | 0.5570 | 0.5557 | 0.5555 | 0.5551 |
	| 0.0379 | 13.0 | 936 | 2.9577 | 0.5517 | 0.5518 | 0.5521 | 0.5518 | 0.5515 |
	| 0.0111 | 14.0 | 1008 | 3.2048 | 0.5614 | 0.5621 | 0.5621 | 0.5609 | 0.5604 |
	| 0.0018 | 15.0 | 1080 | 3.3005 | 0.5610 | 0.5621 | 0.5612 | 0.5616 | 0.5613 |
	| 0.0003 | 16.0 | 1152 | 3.3958 | 0.5610 | 0.5602 | 0.5615 | 0.5605 | 0.5601 |
	| 0.0001 | 17.0 | 1224 | 3.4259 | 0.5623 | 0.5617 | 0.5628 | 0.5620 | 0.5617 |
	| 0.0001 | 18.0 | 1296 | 3.4368 | 0.5619 | 0.5613 | 0.5623 | 0.5616 | 0.5612 |
	| 0.0001 | 19.0 | 1368 | 3.4412 | 0.5619 | 0.5614 | 0.5623 | 0.5616 | 0.5613 |
	| 0.0001 | 20.0 | 1440 | 3.4416 | 0.5623 | 0.5618 | 0.5627 | 0.5621 | 0.5617 |
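The table shows training loss falling toward zero while validation loss climbs from under 1.0 to over 3.4, so the final checkpoint is not necessarily the strongest one. As a small sketch of how the best epoch could be picked from these logs, the snippet below copies the Epoch, Validation Loss, and Accuracy columns from the table and selects the minimum-loss and maximum-accuracy rows. The `best_epoch` helper is hypothetical, and the card does not state which checkpoint was actually saved.

```python
# (epoch, validation_loss, accuracy) copied from the training results table.
HISTORY = [
    (1, 1.0434, 0.4483), (2, 0.9968, 0.4984), (3, 1.0155, 0.5016),
    (4, 0.9344, 0.5721), (5, 0.9258, 0.5756), (6, 1.0425, 0.5738),
    (7, 1.2622, 0.5659), (8, 1.7560, 0.5455), (9, 2.1882, 0.5494),
    (10, 2.3462, 0.5645), (11, 2.6256, 0.5565), (12, 2.7252, 0.5552),
    (13, 2.9577, 0.5517), (14, 3.2048, 0.5614), (15, 3.3005, 0.5610),
    (16, 3.3958, 0.5610), (17, 3.4259, 0.5623), (18, 3.4368, 0.5619),
    (19, 3.4412, 0.5619), (20, 3.4416, 0.5623),
]


def best_epoch(history, column, higher_is_better):
    """Return the row whose value in `column` is best (ties go to the earliest epoch)."""
    sign = -1.0 if higher_is_better else 1.0
    return min(history, key=lambda row: sign * row[column])


best_by_loss = best_epoch(HISTORY, column=1, higher_is_better=False)
best_by_accuracy = best_epoch(HISTORY, column=2, higher_is_better=True)
final = HISTORY[-1]

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"highest accuracy:       epoch {best_by_accuracy[0]} ({best_by_accuracy[2]})")
print(f"final epoch:            epoch {final[0]} (loss {final[1]}, accuracy {final[2]})")
```

Both criteria point to epoch 5 (validation loss 0.9258, accuracy 0.5756), compared with 3.4416 and 0.5623 at epoch 20.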


	### Framework versions

	- Transformers 4.55.0
	- PyTorch 2.7.0+cu126
	- Datasets 4.0.0
	- Tokenizers 0.21.4