metadata
library_name: transformers
license: apache-2.0
base_model: answerdotai/ModernBERT-large
tags:
  - generated_from_trainer
model-index:
  - name: binary_paragraph
    results: []

binary_paragraph

This model is a fine-tuned version of answerdotai/ModernBERT-large; the training dataset is not documented. After the final epoch (epoch 2, step 142), it achieves the following results on the evaluation set:

  • Loss: 0.2290
  • Accuracy: 0.8990
  • Classification report (values rounded to four decimal places):

| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| 0 | 0.9049 | 0.9862 | 0.9438 | 1592 |
| 1 | 0.8120 | 0.3654 | 0.5040 | 260 |
| macro avg | 0.8584 | 0.6758 | 0.7239 | 1852 |
| weighted avg | 0.8919 | 0.8990 | 0.8820 | 1852 |
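
The per-class figures determine the underlying confusion matrix, which makes the weak recall on class 1 easier to see. The sketch below recovers the counts from the reported precision, recall, and support, rounded to four decimals. Treating class 1 as the positive class is an assumption, since the card does not document what the labels mean, and `confusion_counts` is a hypothetical helper, not part of any library.

```python
# Reported evaluation metrics, rounded to four decimals.
# Assumption: class "1" is the positive class (the card does not say).
positive = {"precision": 0.8120, "recall": 0.3654, "support": 260}
negative_support = 1592


def confusion_counts(pos, neg_support):
    """Recover TP/FP/FN/TN for a binary task from precision, recall, support."""
    tp = round(pos["recall"] * pos["support"])   # recall = TP / (TP + FN)
    fn = pos["support"] - tp
    fp = round(tp / pos["precision"]) - tp       # precision = TP / (TP + FP)
    tn = neg_support - fp
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


counts = confusion_counts(positive, negative_support)
accuracy = (counts["tp"] + counts["tn"]) / (positive["support"] + negative_support)

print(counts)
print(f"accuracy = {accuracy:.4f}")
```

Under these assumptions the model finds 95 of the 260 class-1 examples and misses 165, while mislabeling 22 of the 1592 class-0 examples. The recomputed accuracy agrees with the reported 0.8990.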

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 22
  • eval_batch_size: 22
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 88 (22 per device × 4 devices)
  • total_eval_batch_size: 88 (22 per device × 4 devices)
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 2
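
The hyperparameters above fix the effective batch size and the shape of the learning-rate schedule. The following is a minimal sketch of that arithmetic, not the training script itself. It assumes no gradient accumulation and no warmup (neither is listed in the card), and it takes the 71 optimizer steps per epoch from the training results; `linear_lr` is a hypothetical helper written for illustration.

```python
# Values copied from the hyperparameter list and the training results.
per_device_batch = 22
num_devices = 4
learning_rate = 5e-06
num_epochs = 2
steps_per_epoch = 71  # step 71 closes epoch 1, step 142 closes epoch 2

# Assumption: no gradient accumulation, so one optimizer step sees 88 examples.
total_batch = per_device_batch * num_devices
total_steps = steps_per_epoch * num_epochs

# Assumption: the last partial batch is kept, so 71 steps bound the dataset size.
min_examples = (steps_per_epoch - 1) * total_batch + 1
max_examples = steps_per_epoch * total_batch


def linear_lr(step, base_lr=learning_rate, total=total_steps, warmup=0):
    """Linear warmup followed by linear decay to zero (warmup=0 is assumed)."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(total_batch, total_steps)
print(min_examples, max_examples)
for step in (0, steps_per_epoch, total_steps):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 5e-06, falls to 2.5e-06 by the end of epoch 1, and reaches 0 at step 142. The step count also suggests a training set of between 6161 and 6248 examples, although the card does not state its size.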

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| No log | 1.0 | 71 | 0.2510 | 0.8790 |
| No log | 2.0 | 142 | 0.2290 | 0.8990 |

Classification report per epoch (values rounded to four decimal places):

| Epoch | Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|---|
| 1.0 | 0 | 0.8783 | 0.9975 | 0.9341 | 1592 |
| 1.0 | 1 | 0.9091 | 0.1538 | 0.2632 | 260 |
| 1.0 | macro avg | 0.8937 | 0.5757 | 0.5986 | 1852 |
| 1.0 | weighted avg | 0.8826 | 0.8790 | 0.8399 | 1852 |
| 2.0 | 0 | 0.9049 | 0.9862 | 0.9438 | 1592 |
| 2.0 | 1 | 0.8120 | 0.3654 | 0.5040 | 260 |
| 2.0 | macro avg | 0.8584 | 0.6758 | 0.7239 | 1852 |
| 2.0 | weighted avg | 0.8919 | 0.8990 | 0.8820 | 1852 |
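
The gap between the macro and weighted averages in these reports comes from class imbalance: class 0 has roughly six times the support of class 1. The sketch below recomputes both averages of the F1-score from the per-class values, rounded to four decimals. `macro_and_weighted` is a hypothetical helper for illustration only.

```python
# Per-class (F1-score, support) for each epoch, rounded to four decimals.
f1_by_epoch = {
    1: {"0": (0.9341, 1592), "1": (0.2632, 260)},
    2: {"0": (0.9438, 1592), "1": (0.5040, 260)},
}


def macro_and_weighted(per_class):
    """Return (macro average, support-weighted average) of per-class scores."""
    scores = [score for score, _ in per_class.values()]
    total_support = sum(support for _, support in per_class.values())
    macro = sum(scores) / len(scores)
    weighted = sum(score * support for score, support in per_class.values()) / total_support
    return macro, weighted


averages = {epoch: macro_and_weighted(per_class) for epoch, per_class in f1_by_epoch.items()}
for epoch, (macro, weighted) in averages.items():
    print(f"epoch {epoch}: macro F1 = {macro:.4f}, weighted F1 = {weighted:.4f}")
```

The macro average weights both classes equally, so it reflects the low class-1 F1-score directly (about 0.60 after epoch 1 and 0.72 after epoch 2). The weighted average is dominated by class 0 and stays near 0.84 and 0.88.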

Framework versions

  • Transformers 4.53.1
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1