metadata
library_name: transformers
license: apache-2.0
base_model: google-bert/bert-base-uncased
tags:
  - generated_from_trainer
model-index:
  - name: bert-BBc-classifier
    results: []

BERT News Category Classifier

This model is a fine-tuned version of bert-base-uncased that classifies news articles into five categories: Business, Tech, Politics, Sports, and Entertainment.

Model Description

  • Architecture: bert-base-uncased with the base layers frozen during fine-tuning to reduce training cost.
  • Task: Multi-class text classification.
  • Performance: Macro-F1 of 0.96 on the evaluation set.
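For readers unfamiliar with the metric reported above, the following is a minimal sketch of how macro-F1 is computed: one F1 score per class, then an unweighted mean across classes. The label strings follow the five categories named in this card; the `y_true` and `y_pred` lists are made-up toy data, not outputs of this model, and the function name is illustrative.

```python
# Macro-F1: per-class F1, averaged with equal weight per class.
LABELS = ["Business", "Tech", "Politics", "Sports", "Entertainment"]


def macro_f1(y_true, y_pred, labels):
    """Return the unweighted mean of per-class F1 scores."""
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true and no predicted examples scores 0 here.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy data (hypothetical): one Entertainment article is misread as Tech.
y_true = ["Business", "Tech", "Politics", "Sports", "Entertainment", "Tech"]
y_pred = ["Business", "Tech", "Politics", "Sports", "Tech", "Tech"]

score = macro_f1(y_true, y_pred, LABELS)
print(f"macro-F1 = {score:.2f}")
```

Because every class carries the same weight, a single poorly handled category lowers macro-F1 noticeably even when overall accuracy stays high.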

Training and Evaluation Data

  • Dataset: BBC News Dataset.
  • Preprocessing: Text fields were cleaned, then tokenized with the standard BERT WordPiece tokenizer.
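To illustrate the tokenization step, here is a simplified sketch of WordPiece's greedy longest-match-first splitting. It is not the tokenizer shipped with the model: `TOY_VOCAB` is a hypothetical vocabulary invented for this example, and the sketch omits the punctuation splitting and special tokens that the real BERT tokenizer handles.

```python
# Simplified WordPiece: repeatedly take the longest vocabulary entry that
# matches at the current position; non-initial pieces carry a "##" prefix.
TOY_VOCAB = {"[UNK]", "market", "##s", "rall", "##ied", "share"}  # hypothetical


def wordpiece_word(word, vocab, unk_token="[UNK]"):
    """Split one lowercase word into WordPiece tokens."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No piece matches: the whole word maps to the unknown token.
            return [unk_token]
        tokens.append(piece)
        start = end
    return tokens


def tokenize(text, vocab):
    """Lowercase (as an uncased model does), split on whitespace, tokenize."""
    return [tok for word in text.lower().split()
            for tok in wordpiece_word(word, vocab)]


tokens = tokenize("Markets rallied", TOY_VOCAB)
print(tokens)
```

In practice the pretrained tokenizer would be loaded through the transformers library rather than reimplemented; the sketch only shows why rare words end up as several sub-word pieces.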

Intended Uses & Limitations

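This model is intended for production news classification pipelines that sort articles into the five categories listed above. Freezing the base layers keeps training lightweight; model size and inference cost remain those of a standard BERT-base model.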
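The layer freezing mentioned in this card can be sketched as follows. This is an assumption about the setup, not the author's training script: it freezes every parameter of the BERT encoder and leaves only the classification head trainable, whereas the card does not say exactly which layers were frozen. A tiny randomly initialised configuration (all sizes hypothetical) stands in for bert-base-uncased so the example runs without downloading weights.

```python
from transformers import BertConfig, BertForSequenceClassification


def freeze_base_layers(model):
    """Disable gradients for the encoder; the classifier head stays trainable."""
    for param in model.base_model.parameters():
        param.requires_grad = False
    return model


# Hypothetical tiny config used only to keep the sketch fast and offline.
tiny_config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=5,  # five news categories
)

model = freeze_base_layers(BertForSequenceClassification(tiny_config))

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
frozen_count = sum(1 for p in model.parameters() if not p.requires_grad)
print("trainable:", trainable)
print("frozen tensors:", frozen_count)
```

Frozen parameters are still evaluated in the forward pass, so this reduces the cost of each training step and the optimizer state, not the cost of inference.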

bert-BBc-classifier

This model is a fine-tuned version of google-bert/bert-base-uncased on the BBC News dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0873

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 5
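As a rough illustration of what the linear scheduler does with these hyperparameters, the sketch below computes the learning rate at each epoch boundary. It assumes zero warmup steps (none are listed above) and takes the total of 1065 optimizer steps from the training results; the function name is illustrative.

```python
def linear_lr(step, base_lr=2e-4, total_steps=1065, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start and at the end of each of the 5 epochs.
epoch_boundaries = [0, 213, 426, 639, 852, 1065]
schedule = [linear_lr(step) for step in epoch_boundaries]

for step, lr in zip(epoch_boundaries, schedule):
    print(f"step {step:4d}: lr = {lr:.2e}")
```

Under these assumptions the rate falls by one fifth of its initial value per epoch and reaches zero at the final step.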

Training results

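| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 0.0635        | 1.0   | 213  | 0.1708          |
| 0.0695        | 2.0   | 426  | 0.1116          |
| 0.0677        | 3.0   | 639  | 0.0842          |
| 0.0525        | 4.0   | 852  | 0.0882          |
| 0.0511        | 5.0   | 1065 | 0.0873          |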
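The training results above can be read programmatically. The short sketch below finds the epoch with the lowest validation loss and estimates the training-set size implied by the step counts. The size estimate rests on assumptions not stated in this card: a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped.

```python
# (training_loss, epoch, step, validation_loss), copied from the results above.
results = [
    (0.0635, 1, 213, 0.1708),
    (0.0695, 2, 426, 0.1116),
    (0.0677, 3, 639, 0.0842),
    (0.0525, 4, 852, 0.0882),
    (0.0511, 5, 1065, 0.0873),
]

# Epoch with the lowest validation loss.
best = min(results, key=lambda row: row[3])
best_epoch, best_val_loss = best[1], best[3]

# Steps per epoch and the training-set size range they imply.
train_batch_size = 8
steps_per_epoch = results[0][2]
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
max_examples = steps_per_epoch * train_batch_size

print(f"best epoch: {best_epoch} (validation loss {best_val_loss})")
print(f"steps per epoch: {steps_per_epoch}")
print(f"implied training examples: {min_examples} to {max_examples}")
```

The lowest validation loss occurs at epoch 3, while the reported evaluation loss of 0.0873 corresponds to the final epoch.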

Framework versions

  • Transformers 4.57.3
  • PyTorch 2.9.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.1