Classifies English-language news into four categories: World, Sports, Business, and Science/Technology.
This model is a fine-tuned version of distilbert/distilbert-base-uncased on the fancyzhx/ag_news dataset, trained for 3 epochs with inputs truncated to 128 tokens.
It achieves the following results on the evaluation set:
- Loss: 0.1759
- Accuracy: 0.9414
Made as a homework project for the 4th lesson of fast.ai's Practical Deep Learning for Coders course. A Hugging Face Spaces demo is available here.
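Example usage (a minimal sketch; it assumes the model is published on the Hub under the id `kitrofimov/news-clf` and that the model config carries an id2label mapping for the AG News categories):

```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hub (model id taken from this card).
clf = pipeline("text-classification", model="kitrofimov/news-clf")

# The returned label string depends on the id2label mapping stored in the
# model config (AG News order: World, Sports, Business, Sci/Tech).
print(clf("Stocks rally as the central bank signals a pause in rate hikes."))
```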
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `Trainer` sketch reproducing them follows the list):
- learning_rate: 1e-05
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3
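The hyperparameters above map roughly to the following setup. This is a reconstruction, not the author's exact script; the dataset column and split names are assumptions based on the standard AG News layout:

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert/distilbert-base-uncased", num_labels=4)

# AG News ("text" / "label" columns), truncated to 128 tokens as described above.
dataset = load_dataset("fancyzhx/ag_news")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True)

args = TrainingArguments(
    output_dir="news-clf",
    learning_rate=1e-5,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",
    seed=42,
    eval_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    processing_class=tokenizer,
)
trainer.train()
```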
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 0.2009 | 1.0 | 938 | 0.1998 | 0.9307 |
| 0.1773 | 2.0 | 1876 | 0.1804 | 0.9375 |
| 0.1418 | 3.0 | 2814 | 0.1759 | 0.9414 |
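The accuracy column corresponds to a standard `compute_metrics` hook passed to the `Trainer`. A sketch using the `evaluate` library is shown below; the author's exact metric code is not included in this card:

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # eval_pred is a (logits, labels) pair produced during Trainer evaluation.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```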
### Framework versions
- Transformers 4.56.1
- PyTorch 2.8.0+cu126
- Datasets 4.1.0
- Tokenizers 0.22.0
Model tree for kitrofimov/news-clf
- Base model: distilbert/distilbert-base-uncased