# BERT Fine-Tuned on News Articles Categorization

A fine-tuned BERT model trained on the News Articles Categorization dataset.
## Model Details

### Description
This model is based on the BERT base (uncased) architecture and has been fine-tuned on the News Articles Categorization dataset.
- Developed by: Cesar Gonzalez-Gutierrez
- Funded by: ERC
- Architecture: BERT-base
- Base model: BERT base model (uncased)
- Language: English
- License: Apache 2.0
### Seed Initializations

Alternative models trained with different initialization seeds are available on dedicated branches:
| Random Seed | Branch |
|---|---|
| 120 | `seed-120` |
| 220 | `seed-220` |
| 320 | `seed-320` |
| 420 | `seed-420` |
| 520 | `seed-520` |
To load a model from a specific branch, pass the `revision` parameter:

```python
from transformers import AutoModelForSequenceClassification

# Replace "<model>" with this repository's id.
model = AutoModelForSequenceClassification.from_pretrained(
    "<model>", revision="seed-120"
)
```
### Sources
[Information pending]
## Training Details

Fine-tuning was performed end-to-end using a grid search over key hyperparameters, selecting the configuration with the lowest validation loss on the development set. The final model was then retrained on the full training set with that configuration.
### Training Data
The original News Articles Categorization dataset was randomly split into 60%, 20%, and 20% partitions for training, development, and testing, respectively.
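The 60/20/20 partition can be sketched with two chained splits; this is an illustrative reconstruction, and the placeholder data and random seed below are assumptions (the actual split seed is not published):

```python
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the News Articles Categorization examples.
examples = list(range(100))
labels = [i % 2 for i in examples]

# First carve off 40% as a pool, then halve the pool into dev and test.
train_x, pool_x, train_y, pool_y = train_test_split(
    examples, labels, test_size=0.4, random_state=0)
dev_x, test_x, dev_y, test_y = train_test_split(
    pool_x, pool_y, test_size=0.5, random_state=0)

print(len(train_x), len(dev_x), len(test_x))  # 60 20 20
```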
### Training Hyperparameters

- Epochs: {1, 2, 3, 4}
- Batch size: {16, 32}
- Learning rate: {5e-5, 3e-5, 2e-5}
- Validation metric: loss
- Precision: fp16
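The grid above amounts to 4 × 2 × 3 = 24 candidate configurations. A minimal sketch of enumerating them (the variable names are illustrative; the actual training loop is not published):

```python
from itertools import product

# Hyperparameter values from the grid described above.
epochs = [1, 2, 3, 4]
batch_sizes = [16, 32]
learning_rates = [5e-5, 3e-5, 2e-5]

# Each tuple is one candidate configuration to train and
# evaluate by validation loss on the development set.
grid = list(product(epochs, batch_sizes, learning_rates))
print(len(grid))  # 24 configurations
```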
## Uses
This model can be used for classification tasks aligned with the structure and intent of the News Articles Categorization dataset.
For broader guidance, refer to the Intended Uses & Limitations section of the BERT base model card.
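A minimal inference sketch for such classification tasks; the helper names, example text, and truncation setting are illustrative assumptions, not part of the released code:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

def top_label(logits: torch.Tensor, id2label: dict) -> str:
    """Return the label with the highest probability for one example."""
    probs = torch.softmax(logits, dim=-1)
    return id2label[int(probs.argmax(dim=-1))]

def classify(text: str, model_name: str) -> str:
    """Tokenize `text`, run the classifier, and return the top label."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    inputs = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return top_label(logits[0], model.config.id2label)

# Usage (downloads weights on first call):
# classify("Stocks rallied after the earnings report.",
#          "cglez/bert-base-uncased-ft-news_articles")
```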
## Bias, Risks, and Limitations
This model inherits the potential risks and limitations of its base model. For more details, refer to the Limitations and bias section of the original model documentation.
Additionally, it may reflect or amplify patterns and biases present in the News Articles Categorization training data.
## Hardware
- Hardware Type: NVIDIA Tesla V100 PCIE 32GB
- Cluster Provider: Artemisa
- Compute Region: EU
## Citation
If you use this model in your research, please cite both the base BERT model and the News Articles Categorization source.
## Model Tree

`cglez/bert-base-uncased-ft-news_articles` is fine-tuned from the base model `google-bert/bert-base-uncased`.