---
license: apache-2.0
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
---
# Transformer Encoder for AG News

## Overview
This repository contains a custom Transformer Encoder trained from scratch on the AG News dataset.
The model reaches its best validation accuracy, 91.7%, at epoch 13 with the following configuration:
- vocab size: 30,000
- embedding dim (d_model): 256
- heads: 8
- layers: 4
- feedforward dim: 512
- max seq length: 256
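A minimal sketch of an encoder matching the configuration above, assuming a standard PyTorch implementation. The class name, learned positional embeddings, and mean pooling are illustrative assumptions, not the repository's actual code:

```python
import torch
import torch.nn as nn

class AGNewsEncoder(nn.Module):
    """Sketch of a 4-layer Transformer encoder classifier (assumed design)."""

    def __init__(self, vocab_size=30_000, d_model=256, nhead=8,
                 num_layers=4, dim_feedforward=512, max_len=256,
                 num_classes=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Learned positional embeddings (an assumption; sinusoidal would also fit)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead,
            dim_feedforward=dim_feedforward, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, ids):
        # ids: (batch, seq_len) integer token indices
        positions = torch.arange(ids.size(1), device=ids.device)
        x = self.embed(ids) + self.pos(positions)
        x = self.encoder(x)
        # Mean-pool over the sequence, then project to the 4 AG News classes
        return self.classifier(x.mean(dim=1))
```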
## Dataset
- AG News (news classification)
- 4 categories: World, Sports, Business, Sci/Tech
## Training
- Optimizer: AdamW (lr=3e-4, weight decay=1e-5)
- Batch size: 32
- Epochs: 15
- Hardware: NVIDIA T4 GPU (Google Colab)
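The optimizer settings above can be sketched as a single training step; `model` here is a stand-in for the encoder and the data loader is omitted:

```python
import torch

# Stand-in module; in practice this would be the AG News encoder
model = torch.nn.Linear(256, 4)

# AdamW with the hyperparameters listed above
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=1e-5)
loss_fn = torch.nn.CrossEntropyLoss()

def train_step(batch_x, batch_y):
    """One gradient step; returns the scalar loss."""
    optimizer.zero_grad()
    loss = loss_fn(model(batch_x), batch_y)
    loss.backward()
    optimizer.step()
    return loss.item()
```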
## Results
- Train Accuracy: 92.1%
- Validation Accuracy: 91.7%
## Files

- `agnews_encoder.pt`: trained model weights
- `config.json`: model config
- `vocab.json`: tokenizer vocab
- `train_notebook.ipynb`: training code
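A hedged sketch of the checkpoint round trip for a file like `agnews_encoder.pt`, assuming it is a plain PyTorch state dict (an assumption about the file format); the linear layer is a stand-in for the actual encoder:

```python
import torch

# Stand-in for the trained encoder; replace with the notebook's model class
model = torch.nn.Linear(256, 4)

# Save the weights in the assumed state-dict format
torch.save(model.state_dict(), "agnews_encoder.pt")

# Restore into a freshly constructed instance for inference
restored = torch.nn.Linear(256, 4)
restored.load_state_dict(torch.load("agnews_encoder.pt", map_location="cpu"))
restored.eval()
```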
## Source Code

The full training code is provided in `train_notebook.ipynb`.