---
license: apache-2.0
language:
  - en
metrics:
  - accuracy
pipeline_tag: text-classification
---

Transformer Encoder for AG News

Overview

This repository contains a custom Transformer Encoder trained from scratch on the AG News dataset.
The model reaches 91.7% validation accuracy (best checkpoint at epoch 13 of the 15 trained) with the following configuration:

  • vocab size: 30,000
  • embedding dim (d_model): 256
  • heads: 8
  • layers: 4
  • feedforward dim: 512
  • max seq length: 256
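The configuration above can be sketched as a PyTorch module built from `nn.TransformerEncoder`. This is a minimal reconstruction for orientation, not the exact code from the notebook: the class and attribute names are illustrative, and the learned positional embeddings and mean-pooling head are assumptions.

```python
import torch
import torch.nn as nn

class AGNewsEncoder(nn.Module):
    """Encoder classifier matching the listed config (names are illustrative)."""

    def __init__(self, vocab_size=30_000, d_model=256, nhead=8,
                 num_layers=4, dim_feedforward=512, max_len=256, num_classes=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Learned positional embeddings are an assumption; the notebook may
        # use sinusoidal encodings instead.
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead,
            dim_feedforward=dim_feedforward, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, ids):
        # ids: (batch, seq_len) token indices
        positions = torch.arange(ids.size(1), device=ids.device)
        x = self.embed(ids) + self.pos(positions)
        x = self.encoder(x)
        return self.classifier(x.mean(dim=1))  # mean-pool over tokens, then classify

model = AGNewsEncoder()
logits = model(torch.randint(0, 30_000, (2, 16)))
print(logits.shape)  # torch.Size([2, 4])
```

A forward pass on a batch of token-id tensors yields one logit per class, matching the four AG News categories.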

Dataset

  • AG News (news classification)
  • 4 categories: World, Sports, Business, Sci/Tech
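For reference, a mapping from class index to category name can look like the following. The 0-indexed ordering matches the common Hugging Face `ag_news` release, but whether this model's output head uses the same order is an assumption; check the notebook before relying on it.

```python
# Assumed label order (matches the Hugging Face ag_news dataset).
ID2LABEL = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}

def label_name(class_id: int) -> str:
    """Map a predicted class index to its AG News category name."""
    return ID2LABEL[class_id]

print(label_name(1))  # Sports
```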

Training

  • Optimizer: AdamW (lr=3e-4, weight decay=1e-5)
  • Batch size: 32
  • Epochs: 15
  • Hardware: Google Colab (NVIDIA T4 GPU)
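A single training epoch with these settings can be sketched as below. The `train_one_epoch` helper is hypothetical (the actual loop lives in the notebook); only the optimizer and its hyperparameters are taken from the list above.

```python
import torch
import torch.nn as nn

def train_one_epoch(model, loader, device="cpu"):
    """One pass over the data with AdamW (lr=3e-4, weight_decay=1e-5)."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=1e-5)
    model.train()
    total_loss = 0.0
    for ids, labels in loader:
        ids, labels = ids.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(ids), labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    return total_loss / max(len(loader), 1)

# Smoke test on one toy batch of 32 (an EmbeddingBag stand-in for the encoder):
toy = nn.Sequential(nn.EmbeddingBag(100, 8), nn.Linear(8, 4))
batch = [(torch.randint(0, 100, (32, 16)), torch.randint(0, 4, (32,)))]
avg_loss = train_one_epoch(toy, batch)
```

In practice the real loop would also track validation accuracy per epoch, which is presumably how the epoch-13 best checkpoint was selected.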

Results

  • Train Accuracy: 92.1%
  • Validation Accuracy: 91.7%

Files

  • agnews_encoder.pt → trained model weights
  • config.json → model config
  • vocab.json → tokenizer vocabulary
  • train_notebook.ipynb → training code
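The weights file is presumably a PyTorch `state_dict`; a save/load round trip would then look like the sketch below. The `nn.Linear` module here is only a stand-in, since the real model class lives in the training notebook.

```python
import torch
import torch.nn as nn

# Stand-in model; in the repo this would be the encoder from the notebook,
# constructed with the hyperparameters stored in config.json.
model = nn.Linear(256, 4)
torch.save(model.state_dict(), "agnews_encoder.pt")

# Reload into a freshly constructed model of the same architecture.
restored = nn.Linear(256, 4)
restored.load_state_dict(torch.load("agnews_encoder.pt", map_location="cpu"))
restored.eval()  # inference mode
```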
