---
license: mit
tags:
- pytorch
- nanogpt
- text-classification
- spam-detection
- slm
- from-scratch
base_model: nishantup/nanogpt-slm-124m
---
|
|
|
# nanoGPT Spam Classifier -- 123.9M Parameters

Binary spam classifier fine-tuned from the nanoGPT pretrained SLM.

**Pipeline:** Trained from scratch -> pretrained on 133 English fiction books -> classification fine-tuned on the SMS spam dataset.
|
|
|
## Quick Start

### Option 1: Run directly (downloads model + runs examples)

```bash
pip install torch tiktoken huggingface_hub
python nanogpt_classifier_inference.py
```
|
|
|
### Option 2: Import and use in your own code

```python
from nanogpt_classifier_inference import classify, is_spam, classify_batch

# Full result with confidence
result = classify("You won a free iPhone! Click here to claim.")
print(result)
# {'label': 'spam', 'confidence': 0.95, 'probabilities': {'not spam': 0.05, 'spam': 0.95}}

# Simple boolean check
print(is_spam("You won a free iPhone!"))      # True
print(is_spam("See you at dinner tonight!"))  # False

# Batch classification
texts = ["Free prize!", "Meeting at 3pm", "Click to win $$$"]
results = classify_batch(texts)
for text, r in zip(texts, results):
    print(f"{r['label']:>8s} ({r['confidence']:.0%}) | {text}")
```
|
|
|
### Option 3: Load weights manually

```python
import torch
import torch.nn as nn
from huggingface_hub import hf_hub_download

from nanogpt_classifier_inference import GPT, GPTConfig

# Download the classifier weights from the Hub
model_path = hf_hub_download(
    repo_id="nishantup/nanogpt-slm-classifier",
    filename="nanogpt_classifier.pth",
)

config = GPTConfig()
model = GPT(config)
model.lm_head = nn.Linear(768, 2)  # Replace LM head with 2-class classifier
model.load_state_dict(torch.load(model_path, map_location="cpu"))
model.eval()
```
|
|
|
## How It Works

1. Input text is tokenized (tiktoken GPT-2 BPE)
2. Padded/truncated to 120 tokens
3. Fed through the full transformer (12 layers)
4. **Last token's logits** (shape: 2) are used for classification
5. Argmax -> 0 = "not spam", 1 = "spam"
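Steps 2 and 5 can be sketched with plain Python lists. The pad token id and helper names below are illustrative assumptions, not the script's actual API; the real implementation lives in `nanogpt_classifier_inference.py`:

```python
# Illustrative sketch of steps 2 and 5 (padding/truncation and argmax).
PAD_ID = 50256   # GPT-2 <|endoftext|> token, a common padding choice (assumption)
MAX_LEN = 120

def pad_or_truncate(token_ids, max_len=MAX_LEN, pad_id=PAD_ID):
    """Clip to max_len tokens, or right-pad with pad_id."""
    ids = list(token_ids)[:max_len]
    return ids + [pad_id] * (max_len - len(ids))

def label_from_logits(logits):
    """Argmax over the 2 last-token logits: 0 = 'not spam', 1 = 'spam'."""
    labels = ["not spam", "spam"]
    return labels[max(range(len(logits)), key=logits.__getitem__)]
```

Because the sequence is right-padded, "last token" in step 4 refers to the last position of the padded sequence fed to the model.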
|
|
|
## Model Details

| Attribute | Value |
|:---|:---|
| Parameters | 123.9M |
| Architecture | nanoGPT (12 layers, 12 heads, 768 dim) |
| Classification head | Linear(768, 2) replacing lm_head |
| Classes | 0 = not spam, 1 = spam |
| Max sequence length | 120 tokens |
| Context length | 256 tokens |
| Tokenizer | tiktoken GPT-2 BPE (50,257 tokens) |
| Base model | [nishantup/nanogpt-slm-124m](https://huggingface.co/nishantup/nanogpt-slm-124m) |
| Training data | UCI SMS Spam Collection (balanced, 747 examples per class) |
| Framework | PyTorch |
|
|
|
## Training Details

- Base pretrained model frozen except the last transformer block, the final LayerNorm, and the classification head
- 5 epochs, AdamW (lr=5e-5, weight_decay=0.1), batch_size=8
- Classification uses cross-entropy loss on last-token logits
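A minimal sketch of that freezing scheme, assuming a nanoGPT-style module layout. The `blocks`, `ln_f`, and `lm_head` names and the `TinyGPT` stand-in are assumptions for illustration; the real class is `GPT` in `nanogpt_classifier_inference.py`:

```python
import torch.nn as nn

# Tiny stand-in for the 12-layer transformer; module names are assumptions.
class TinyGPT(nn.Module):
    def __init__(self, n_layer=12, dim=16):
        super().__init__()
        self.blocks = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_layer))
        self.ln_f = nn.LayerNorm(dim)
        self.lm_head = nn.Linear(dim, 2)  # 2-class classification head

model = TinyGPT()

# Freeze everything, then re-enable gradients only for the last block,
# the final LayerNorm, and the classification head.
for p in model.parameters():
    p.requires_grad = False
for module in (model.blocks[-1], model.ln_f, model.lm_head):
    for p in module.parameters():
        p.requires_grad = True

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

Freezing is done via `requires_grad` rather than by filtering the optimizer's parameter list, so `AdamW(model.parameters(), ...)` still works unchanged.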
|
|
|
## Files

| File | Description |
|:---|:---|
| `nanogpt_classifier.pth` | Classifier weights (lm_head = Linear(768, 2)) |
| `nanogpt_classifier_inference.py` | Standalone inference script |
| `config.json` | Model + classifier configuration |
|
|
|
## API Reference

### `classify(text, max_length=120)`

Returns a dict with `label`, `confidence`, and `probabilities`.

### `is_spam(text, max_length=120)`

Returns `True` if spam, `False` if not.

### `classify_batch(texts, max_length=120)`

Returns a list of `classify()` results.
|
|
|
## Related Models

| Variant | Type | Repo |
|:---|:---|:---|
| Pretrained (nanoGPT) | Base LM | [nishantup/nanogpt-slm-124m](https://huggingface.co/nishantup/nanogpt-slm-124m) |
| Instruction-tuned (nanoGPT) | SFT | [nishantup/nanogpt-slm-instruct](https://huggingface.co/nishantup/nanogpt-slm-instruct) |
| **Spam classifier (nanoGPT)** | **Classification** | **[nishantup/nanogpt-slm-classifier](https://huggingface.co/nishantup/nanogpt-slm-classifier)** |
|
|
|