---
license: apache-2.0
library_name: transformers
pipeline_tag: text-classification
datasets:
- fancyzhx/ag_news
language:
- en
tags:
- tiny
- bert
- text-classification
---
<div align="center">
<img src="TinyModel1Image.png" alt="TinyModel1" style="max-width: 100%; width: 100%; height: auto; display: block;" />
</div>
# TinyModel1
**TinyModel1** is a compact **encoder** model for **news topic classification**, trained on the AG News dataset. It targets fast CPU/GPU inference and use as a baseline.
## Links
- **Source code (train & export):** [https://github.com/HyperlinksSpace/TinyModel](https://github.com/HyperlinksSpace/TinyModel)
- **Live demo (Space):** [TinyModel1Space](https://huggingface.co/spaces/HyperlinksSpace/TinyModel1Space) (canonical Hub URL; avoids unreliable `*.hf.space` links)
---
## Model summary
| Field | Value |
|:--|:--|
| **Task** | Text classification (single-label, 4 classes) |
| **Labels** | World, Sports, Business, Sci/Tech |
| **Dataset** | `fancyzhx/ag_news` |
| **Architecture** | Tiny BERT-style encoder (`BertForSequenceClassification`) |
| **Parameters** | 1,339,268 (~1.34M) |
| **Max sequence length** | 128 tokens (training & inference) |
| **Framework** | [Transformers](https://github.com/huggingface/transformers) · Safetensors |
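
The parameter count can be sanity-checked from a local copy of the weights. A minimal sketch (the `"TinyModel1"` path is illustrative; point it at wherever the checkpoint lives):

```python
# Sketch: verify the parameter count from a local checkout of this model.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("TinyModel1")
total = sum(p.numel() for p in model.parameters())
print(f"{total:,} parameters")  # expected: 1,339,268
```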
---
## Model overview
The model was trained with a WordPiece tokenizer fit on the training split and a shallow BERT stack. To adapt it to your own taxonomy, replace the dataset and labels via `scripts/train_tinymodel1_classifier.py`.
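
A minimal sketch of the tokenizer-fitting step, assuming the `tokenizers` library; the vocabulary size here is an assumption (the actual value is set by the training script):

```python
# Sketch: fit a WordPiece tokenizer on the AG News training split,
# mirroring how this model's tokenizer was built.
from datasets import load_dataset
from tokenizers import BertWordPieceTokenizer

ds = load_dataset("fancyzhx/ag_news", split="train")

tokenizer = BertWordPieceTokenizer(lowercase=True)
tokenizer.train_from_iterator(ds["text"], vocab_size=8000)  # vocab_size assumed
tokenizer.save_model(".")  # writes vocab.txt to the current directory
```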
### **Core capabilities**
- **Text routing** — assign one class per input for search, feeds, or triage.
- **Low latency** — small parameter count suits edge and serverless setups.
- **Fine-tuning base** — swap labels or data for your domain while keeping the same architecture.
---
## Training
| Setting | Value |
|:--|:--|
| **Train samples (cap)** | 3000 |
| **Eval samples (cap)** | 600 |
| **Epochs** | 2 |
| **Batch size** | 16 |
| **Learning rate** | 0.0001 |
| **Optimizer** | AdamW |
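
A minimal sketch approximating these settings with the Hugging Face `Trainer`. The split caps, epochs, batch size, and learning rate follow the table; the tiny config dimensions, output directory, and tokenizer path are assumptions (the actual recipe lives in `scripts/train_tinymodel1_classifier.py`):

```python
# Sketch: reproduce the training configuration above.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    BertConfig,
    BertForSequenceClassification,
    Trainer,
    TrainingArguments,
)

tok = AutoTokenizer.from_pretrained("TinyModel1")  # illustrative local path

def tokenize(batch):
    return tok(batch["text"], truncation=True, max_length=128)

train_ds = load_dataset("fancyzhx/ag_news", split="train[:3000]").map(tokenize, batched=True)
eval_ds = load_dataset("fancyzhx/ag_news", split="test[:600]").map(tokenize, batched=True)

# Tiny BERT-style encoder; exact dimensions are assumptions.
config = BertConfig(
    vocab_size=tok.vocab_size,
    hidden_size=128,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=512,
    num_labels=4,
)
model = BertForSequenceClassification(config)

args = TrainingArguments(
    output_dir="tinymodel1-out",
    num_train_epochs=2,
    per_device_train_batch_size=16,
    learning_rate=1e-4,   # 0.0001
    optim="adamw_torch",  # AdamW
)

Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    tokenizer=tok,  # enables dynamic padding via DataCollatorWithPadding
).train()
```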
---
## Evaluation
| Metric | Value |
|:--|:--|
| **Accuracy** | 0.5150 |
| **Macro F1** | 0.4262 |
| **Weighted F1** | 0.4240 |
| **Final train loss** | 1.1535 |
Per-class F1 and the confusion matrix are saved in `eval_report.json` in this model directory.
Metrics are computed on the held-out eval subset (see `eval_report.json` → `reproducibility`); treat them as a **sanity-check baseline**, not a production SLA.
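
For reference, metrics of the kind stored in `eval_report.json` can be computed with scikit-learn; a sketch with placeholder arrays (`y_true` / `y_pred` stand in for the eval-set labels and model predictions):

```python
# Sketch: accuracy, macro/weighted F1, per-class F1, and confusion matrix.
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

y_true = [0, 1, 2, 3, 1]  # gold labels (placeholder values)
y_pred = [0, 1, 2, 0, 1]  # model predictions (placeholder values)

print("accuracy:", accuracy_score(y_true, y_pred))
print("macro F1:", f1_score(y_true, y_pred, average="macro"))
print("weighted F1:", f1_score(y_true, y_pred, average="weighted"))
print("per-class F1:", f1_score(y_true, y_pred, average=None))  # one score per label
print(confusion_matrix(y_true, y_pred))
```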
---
## Getting started
### Inference with `transformers`
```python
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="TinyModel1",
    tokenizer="TinyModel1",
    top_k=None,
)

text = "Your input text here."
print(clf(text))
```
Use `top_k=None` (or the equivalent in your Transformers version) to get scores for **all** labels. Replace `"TinyModel1"` with your Hub model id when loading from the Hub.
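
The same inference can be done without `pipeline`, which makes the per-label probabilities explicit. A sketch; the Hub id `HyperlinksSpace/TinyModel1` is an assumption, so substitute the real one:

```python
# Sketch: manual inference with explicit softmax over the logits.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "HyperlinksSpace/TinyModel1"  # assumed Hub id; adjust as needed
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tok(
    "Stocks rallied after the quarterly earnings report.",
    return_tensors="pt",
    truncation=True,
    max_length=128,
)
with torch.no_grad():
    logits = model(**inputs).logits
probs = logits.softmax(dim=-1)[0]
for i, p in enumerate(probs.tolist()):
    print(model.config.id2label[i], round(p, 4))
```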
---
## Training data
- **Dataset:** `fancyzhx/ag_news` (text column mapped for training; see `artifact.json`).
- **Preprocessing:** tokenizer trained on training texts; sequences truncated to 128 tokens.
---
## Intended use
- Prototyping **routing**, **tagging**, and **dashboard** features over short text.
- Teaching and benchmarking small text-classification setups.
- Starting point for **domain adaptation** with your own labels.
---
## Limitations
- **Accuracy** is modest by design; validate on your data before high-stakes use.
- **Not a general-purpose language model** — classification head only; for generation use an LM.
- **Tokenizer and labels** are tied to this training run; mismatched inputs may degrade performance.
---
## License
This model is released under the **Apache 2.0** license (see repository `LICENSE` where applicable).