---
license: apache-2.0
library_name: transformers
pipeline_tag: text-classification
datasets:
  - fancyzhx/ag_news
language:
  - en
tags:
  - tiny
  - bert
  - text-classification
---

<div align="center">
  <img src="TinyModel1Image.png" alt="TinyModel1" style="max-width: 100%; width: 100%; height: auto; display: block;" />
</div>

# TinyModel1

**TinyModel1** is a compact **encoder** model for **news topic classification**, trained on the AG News dataset. It targets fast CPU/GPU inference and use as a baseline.

## Links

- **Source code (train & export):** [https://github.com/HyperlinksSpace/TinyModel](https://github.com/HyperlinksSpace/TinyModel)
- **Live demo (Space):** [TinyModel1Space](https://huggingface.co/spaces/HyperlinksSpace/TinyModel1Space) (canonical Hub URL; avoids unreliable `*.hf.space` links)


---

## Model summary

| Field | Value |
|:--|:--|
| **Task** | Text classification (single-label, 4 classes) |
| **Labels** | World, Sports, Business, Sci/Tech |
| **Dataset** | `fancyzhx/ag_news` |
| **Architecture** | Tiny BERT-style encoder (`BertForSequenceClassification`) |
| **Parameters** | 1,339,268 (~1.34M) |
| **Max sequence length** | 128 tokens (training & inference) |
| **Framework** | [Transformers](https://github.com/huggingface/transformers) · Safetensors |
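
The label set and parameter count above can be sanity-checked directly from the shipped files. A minimal sketch, assuming the checkpoint sits in a local `TinyModel1` directory (swap in the Hub model id otherwise):

```python
from transformers import AutoConfig, AutoModelForSequenceClassification

config = AutoConfig.from_pretrained("TinyModel1")            # reads config.json
model = AutoModelForSequenceClassification.from_pretrained("TinyModel1")

print(config.id2label)                                       # World, Sports, Business, Sci/Tech
print(sum(p.numel() for p in model.parameters()))            # expected: 1,339,268
```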

---

## Model overview

TinyModel1 pairs a WordPiece tokenizer fit on the training split with a shallow BERT encoder stack. To retrain on your own taxonomy, replace the dataset and labels via `scripts/train_tinymodel1_classifier.py`; a rough sketch of the label wiring follows.
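
The sketch below is illustrative only, not the script itself: the "tiny" dimensions are placeholders, and the shipped `config.json` is authoritative.

```python
from datasets import load_dataset
from transformers import BertConfig, BertForSequenceClassification

# Substitute any single-label dataset with a text column for your own taxonomy.
ds = load_dataset("fancyzhx/ag_news", split="train")
names = ds.features["label"].names  # ["World", "Sports", "Business", "Sci/Tech"]

# Hypothetical tiny dimensions; scripts/train_tinymodel1_classifier.py and the
# shipped config.json define the real ones.
config = BertConfig(
    vocab_size=8000,
    hidden_size=128,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=256,
    num_labels=len(names),
    id2label=dict(enumerate(names)),
    label2id={n: i for i, n in enumerate(names)},
)
model = BertForSequenceClassification(config)  # fresh, untrained weights
```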

### Core capabilities

- **Text routing** — assign one class per input for search, feeds, or triage.
- **Low latency** — small parameter count suits edge and serverless setups.
- **Fine-tuning base** — swap labels or data for your domain while keeping the same architecture.

---

## Training

| Setting | Value |
|:--|:--|
| **Train samples (cap)** | 3000 |
| **Eval samples (cap)** | 600 |
| **Epochs** | 2 |
| **Batch size** | 16 |
| **Learning rate** | 0.0001 |
| **Optimizer** | AdamW |
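
A minimal sketch of how these settings map onto the `datasets` and `TrainingArguments` APIs; the authoritative setup lives in `scripts/train_tinymodel1_classifier.py`, and the output path below is a placeholder.

```python
from datasets import load_dataset
from transformers import TrainingArguments

train_ds = load_dataset("fancyzhx/ag_news", split="train[:3000]")  # 3000-sample cap
eval_ds = load_dataset("fancyzhx/ag_news", split="test[:600]")     # 600-sample cap

args = TrainingArguments(
    output_dir="tinymodel1-out",     # placeholder path
    num_train_epochs=2,
    per_device_train_batch_size=16,
    learning_rate=1e-4,              # 0.0001
    optim="adamw_torch",             # AdamW
)
# A transformers.Trainer(model=..., args=args, train_dataset=train_ds,
# eval_dataset=eval_ds, ...) would consume these settings.
```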

---

## Evaluation

| Metric | Value |
|:--|:--|
| **Accuracy** | 0.5150 |
| **Macro F1** | 0.4262 |
| **Weighted F1** | 0.4240 |
| **Final train loss** | 1.1535 |

Per-class F1 and the confusion matrix are saved in `eval_report.json` in this model directory.

Metrics are computed on the held-out eval subset (see the `reproducibility` entry in `eval_report.json`); treat them as a **sanity-check baseline**, not a production SLA.
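
The headline numbers can be recomputed from any set of predictions with scikit-learn. A toy sketch with placeholder data (`y_true`/`y_pred` stand in for the eval labels and model outputs):

```python
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

# Placeholder data: 0=World, 1=Sports, 2=Business, 3=Sci/Tech.
y_true = [0, 1, 2, 3, 1, 2]
y_pred = [0, 1, 1, 3, 1, 2]

print("accuracy   ", accuracy_score(y_true, y_pred))
print("macro F1   ", f1_score(y_true, y_pred, average="macro"))
print("weighted F1", f1_score(y_true, y_pred, average="weighted"))
print(confusion_matrix(y_true, y_pred))  # rows: true class, cols: predicted
```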

---

## Getting started

### Inference with `transformers`

```python
from transformers import pipeline

# Load the local checkpoint (or a Hub model id) into a classification pipeline.
clf = pipeline(
    "text-classification",
    model="TinyModel1",
    tokenizer="TinyModel1",
    top_k=None,  # return scores for all labels, not just the top one
)

text = "Your input text here."
print(clf(text))
```

Use `top_k=None` (or your Transformers version’s equivalent) to get scores for **all** labels. Replace `"TinyModel1"` with your Hub model id when loading from the Hub.
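
If you prefer to skip the pipeline, the same per-label scores can be obtained at a lower level; a minimal sketch under the same local-directory assumption:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("TinyModel1")
model = AutoModelForSequenceClassification.from_pretrained("TinyModel1")

# Tokenize with the same 128-token cap used in training, then softmax the logits.
inputs = tok("Your input text here.", truncation=True, max_length=128, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

for i, p in enumerate(logits.softmax(dim=-1).squeeze().tolist()):
    print(model.config.id2label[i], round(p, 4))
```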

---

## Training data

- **Dataset:** `fancyzhx/ag_news` (text column mapped for training; see `artifact.json`).
- **Preprocessing:** tokenizer trained on the training texts; sequences truncated to 128 tokens (sketched below).
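
A hedged sketch of that preprocessing using the `tokenizers` library; the vocabulary size here is an assumption, and the shipped tokenizer files are what the model actually uses.

```python
from datasets import load_dataset
from tokenizers import BertWordPieceTokenizer

# Fit a WordPiece vocabulary on the (capped) training texts.
ds = load_dataset("fancyzhx/ag_news", split="train[:3000]")
tok = BertWordPieceTokenizer(lowercase=True)
tok.train_from_iterator(ds["text"], vocab_size=8000)  # vocab size is an assumption

enc = tok.encode("Wall St. bears claw back into the black.")
print(enc.tokens[:12])
print(len(enc.ids))  # sequences are truncated to 128 tokens at model time
```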

---

## Intended use

- Prototyping **routing**, **tagging**, and **dashboard** features over short text.
- Teaching and benchmarking small-model classification setups.
- Starting point for **domain adaptation** with your own labels.

---

## Limitations

- **Accuracy** is modest by design; validate on your data before high-stakes use.
- **Not a general-purpose language model** — classification head only; for generation use an LM.
- **Tokenizer and labels** are tied to this training run; mismatched inputs may degrade.

---

## License

This model is released under the **Apache 2.0** license (see repository `LICENSE` where applicable).