---
license: apache-2.0
library_name: transformers
pipeline_tag: text-classification
datasets:
- fancyzhx/ag_news
language:
- en
tags:
- tiny
- bert
- text-classification
---
<div align="center">
<img src="TinyModel1Image.png" alt="TinyModel1" style="max-width: 100%; width: 100%; height: auto; display: block;" />
</div>
# TinyModel1
**TinyModel1** is a compact **encoder** model for **news topic classification**, trained on the AG News dataset. It targets fast CPU/GPU inference and use as a baseline.
## Links
- **Source code (train & export):** [https://github.com/HyperlinksSpace/TinyModel](https://github.com/HyperlinksSpace/TinyModel)
- **Live demo (Space):** [TinyModel1Space](https://huggingface.co/spaces/HyperlinksSpace/TinyModel1Space) (canonical Hub URL; avoids unreliable `*.hf.space` links)
---
## Model summary
| Field | Value |
|:--|:--|
| **Task** | Text classification (single-label, 4 classes) |
| **Labels** | World, Sports, Business, Sci/Tech |
| **Dataset** | `fancyzhx/ag_news` |
| **Architecture** | Tiny BERT-style encoder (`BertForSequenceClassification`) |
| **Parameters** | 1,339,268 (~1.34M) |
| **Max sequence length** | 128 tokens (training & inference) |
| **Framework** | [Transformers](https://github.com/huggingface/transformers) · Safetensors |
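To double-check the label mapping in the table, you can inspect the exported config (a minimal sketch; the index order shown in the comment is an assumption, not guaranteed by this card):

```python
from transformers import AutoConfig

# Load the exported config from this model directory (or a Hub model id).
config = AutoConfig.from_pretrained("TinyModel1")

print(config.num_labels)  # 4
print(config.id2label)    # e.g. {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}
```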
---
## Model overview
The model pairs a WordPiece tokenizer fit on the training split with a shallow BERT encoder stack. To train on your own taxonomy, swap the dataset and labels in `scripts/train_tinymodel1_classifier.py`.
### Core capabilities
- **Text routing** — assign one class per input for search, feeds, or triage (see the sketch after this list).
- **Low latency** — small parameter count suits edge and serverless setups.
- **Fine-tuning base** — swap labels or data for your domain while keeping the same architecture.
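As a concrete routing example, here is a minimal sketch that dispatches on the predicted label (the queue names are hypothetical placeholders, not part of this repository):

```python
from transformers import pipeline

clf = pipeline("text-classification", model="TinyModel1", tokenizer="TinyModel1")

# Hypothetical downstream queues; replace with your own handlers.
routes = {
    "World": "news_queue",
    "Sports": "sports_queue",
    "Business": "biz_queue",
    "Sci/Tech": "tech_queue",
}

pred = clf("Stocks rallied after the earnings report.")[0]  # {"label": ..., "score": ...}
print(routes.get(pred["label"], "fallback_queue"))
```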
---
## Training
| Setting | Value |
|:--|:--|
| **Train samples (cap)** | 3000 |
| **Eval samples (cap)** | 600 |
| **Epochs** | 2 |
| **Batch size** | 16 |
| **Learning rate** | 0.0001 |
| **Optimizer** | AdamW |
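The actual run lives in `scripts/train_tinymodel1_classifier.py`; purely as an illustration, here is a minimal `Trainer` sketch mirroring the hyperparameters above (the config sizes, split slicing, and output path are assumptions, not the exported values):

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, BertConfig, BertForSequenceClassification,
                          Trainer, TrainingArguments)

# Capped subsets, as in the table above (taking eval from the test split
# is an assumption).
train_ds = load_dataset("fancyzhx/ag_news", split="train[:3000]")
eval_ds = load_dataset("fancyzhx/ag_news", split="test[:600]")

# The card's tokenizer was fit on the training split; reusing the
# exported one here is a simplification.
tok = AutoTokenizer.from_pretrained("TinyModel1")

def encode(batch):
    return tok(batch["text"], truncation=True, max_length=128)

train_ds = train_ds.map(encode, batched=True)
eval_ds = eval_ds.map(encode, batched=True)

# Tiny BERT-style config; these sizes are guesses near the ~1.34M budget.
config = BertConfig(vocab_size=tok.vocab_size, hidden_size=128,
                    num_hidden_layers=2, num_attention_heads=2,
                    intermediate_size=256, num_labels=4)
model = BertForSequenceClassification(config)

args = TrainingArguments(
    output_dir="tinymodel1-out",        # hypothetical output path
    num_train_epochs=2,
    per_device_train_batch_size=16,
    learning_rate=1e-4,
    optim="adamw_torch",                # AdamW
)

Trainer(model=model, args=args, train_dataset=train_ds,
        eval_dataset=eval_ds, tokenizer=tok).train()
```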
---
## Evaluation
| Metric | Value |
|:--|:--|
| **Accuracy** | 0.5150 |
| **Macro F1** | 0.4262 |
| **Weighted F1** | 0.4240 |
| **Final train loss** | 1.1535 |
Per-class F1 and the confusion matrix are saved in `eval_report.json` in this model directory.
Metrics are computed on the held-out eval subset (see the `reproducibility` section of `eval_report.json`); treat them as a **sanity-check baseline**, not a production SLA.
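For reference, the reported aggregates follow directly from predictions via `scikit-learn` (a sketch; the `y_true`/`y_pred` values below are placeholders, not real eval outputs):

```python
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

# Integer class ids for the eval subset; placeholder values for illustration.
y_true = [0, 1, 2, 3, 1, 2]
y_pred = [0, 1, 1, 3, 1, 2]

print("accuracy   ", accuracy_score(y_true, y_pred))
print("macro F1   ", f1_score(y_true, y_pred, average="macro"))
print("weighted F1", f1_score(y_true, y_pred, average="weighted"))
print(confusion_matrix(y_true, y_pred))  # per-class detail, as in eval_report.json
```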
---
## Getting started
### Inference with `transformers`
```python
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="TinyModel1",      # local directory; use your Hub model id when remote
    tokenizer="TinyModel1",
    top_k=None,              # return scores for all four labels
)

text = "Your input text here."
print(clf(text))
```
Use `top_k=None` (or the equivalent for your Transformers version) to get scores for **all** labels. Replace `"TinyModel1"` with your Hub model id when loading from the Hub.
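If you prefer raw model calls over the pipeline, the equivalent manual path looks like this (a sketch under the same local-path assumption):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("TinyModel1")
model = AutoModelForSequenceClassification.from_pretrained("TinyModel1")

# Truncate to the 128-token training length.
inputs = tok("Your input text here.", truncation=True, max_length=128,
             return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

probs = logits.softmax(dim=-1).squeeze(0)
pred_id = int(probs.argmax())
print(model.config.id2label[pred_id], float(probs[pred_id]))
```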
---
## Training data
- **Dataset:** `fancyzhx/ag_news` (text column mapped for training; see `artifact.json`).
- **Preprocessing:** tokenizer trained on training texts; sequences truncated to 128 tokens.
---
## Intended use
- Prototyping **routing**, **tagging**, and **dashboard** features over short text.
- Teaching and benchmarking small-model classification setups.
- Starting point for **domain adaptation** with your own labels.
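For the domain-adaptation path, one way to start is to reuse the encoder weights with a fresh head sized for your labels (a sketch; the three example labels are hypothetical, and `ignore_mismatched_sizes=True` drops the old 4-way head):

```python
from transformers import AutoModelForSequenceClassification

# Keep TinyModel1's encoder, attach a freshly initialized 3-way head.
model = AutoModelForSequenceClassification.from_pretrained(
    "TinyModel1",
    num_labels=3,
    id2label={0: "billing", 1: "bug", 2: "feature"},   # placeholder taxonomy
    label2id={"billing": 0, "bug": 1, "feature": 2},
    ignore_mismatched_sizes=True,  # reinitialize the classification head
)
# ...then fine-tune on your labeled data, e.g. with Trainer as sketched above.
```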
---
## Limitations
- **Accuracy** is modest by design; validate on your data before high-stakes use.
- **Not a general-purpose language model** — classification head only; for generation use an LM.
- **Tokenizer and labels** are tied to this training run; mismatched inputs may degrade.
---
## License
This model is released under the **Apache 2.0** license (see repository `LICENSE` where applicable).