---
license: apache-2.0
library_name: transformers
pipeline_tag: text-classification
datasets:
- fancyzhx/ag_news
language:
- en
tags:
- tiny
- bert
- text-classification
---

<div align="center">
<img src="TinyModel1Image.png" alt="TinyModel1" style="max-width: 100%; width: 100%; height: auto; display: block;" />
</div>


# TinyModel1


**TinyModel1** is a compact **encoder** model for **news topic classification**, trained on the AG News dataset. It targets fast CPU/GPU inference and is intended as a lightweight baseline.


## Links


- **Source code (train & export):** [https://github.com/HyperlinksSpace/TinyModel](https://github.com/HyperlinksSpace/TinyModel)
- **Live demo (Space):** [TinyModel1Space](https://huggingface.co/spaces/HyperlinksSpace/TinyModel1Space) (canonical Hub URL; avoids unreliable `*.hf.space` links)


---


## Model summary


| Field | Value |
|:--|:--|
| **Task** | Text classification (single-label, 4 classes) |
| **Labels** | World, Sports, Business, Sci/Tech |
| **Dataset** | `fancyzhx/ag_news` |
| **Architecture** | Tiny BERT-style encoder (`BertForSequenceClassification`) |
| **Parameters** | 1,339,268 (~1.34M) |
| **Max sequence length** | 128 tokens (training & inference) |
| **Framework** | [Transformers](https://github.com/huggingface/transformers) · Safetensors |
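
As a quick sanity check, you can read the label mapping and parameter count straight from the checkpoint. This is a minimal sketch; `"TinyModel1"` stands in for a local checkpoint directory or your Hub model id:

```python
from transformers import AutoConfig, AutoModelForSequenceClassification

# "TinyModel1" is a placeholder for a local directory or a Hub model id.
config = AutoConfig.from_pretrained("TinyModel1")
print(config.id2label)  # expected labels: World, Sports, Business, Sci/Tech

model = AutoModelForSequenceClassification.from_pretrained("TinyModel1")
print(sum(p.numel() for p in model.parameters()))  # expected: 1,339,268 (~1.34M)
```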


---


## Model overview


TinyModel1 pairs a WordPiece tokenizer fitted on the training split with a shallow BERT encoder stack. To train on your own taxonomy, replace the dataset and labels via `scripts/train_tinymodel1_classifier.py`.


### Core capabilities


- **Text routing** — assign one class per input for search, feeds, or triage.
- **Low latency** — small parameter count suits edge and serverless setups.
- **Fine-tuning base** — swap labels or data for your domain while keeping the same architecture.


---


## Training


| Setting | Value |
|:--|:--|
| **Train samples (cap)** | 3000 |
| **Eval samples (cap)** | 600 |
| **Epochs** | 2 |
| **Batch size** | 16 |
| **Learning rate** | 0.0001 |
| **Optimizer** | AdamW |
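
For orientation, these settings map onto `transformers.TrainingArguments` roughly as below. This is a hedged sketch, not the project's actual configuration; the authoritative script is `scripts/train_tinymodel1_classifier.py` in the linked repository, and `output_dir` is a placeholder.

```python
from transformers import TrainingArguments

# Approximate mirror of the table above; not the repository's actual config.
args = TrainingArguments(
    output_dir="tinymodel1-checkpoints",  # placeholder path
    num_train_epochs=2,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=1e-4,   # 0.0001
    optim="adamw_torch",  # AdamW
)
```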


---


## Evaluation


| Metric | Value |
|:--|:--|
| **Accuracy** | 0.5150 |
| **Macro F1** | 0.4262 |
| **Weighted F1** | 0.4240 |
| **Final train loss** | 1.1535 |


Per-class F1 and the confusion matrix are saved in `eval_report.json` in this model directory.


Metrics are computed on the held-out eval subset (see `eval_report.json` → `reproducibility`); treat them as a **sanity-check baseline**, not a production SLA.
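
The reported metrics follow the standard scikit-learn definitions. A minimal sketch with placeholder predictions (the real label ids come from the eval subset):

```python
from sklearn.metrics import accuracy_score, f1_score

# y_true / y_pred are placeholder label ids standing in for the eval subset.
y_true = [0, 1, 2, 3, 1, 2]
y_pred = [0, 1, 1, 3, 1, 2]

print(accuracy_score(y_true, y_pred))                # accuracy
print(f1_score(y_true, y_pred, average="macro"))     # macro F1
print(f1_score(y_true, y_pred, average="weighted"))  # weighted F1
```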


---


## Getting started


### Inference with `transformers`

```python
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="TinyModel1",
    tokenizer="TinyModel1",
    top_k=None,
)
text = "Your input text here."
print(clf(text))
```


Use `top_k=None` (or your Transformers version’s equivalent) to return scores for **all** labels. Replace `"TinyModel1"` with your Hub model id when loading from the Hub.
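
If you prefer explicit control over tokenization, the same checkpoint works without the pipeline. A sketch under the same placeholder-id assumption:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("TinyModel1")
model = AutoModelForSequenceClassification.from_pretrained("TinyModel1")
model.eval()

# Truncate to the 128-token limit used during training.
inputs = tokenizer("Your input text here.", truncation=True, max_length=128, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
probs = logits.softmax(dim=-1)[0]
print({model.config.id2label[i]: round(p.item(), 4) for i, p in enumerate(probs)})
```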


---


## Training data


- **Dataset:** `fancyzhx/ag_news` (text column mapped for training; see `artifact.json`).
- **Preprocessing:** tokenizer trained on training texts; sequences truncated to 128 tokens.
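
For reference, the truncation step described above might look like this with the `datasets` library. This is a sketch, not the project's actual preprocessing script; the tokenizer path is a placeholder, and column names follow `fancyzhx/ag_news`:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# ag_news provides "text" and "label" columns; "TinyModel1" is a placeholder path.
ds = load_dataset("fancyzhx/ag_news", split="train")
tokenizer = AutoTokenizer.from_pretrained("TinyModel1")

def preprocess(batch):
    # Truncate to the 128-token limit noted above.
    return tokenizer(batch["text"], truncation=True, max_length=128)

encoded = ds.map(preprocess, batched=True)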


---


## Intended use


- Prototyping **routing**, **tagging**, and **dashboard** features over short text.
- Teaching and benchmarking small text-classification setups.
- Starting point for **domain adaptation** with your own labels.


---


## Limitations


- **Accuracy** is modest by design; validate on your data before any high-stakes use.
- **Not a general-purpose language model** — classification head only; use a generative LM for text generation.
- **Tokenizer and labels** are tied to this training run; mismatched inputs may degrade accuracy.


---


## License


This model is released under the **Apache 2.0** license (see repository `LICENSE` where applicable).