---
license: apache-2.0
library_name: transformers
pipeline_tag: text-classification
datasets:
- fancyzhx/ag_news
language:
- en
tags:
- tiny
- bert
- text-classification
---
<div align="center">
<img src="TinyModel1Image.png" alt="TinyModel1" style="max-width: 100%; width: 100%; height: auto; display: block;" />
</div>
# TinyModel1
**TinyModel1** is a compact **encoder** model for **news topic classification**, trained on the AG News dataset. It targets fast CPU/GPU inference and use as a baseline.
## Links
- **Source code (train & export):** [https://github.com/HyperlinksSpace/TinyModel](https://github.com/HyperlinksSpace/TinyModel)
- **Live demo (Space):** [TinyModel1Space](https://huggingface.co/spaces/HyperlinksSpace/TinyModel1Space) (canonical Hub URL; avoids unreliable `*.hf.space` links)
---
## Model summary
| Field | Value |
|:--|:--|
| **Task** | Text classification (single-label, 4 classes) |
| **Labels** | World, Sports, Business, Sci/Tech |
| **Dataset** | `fancyzhx/ag_news` |
| **Architecture** | Tiny BERT-style encoder (`BertForSequenceClassification`) |
| **Parameters** | 1,339,268 (~1.34M) |
| **Max sequence length** | 128 tokens (training & inference) |
| **Framework** | [Transformers](https://github.com/huggingface/transformers) · Safetensors |
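To double-check the label mapping in the table, you can inspect the exported config (a minimal sketch; the index order shown in the comment is an assumption, not guaranteed by this card):

```python
from transformers import AutoConfig

# Load the exported config from this model directory (or a Hub model id).
config = AutoConfig.from_pretrained("TinyModel1")

print(config.num_labels)  # 4
print(config.id2label)    # e.g. {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}
```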
---
## Model overview
The model pairs a WordPiece tokenizer fit on the training split with a shallow BERT encoder stack. To train on your own taxonomy, swap the dataset and labels in `scripts/train_tinymodel1_classifier.py`.
### Core capabilities
- **Text routing** — assign one class per input for search, feeds, or triage (see the sketch after this list).
- **Low latency** — small parameter count suits edge and serverless setups.
- **Fine-tuning base** — swap labels or data for your domain while keeping the same architecture.
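As a concrete routing example, here is a minimal sketch that dispatches on the predicted label (the queue names are hypothetical placeholders, not part of this repository):

```python
from transformers import pipeline

clf = pipeline("text-classification", model="TinyModel1", tokenizer="TinyModel1")

# Hypothetical downstream queues; replace with your own handlers.
routes = {
    "World": "news_queue",
    "Sports": "sports_queue",
    "Business": "biz_queue",
    "Sci/Tech": "tech_queue",
}

pred = clf("Stocks rallied after the earnings report.")[0]  # {"label": ..., "score": ...}
print(routes.get(pred["label"], "fallback_queue"))
```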
---
## Training
| Setting | Value |
|:--|:--|
| **Train samples (cap)** | 3000 |
| **Eval samples (cap)** | 600 |
| **Epochs** | 2 |
| **Batch size** | 16 |
| **Learning rate** | 0.0001 |
| **Optimizer** | AdamW |
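The actual run lives in `scripts/train_tinymodel1_classifier.py`; purely as an illustration, here is a minimal `Trainer` sketch mirroring the hyperparameters above (the config sizes, split slicing, and output path are assumptions, not the exported values):

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, BertConfig, BertForSequenceClassification,
                          Trainer, TrainingArguments)

# Capped subsets, as in the table above (taking eval from the test split
# is an assumption).
train_ds = load_dataset("fancyzhx/ag_news", split="train[:3000]")
eval_ds = load_dataset("fancyzhx/ag_news", split="test[:600]")

# The card's tokenizer was fit on the training split; reusing the
# exported one here is a simplification.
tok = AutoTokenizer.from_pretrained("TinyModel1")

def encode(batch):
    return tok(batch["text"], truncation=True, max_length=128)

train_ds = train_ds.map(encode, batched=True)
eval_ds = eval_ds.map(encode, batched=True)

# Tiny BERT-style config; these sizes are guesses near the ~1.34M budget.
config = BertConfig(vocab_size=tok.vocab_size, hidden_size=128,
                    num_hidden_layers=2, num_attention_heads=2,
                    intermediate_size=256, num_labels=4)
model = BertForSequenceClassification(config)

args = TrainingArguments(
    output_dir="tinymodel1-out",        # hypothetical output path
    num_train_epochs=2,
    per_device_train_batch_size=16,
    learning_rate=1e-4,
    optim="adamw_torch",                # AdamW
)

Trainer(model=model, args=args, train_dataset=train_ds,
        eval_dataset=eval_ds, tokenizer=tok).train()
```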
---
## Evaluation
| Metric | Value |
|:--|:--|
| **Accuracy** | 0.5150 |
| **Macro F1** | 0.4262 |
| **Weighted F1** | 0.4240 |
| **Final train loss** | 1.1535 |
Per-class F1 and the confusion matrix are saved in `eval_report.json` in this model directory.
Metrics are computed on the held-out eval subset (see the `reproducibility` section of `eval_report.json`); treat them as a **sanity-check baseline**, not a production SLA.
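For reference, the reported aggregates follow directly from predictions via `scikit-learn` (a sketch; the `y_true`/`y_pred` values below are placeholders, not real eval outputs):

```python
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

# Integer class ids for the eval subset; placeholder values for illustration.
y_true = [0, 1, 2, 3, 1, 2]
y_pred = [0, 1, 1, 3, 1, 2]

print("accuracy   ", accuracy_score(y_true, y_pred))
print("macro F1   ", f1_score(y_true, y_pred, average="macro"))
print("weighted F1", f1_score(y_true, y_pred, average="weighted"))
print(confusion_matrix(y_true, y_pred))  # per-class detail, as in eval_report.json
```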
---
## Getting started
### Inference with `transformers`
```python
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="TinyModel1",      # local directory; use your Hub model id when remote
    tokenizer="TinyModel1",
    top_k=None,              # return scores for all four labels
)

text = "Your input text here."
print(clf(text))
```
Use `top_k=None` (or the equivalent for your Transformers version) to get scores for **all** labels. Replace `"TinyModel1"` with your Hub model id when loading from the Hub.
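If you prefer raw model calls over the pipeline, the equivalent manual path looks like this (a sketch under the same local-path assumption):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("TinyModel1")
model = AutoModelForSequenceClassification.from_pretrained("TinyModel1")

# Truncate to the 128-token training length.
inputs = tok("Your input text here.", truncation=True, max_length=128,
             return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

probs = logits.softmax(dim=-1).squeeze(0)
pred_id = int(probs.argmax())
print(model.config.id2label[pred_id], float(probs[pred_id]))
```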
---
## Training data
- **Dataset:** `fancyzhx/ag_news` (text column mapped for training; see `artifact.json`).
- **Preprocessing:** tokenizer trained on training texts; sequences truncated to 128 tokens.
---
## Intended use
- Prototyping **routing**, **tagging**, and **dashboard** features over short text.
- Teaching and benchmarking small-model classification setups.
- Starting point for **domain adaptation** with your own labels.
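For the domain-adaptation path, one way to start is to reuse the encoder weights with a fresh head sized for your labels (a sketch; the three example labels are hypothetical, and `ignore_mismatched_sizes=True` drops the old 4-way head):

```python
from transformers import AutoModelForSequenceClassification

# Keep TinyModel1's encoder, attach a freshly initialized 3-way head.
model = AutoModelForSequenceClassification.from_pretrained(
    "TinyModel1",
    num_labels=3,
    id2label={0: "billing", 1: "bug", 2: "feature"},   # placeholder taxonomy
    label2id={"billing": 0, "bug": 1, "feature": 2},
    ignore_mismatched_sizes=True,  # reinitialize the classification head
)
# ...then fine-tune on your labeled data, e.g. with Trainer as sketched above.
```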
---
## Limitations
- **Accuracy** is modest by design; validate on your data before high-stakes use.
- **Not a general-purpose language model** — classification head only; for generation use an LM.
- **Tokenizer and labels** are tied to this training run; mismatched inputs may degrade.
---
## License
This model is released under the **Apache 2.0** license (see repository `LICENSE` where applicable).