polodealvarado committed
Commit 1ff2a71 · verified · Parent(s): 25ccff4

Upload folder using huggingface_hub

Browse files
Files changed (6)
  1. README.md +65 -0
  2. config.json +6 -0
  3. model.safetensors +3 -0
  4. tokenizer.json +0 -0
  5. tokenizer_config.json +14 -0
  6. training_meta.json +13 -0
README.md ADDED
@@ -0,0 +1,65 @@
+ ---
+ language:
+ - en
+ license: mit
+ library_name: transformers
+ pipeline_tag: zero-shot-classification
+ tags:
+ - zero-shot
+ - multi-label
+ - text-classification
+ - pytorch
+ metrics:
+ - precision
+ - recall
+ - f1
+ base_model: bert-base-uncased
+ datasets:
+ - polodealvarado/zeroshot-classification
+ ---
+
+ # Zero-Shot Text Classification — spanclass
+
+ GLiNER-inspired span-attentive classification with top-K span selection.
+
+ This model encodes texts and candidate labels into a shared embedding space using BERT,
+ enabling classification into arbitrary categories without retraining for new labels.
+
+ ## Training Details
+
+ | Parameter | Value |
+ |-----------|-------|
+ | Base model | `bert-base-uncased` |
+ | Model variant | `spanclass` |
+ | Training steps | 1000 |
+ | Batch size | 2 |
+ | Learning rate | 2e-05 |
+ | Trainable params | 111,254,017 |
+ | Training time | 374.1s |
+
+ ## Dataset
+
+ Trained on [polodealvarado/zeroshot-classification](https://huggingface.co/datasets/polodealvarado/zeroshot-classification).
+
+ ## Evaluation Results
+
+ | Metric | Score |
+ |--------|-------|
+ | Precision | 0.9277 |
+ | Recall | 0.9503 |
+ | F1 Score | 0.9388 |
+
+ ## Usage
+
+ ```python
+ from models.spanclass import SpanClassModel
+
+ model = SpanClassModel.from_pretrained("polodealvarado/spanclass")
+
+ predictions = model.predict(
+     texts=["The stock market crashed yesterday."],
+     labels=[["Finance", "Sports", "Biology", "Economy"]],
+ )
+ print(predictions)
+ # [{"text": "...", "scores": {"Finance": 0.98, "Economy": 0.85, ...}}]
+ ```
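The "span-attentive classification with top-K span selection" the README describes can be illustrated with a minimal NumPy sketch: embed text tokens and labels in one space, enumerate spans up to a maximum width, keep the K best-scoring spans, and max-pool their label scores. The function name, shapes, and mean-pooled span embeddings here are illustrative assumptions, not the repo's actual API.

```python
import numpy as np

def score_labels(token_emb, label_emb, max_span_width=5, top_k_spans=8):
    """Hypothetical sketch; token_emb: (seq_len, dim), label_emb: (num_labels, dim)."""
    seq_len, _ = token_emb.shape
    # Enumerate candidate spans [i, j) of width <= max_span_width,
    # embedding each span as the mean of its token embeddings.
    spans = []
    for i in range(seq_len):
        for j in range(i + 1, min(i + max_span_width, seq_len) + 1):
            spans.append(token_emb[i:j].mean(axis=0))
    spans = np.stack(spans)               # (num_spans, dim)
    scores = spans @ label_emb.T          # (num_spans, num_labels)
    # Keep the top-K spans ranked by their best label score.
    order = np.argsort(scores.max(axis=1))[::-1][:top_k_spans]
    # Multi-label output: max over selected spans, squashed with a sigmoid.
    return 1.0 / (1.0 + np.exp(-scores[order].max(axis=0)))

rng = np.random.default_rng(0)
probs = score_labels(rng.standard_normal((12, 16)), rng.standard_normal((4, 16)))
print(probs.shape)  # one independent probability per candidate label: (4,)
```

Because each label gets an independent sigmoid score rather than a softmax share, the scheme supports the multi-label tag the model card advertises.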
config.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "max_num_labels": 13,
+   "max_span_width": 5,
+   "model_name": "bert-base-uncased",
+   "top_k_spans": 8
+ }
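The two span parameters in config.json bound the model's search: assuming spans are enumerated over contiguous token windows (as in GLiNER-style models), a sequence of n tokens yields n·w − w(w−1)/2 candidate spans of width at most w, of which `top_k_spans` are kept. The helper below is a hypothetical illustration of that count, not repo code.

```python
def num_candidate_spans(n, max_span_width=5):
    # Widths 1..w contribute n, n-1, ..., n-w+1 spans each (for n >= w),
    # summing to n*w - w*(w-1)/2.
    w = min(max_span_width, n)
    return n * w - w * (w - 1) // 2

print(num_candidate_spans(12))  # 12*5 - 10 = 50 candidates; top_k_spans=8 keeps 8
```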
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6f86de12cb5dbe2875e166b813232be74cf257168009d6c59aaa0eecf13f8650
+ size 445041916
tokenizer.json ADDED
The diff for this file is too large to render.
tokenizer_config.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "backend": "tokenizers",
+   "cls_token": "[CLS]",
+   "do_lower_case": true,
+   "is_local": false,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
training_meta.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "model_type": "spanclass",
+   "encoder_name": "bert-base-uncased",
+   "param_count": 111254017,
+   "num_steps": 1000,
+   "best_step": 875,
+   "batch_size": 2,
+   "learning_rate": 2e-05,
+   "train_time_s": 374.11,
+   "precision": 0.9277,
+   "recall": 0.9503,
+   "f1": 0.9388
+ }