Hebrew Grocery Item Classifier v2

A fine-tuned version of multilingual-MiniLMv2-L6-mnli-xnli that classifies Hebrew grocery shopping-list items into 12 supermarket categories. Compared with v1, it is trained on a larger, more diverse dataset with targeted augmentation for weak classes. The fine-tuning code is available at https://github.com/davidit17/shopping_list_bot

What's new in v2

  • Larger dataset: 3043 total items (2434 train / 609 test), up from ~600 in v1
  • 4 training sources: manual lists, Excel grocery data, Gemini-labelled web data, and a ChatGPT-labelled set
  • Synthetic augmentation: a rule-based Hebrew item generator to balance under-represented categories
  • Weak-class booster: extra hand-curated examples for משקאות (beverages), קפואים וסלטים (frozen foods and salads), and רטבים וממרחים (sauces and spreads)
  • Per-class F1 tracking: evaluation now reports per-category F1 to surface weak spots

Categories

  • מוצרי חלב וביצים (dairy products and eggs)
  • ירקות ופירות (vegetables and fruits)
  • לחם מאפים ודגנים (bread, pastries and cereals)
  • חטיפים ומתוקים (snacks and sweets)
  • ניקיון טיפוח וחד פעמי (cleaning, personal care and disposables)
  • משקאות (beverages)
  • מוצרים לאפייה (baking products)
  • בשר ודגים (meat and fish)
  • קפואים וסלטים (frozen foods and salads)
  • יבשים (dry goods)
  • רטבים וממרחים (sauces and spreads)
  • אחר (other)

Usage

from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline(
    "text-classification",
    model="davidit17/minilm-grocery-hebrew-v2",
)

# Yellow cheese, cherry tomatoes, shampoo, Coca-Cola, chicken breast.
items = ["גבינה צהובה", "עגבניות שרי", "שמפו", "קוקה קולה", "חזה עוף"]
for item in items:
    result = classifier(item)[0]  # top prediction as {"label": ..., "score": ...}
    print(item, "→", result["label"], f"({result['score']:.2f})")

With confidence threshold (recommended)

CONFIDENCE_THRESHOLD = 0.5

def predict(text: str) -> str:
    # Return the top label, or the catch-all "אחר" (other) when the score is below the threshold.
    result = classifier(text)[0]
    return result["label"] if result["score"] >= CONFIDENCE_THRESHOLD else "אחר"
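
The same thresholding rule can be applied to a whole shopping list at once. The sketch below groups items by predicted category; `group_by_category` and `FALLBACK_LABEL` are hypothetical names introduced here for illustration, and the classifier is passed in as an argument so the grouping logic can be tried without downloading the model.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

FALLBACK_LABEL = "אחר"  # catch-all category ("other"), as used in the threshold rule


def group_by_category(
    items: Iterable[str],
    classify: Callable[[str], list],
    threshold: float = 0.5,
) -> Dict[str, List[str]]:
    """Group items by predicted label, sending low-confidence items to the fallback."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for item in items:
        # Assumed result shape: a list whose first entry is {"label": ..., "score": ...}.
        result = classify(item)[0]
        label = result["label"] if result["score"] >= threshold else FALLBACK_LABEL
        grouped[label].append(item)
    return dict(grouped)
```

With the pipeline object from the usage example, this would be called as `group_by_category(items, classifier)`.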

Training

| Parameter | Value |
|---|---|
| Base model | multilingual-MiniLMv2-L6-mnli-xnli |
| Max epochs | 50 (early stopping, patience=8) |
| Learning rate | 2e-5 |
| Weight decay | 0.01 |
| Warmup ratio | 0.1 |
| Batch size | 16 |
| Max sequence length | 32 |
| Train / test split | 80% / 20% stratified |
| Hardware | GPU (CUDA) |
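
To make the schedule concrete, the arithmetic below derives an upper bound on optimizer steps and the corresponding warmup length from the values in the table. It is a rough sketch that assumes a single device, no gradient accumulation, a kept final partial batch, and early stopping never triggering. The variable names are illustrative, and the actual run may have stopped well before the maximum.

```python
import math

TRAIN_ITEMS = 2434   # training-set size reported for v2
BATCH_SIZE = 16
MAX_EPOCHS = 50
WARMUP_RATIO = 0.1

# Assumption: one optimizer step per batch, last partial batch included.
steps_per_epoch = math.ceil(TRAIN_ITEMS / BATCH_SIZE)

# Upper bound only: early stopping (patience=8) can end training sooner.
max_steps = steps_per_epoch * MAX_EPOCHS

# Warmup covers the first 10% of the scheduled steps.
warmup_steps = round(max_steps * WARMUP_RATIO)

print(steps_per_epoch, max_steps, warmup_steps)  # 153 7650 765
```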

Data sources

| Source | Description |
|---|---|
| Source A | Manual Hebrew grocery lists from an Israeli forum |
| Source B | Excel-format grocery list, cleaned and melted |
| Source C | Web grocery list labelled with Gemini |
| Source D | ChatGPT-labelled items across all 12 categories |
| Source E | Synthetic items generated with a rule-based Hebrew augmentor |
| Source F | Hand-curated booster set for weak categories |

Evaluation

Overall

| Metric | Base (zero-shot) | Fine-tuned v2 |
|---|---|---|
| Accuracy | 0.1494 | 0.8818 |
| Weighted F1 | 0.1363 | 0.8808 |
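
For context, the accuracies can be compared against uniform guessing and converted back into item counts. This sketch assumes both figures were measured on the 609-item test split and uses 1/12 as the chance level, which only holds exactly for balanced classes. The variable names are illustrative.

```python
TEST_ITEMS = 609      # test-set size reported for v2
NUM_CLASSES = 12

zero_shot_acc = 0.1494
fine_tuned_acc = 0.8818

# Chance level under uniform guessing (assumes roughly balanced classes).
chance = 1 / NUM_CLASSES

# Approximate number of correctly classified test items for each model.
zero_shot_correct = round(zero_shot_acc * TEST_ITEMS)
fine_tuned_correct = round(fine_tuned_acc * TEST_ITEMS)

print(f"chance={chance:.4f}")                  # chance=0.0833
print(zero_shot_correct, fine_tuned_correct)   # 91 537
```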

Per-class F1 (fine-tuned v2)

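| Category | F1 |
|---|---|
| מוצרי חלב וביצים (dairy products and eggs) | 0.9057 |
| ירקות ופירות (vegetables and fruits) | 0.8696 |
| לחם מאפים ודגנים (bread, pastries and cereals) | 0.8767 |
| חטיפים ומתוקים (snacks and sweets) | 0.9091 |
| ניקיון טיפוח וחד פעמי (cleaning, personal care and disposables) | 0.9020 |
| משקאות (beverages) | 0.8750 |
| מוצרים לאפייה (baking products) | 0.8687 |
| בשר ודגים (meat and fish) | 0.9333 |
| קפואים וסלטים (frozen foods and salads) | 0.8632 |
| יבשים (dry goods) | 0.8000 |
| רטבים וממרחים (sauces and spreads) | 0.8913 |
| אחר (other) | 0.9024 |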
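
As a reference for how such numbers are typically derived, the sketch below computes per-class F1 and support-weighted F1 from true and predicted labels using only the standard library. It follows the usual definitions (F1 = 2·TP / (2·TP + FP + FN), weighted by the number of true examples per class). Whether the reported scores were produced this way or with a library such as scikit-learn is not stated, and the function names and toy labels here are illustrative.

```python
from collections import Counter
from typing import Dict, Sequence


def per_class_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Return F1 for every label that appears in the truth or the predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def weighted_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Average per-class F1, weighting each class by its number of true examples."""
    support = Counter(y_true)
    scores = per_class_f1(y_true, y_pred)
    return sum(scores[label] * n for label, n in support.items()) / len(y_true)


# Toy example with three of the category labels (not real evaluation data).
y_true = ["משקאות", "משקאות", "יבשים", "יבשים", "יבשים", "אחר"]
y_pred = ["משקאות", "יבשים", "יבשים", "יבשים", "אחר", "אחר"]
scores = per_class_f1(y_true, y_pred)
overall = weighted_f1(y_true, y_pred)
```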

Limitations

  • Primarily covers Israeli supermarket conventions and Hebrew product naming.
  • Brand-specific or very niche items may fall back to אחר (other).
  • Low-confidence predictions (score < 0.5) should be treated as אחר.
  • Synthetic training examples may not fully reflect natural shopping list variation.
  • The אחר category is a catch-all and may have lower precision than domain-specific categories.