product-classifier-model-B2 / README.md

MartinB77

Update README.md

d217128 verified 9 months ago


metadata

language: en
license: apache-2.0
tags:
  - product-classification
  - transformers
  - pytorch
  - distilbert
datasets:
  - lokeshparab/amazon-products-dataset
model-index:
  - name: Product Classifier B2
    results: []

Product Classifier B2

This model predicts product categories from a product's name or description...

🏍️ Amazon Product Classifier (Balanced B2)

This is a fine-tuned DistilBERT model for multi-class classification of product titles into Amazon-like product categories.
The model is based on distilbert-base-uncased and was trained on a balanced subset of the Amazon Products dataset.

🧠 Model Architecture

Base: distilbert-base-uncased (6 layers, hidden size 768)
Classification Head: 2 dense layers with dropout + ReLU
Output: softmax over 19 product categories
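
To make the output layer concrete, here is a minimal standard-library sketch of how 19 raw scores (logits) become a probability distribution and a predicted class index. It is an illustration only, not the model's actual code: the logit values and the helper names `softmax` and `predict` are hypothetical, and the mapping from index to category name lives in the model's config, which is not reproduced here.

```python
import math

NUM_CATEGORIES = 19  # from the model card: softmax over 19 product categories


def softmax(logits):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    # Subtracting the max logit improves numerical stability
    # without changing the resulting probabilities.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return (class_index, probability) for the most likely category."""
    if len(logits) != NUM_CATEGORIES:
        raise ValueError(f"expected {NUM_CATEGORIES} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


# Hypothetical logits: every category scores 0.0 except index 7.
example_logits = [0.0] * NUM_CATEGORIES
example_logits[7] = 4.0
index, prob = predict(example_logits)
print(index, round(prob, 3))
```

The `score` field returned by the `text-classification` pipeline is this kind of softmax probability for the winning label.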

📊 Training Data

The model was trained on a balanced subset (≈40k samples) of the Amazon Products Dataset, which contains product titles and their corresponding categories.

Preprocessing included:

Removing empty/missing titles
Keeping top-level categories only
Balancing the dataset to avoid category bias
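
The three preprocessing steps can be sketched roughly as follows. This is not the author's pipeline: the field names `title` and `category`, the `" > "` hierarchy separator, and balancing by downsampling to the smallest class are all assumptions made for illustration, and `preprocess` is a hypothetical helper.

```python
import random
from collections import defaultdict


def preprocess(rows, seed=0):
    """Drop empty titles, keep top-level categories, and balance classes."""
    by_category = defaultdict(list)
    for row in rows:
        title = (row.get("title") or "").strip()
        category = (row.get("category") or "").strip()
        if not title or not category:
            continue  # remove empty/missing titles (and unlabeled rows)
        # Keep only the top-level category; the " > " separator is an assumption.
        top_level = category.split(" > ")[0].strip()
        by_category[top_level].append(title)

    if not by_category:
        return []

    # Balance by downsampling every category to the size of the smallest one
    # (one possible strategy; the model card does not say which was used).
    target = min(len(titles) for titles in by_category.values())
    rng = random.Random(seed)
    balanced = []
    for category in sorted(by_category):
        for title in rng.sample(by_category[category], target):
            balanced.append({"title": title, "category": category})
    rng.shuffle(balanced)
    return balanced


# Hypothetical rows, not taken from the real dataset.
rows = [
    {"title": "USB-C cable", "category": "electronics > cables"},
    {"title": "Wireless mouse", "category": "electronics > accessories"},
    {"title": "Bluetooth speaker", "category": "electronics"},
    {"title": "Running shoes", "category": "sports > footwear"},
    {"title": "", "category": "sports"},
    {"title": None, "category": "home"},
]
balanced = preprocess(rows)
print(balanced)
```

Downsampling discards data from large categories; oversampling or class weights are alternatives when the smallest category is very small.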

🧪 Example Usage (Python)

from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub
classifier = pipeline("text-classification", model="MartinB77/product-classifier-model-B2")

result = classifier("Smartwatch with heart rate monitor and GPS tracking")
print(result)
# Example output: [{'label': 'stores', 'score': 0.94}]

🚀 Intended Use

The model is designed to help developers quickly classify product titles into e-commerce categories, useful for:

Auto-tagging items in online stores
Cleaning and organizing product catalogs
Building recommendation engines (in combination with embeddings)

📌 Limitations

English-only (fine-tuned from distilbert-base-uncased, an English model that lowercases its input)
May not perform well on very short or ambiguous product names
Not suitable for legal/medical/financial applications
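
One way to soften the short/ambiguous-title limitation is to send uncertain predictions to manual review instead of auto-tagging them. The sketch below assumes one prediction dict per title in the `{'label', 'score'}` shape returned by the pipeline. The thresholds `min_score` and `min_words`, the helper `route_predictions`, and every label other than `stores` are hypothetical and should be tuned or replaced for real data.

```python
def route_predictions(titles, predictions, min_score=0.6, min_words=2):
    """Split predictions into auto-accepted tags and items for manual review."""
    accepted, needs_review = [], []
    for title, pred in zip(titles, predictions):
        too_short = len(title.split()) < min_words      # e.g. one-word titles
        low_confidence = pred["score"] < min_score      # model is unsure
        item = {"title": title, "label": pred["label"], "score": pred["score"]}
        if too_short or low_confidence:
            needs_review.append(item)
        else:
            accepted.append(item)
    return accepted, needs_review


titles = [
    "Smartwatch with heart rate monitor and GPS tracking",
    "Case",
    "Blue thing for kitchen",
]
# Hypothetical predictions standing in for classifier(titles).
predictions = [
    {"label": "stores", "score": 0.94},
    {"label": "accessories", "score": 0.91},
    {"label": "home & kitchen", "score": 0.41},
]
accepted, needs_review = route_predictions(titles, predictions)
print(len(accepted), "accepted;", len(needs_review), "for review")
```

A high softmax score does not guarantee a correct label, so the word-count check acts as a second, independent guard for very short titles.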

📄 License & Source

Model: MIT License
Training Data: Amazon Products Dataset on Kaggle (check the license and attribution requirements on the Kaggle page)
