Floressek/sentiment_classification_from_distillbert

Text Classification

sentiment-analysis

text-embeddings-inference

Floressek committed on Nov 14, 2025

Commit 3fbb5e3 · verified · 1 Parent(s): 4c819e2

Update README.md

Files changed (1)

README.md +66 -3

@@ -1,12 +1,75 @@
 ---
-license: apache-2.0
 language:
 - en
 metrics:
 - accuracy
 - f1
 - recall
 - precision
-base_model:
-- distilbert/distilbert-base-uncased
 ---

 ---
 language:
 - en
+license: apache-2.0
+tags:
+- text-classification
+- sentiment-analysis
+- distilbert
+- transformers
+pipeline_tag: text-classification
+library_name: transformers
+datasets:
+- Amazon_Unlocked_Mobile
+base_model: distilbert-base-uncased
 metrics:
 - accuracy
 - f1
 - recall
 - precision
+widget:
+- text: "Great handset! Works flawlessly."
+- text: "Terrible product, waste of money."
+---
+# DistilBERT for Binary Sentiment Classification
+Lightweight sentiment classifier fine-tuned from `distilbert-base-uncased` to predict sentiment (negative vs. positive) for short English product reviews. Trained on a filtered subset of the Amazon Unlocked Mobile dataset.
+## Model Details
+- Base model: `distilbert-base-uncased`
+- Task: binary sentiment classification
+- Labels: `0 -> negative` (rating 1), `1 -> positive` (rating 5)
+- Max input length: 128 tokens
+- Tokenizer: `AutoTokenizer` for the same checkpoint
+- Mixed precision: fp16 (when CUDA is available)
+## Intended Use and Limitations
+- Use for short English texts in the style of product reviews.
+- Binary only (negative/positive). Not suited for nuanced or multi-class sentiment.
+- Not for safety-critical decisions or content moderation on its own.
+## Dataset and Preprocessing
+- Source: Amazon Unlocked Mobile (`Amazon_Unlocked_Mobile.csv`)
+- Filtering: keep rows where `Rating ∈ {1, 5}`; drop unrelated columns
+- Tokenization: padding to max length, truncation at 128
+- Split: train/test with `test_size = 0.3`, `seed = 100`
+## Training Configuration
+- Optimizer and schedule: handled by `transformers.Trainer`
+- Learning rate: `2e-5`
+- Batch size: `48` (train/eval per device)
+- Epochs: `2`
+- Weight decay: `0.01`
+- Save/eval strategy: `epoch`
+- Push to Hub: enabled
+## Evaluation
+Computed with `accuracy` and `f1` on the held-out test split. See the repository "Files and versions" / "Training metrics" tabs for run artifacts and exact scores.
+## How to Use
+Python (Transformers pipeline):
+```python
+from transformers import pipeline
+clf = pipeline(
+    "text-classification",
+    model="Floressek/sentiment_classification_from_distillbert",
+    top_k=None  # returns scores for all labels; use top_k=1 for only the top label
+)
+print(clf("Great handset!"))
+print(clf("Shame. I wish I hadn't bought it."))
 ---
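
The `top_k=None` call in the README snippet above yields one score per label for each input. The sketch below shows one way to reduce that to a single sentiment. It assumes the usual list-of-dicts output shape with `label` and `score` keys, and it assumes the default `LABEL_0`/`LABEL_1` names, which the model config may override. `top_sentiment`, `LABEL_NAMES`, and the example scores are hypothetical and do not come from the commit.

```python
from typing import Dict, List

# Assumption: the checkpoint uses the default Transformers label names.
# If the model config defines its own id2label, adjust this mapping.
LABEL_NAMES: Dict[str, str] = {"LABEL_0": "negative", "LABEL_1": "positive"}


def top_sentiment(scores: List[dict]) -> dict:
    """Pick the highest-scoring entry from one text's list of label scores."""
    best = max(scores, key=lambda item: item["score"])
    # Unknown label names are passed through unchanged.
    name = LABEL_NAMES.get(best["label"], best["label"])
    return {"sentiment": name, "score": best["score"]}


# Hand-written stand-in for pipeline output; not a real model result.
example_scores = [
    {"label": "LABEL_0", "score": 0.02},
    {"label": "LABEL_1", "score": 0.98},
]
print(top_sentiment(example_scores))
```

With the real pipeline, the same helper could be applied to the list returned for each input string. The mapping follows the README's `0 -> negative`, `1 -> positive` convention.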