---
library_name: transformers
license: apache-2.0
base_model: distilbert-base-uncased
tags:
- text-classification
- transformers
- distilbert
- generated_from_trainer
- cmu-course
datasets:
- ecopus/pgh_restaurants
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: Cuisine Classification (Fine-Tuned DistilBERT)
  results:
  - task:
      type: text-classification
      name: Multi-class Text Classification
    dataset:
      name: ecopus/pgh_restaurants
      type: classification
      split: augmented
    metrics:
    - type: accuracy
      value: 0.969
    - type: f1
      value: 0.957
    - type: precision
      value: 0.948
    - type: recall
      value: 0.969
  - task:
      type: text-classification
      name: Multi-class Text Classification
    dataset:
      name: ecopus/pgh_restaurants
      type: classification
      split: original
    metrics:
    - type: accuracy
      value: 0.94
    - type: f1
      value: 0.92
---

# Model Card for Cuisine Classification (Fine-Tuned DistilBERT)

This model predicts the **cuisine type** of Pittsburgh restaurants based on review text.
It was fine-tuned from [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the dataset [ecopus/pgh_restaurants](https://huggingface.co/datasets/ecopus/pgh_restaurants).

It achieves the following results:
- **Evaluation (Augmented split):** Accuracy 0.969, F1 0.957, Precision 0.948, Recall 0.969
- **External Validation (Original split):** Accuracy 0.94, F1 0.92

---

## Model Details

- **Developed by:** Xinxuan Tang (CMU)
- **Dataset curated by:** Emily Copus (CMU)
- **Base model:** DistilBERT (`distilbert-base-uncased`)
- **Library:** Transformers
- **Language(s):** English
- **License:** apache-2.0 (dataset + model card)

---

## Uses

### Direct Use
- Educational practice in **text classification**.
- Experimenting with **fine-tuning compact transformers**.

### Downstream Use
- Could be adapted for **restaurant recommendation demos**.
- Teaching **NLP pipelines** for classification tasks.

### Out-of-Scope Use
- Not suitable for **production deployment**.
- Not intended for **sentiment analysis** or tasks outside cuisine prediction.

---

## Training Procedure

### Training Hyperparameters
- **learning_rate:** 2e-05
- **train_batch_size:** 8
- **eval_batch_size:** 8
- **seed:** 42
- **optimizer:** AdamW (betas=(0.9,0.999), eps=1e-08)
- **lr_scheduler_type:** linear
- **num_epochs:** 5
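
To make the `linear` scheduler setting concrete, here is a minimal sketch of how the learning rate would decay over the 400 optimizer steps shown in the training results (5 epochs of 80 steps each). It assumes zero warmup steps, which this card does not state, and `linear_lr` is a hypothetical helper written for illustration, not a Transformers function.

```python
def linear_lr(step, base_lr=2e-5, total_steps=400, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Print the learning rate at each epoch boundary (80 steps per epoch).
for epoch in range(6):
    step = epoch * 80
    print(f"end of epoch {epoch}: step {step:3d}, lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at 2e-05 and reaches zero at step 400, so the final epoch trains with the smallest updates.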

### Training Results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
| 2.6677        | 1.0   | 80   | 2.4746          | 0.3563   | 0.2142 | 0.1662    | 0.3563 |
| 1.7201        | 2.0   | 160  | 1.5893          | 0.7750   | 0.6895 | 0.6644    | 0.7750 |
| 1.1994        | 3.0   | 240  | 1.1417          | 0.8938   | 0.8503 | 0.8180    | 0.8938 |
| 1.0890        | 4.0   | 320  | 0.9315          | 0.9250   | 0.8959 | 0.8784    | 0.9250 |
| 0.7052        | 5.0   | 400  | 0.8675          | 0.9688   | 0.9570 | 0.9480    | 0.9688 |

---

## Evaluation

### Testing Data
- **Augmented split:** 1000 reviews (synthetic augmentation)
- **Original split:** 100 reviews (external validation)

### Metrics
- Accuracy, weighted F1, Precision, Recall
- Confusion matrix used for external validation
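
The metrics listed above can be sketched in plain Python as follows. This is an illustrative reimplementation, not the evaluation code used for this model. It assumes precision and recall are support-weighted like F1, which the identical Accuracy and Recall values in the results suggest, since support-weighted recall always equals accuracy. The function name `weighted_metrics` and the cuisine labels in the example are hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)      # true examples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label in sorted(set(y_true) | set(y_pred)):
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / support[label] if support[label] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support[label] / n  # class share of the true labels
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    return {
        "accuracy": sum(correct.values()) / n,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels, for illustration only.
y_true = ["mexican", "mexican", "italian", "thai"]
y_pred = ["mexican", "italian", "italian", "thai"]
m = weighted_metrics(y_true, y_pred)
print(m)
```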

---

## Framework Versions
- **Transformers:** 4.56.1
- **PyTorch:** 2.8.0+cu126
- **Datasets:** 4.0.0
- **Tokenizers:** 0.22.0

---

## Bias, Risks, and Limitations

- **Small dataset**: only 100 original reviews.
- **Synthetic augmentation**: may introduce artifacts.
- **Geographic bias**: limited to Pittsburgh restaurants.

### Recommendations
Treat results as **proof-of-concept**, not production-ready.

---

## How to Get Started with the Model

You can load this fine-tuned DistilBERT model directly from the Hugging Face Hub and run inference:

```python
from transformers import pipeline

# Load the fine-tuned model (replace the placeholder with the actual Hub repository ID)
classifier = pipeline("text-classification", model="YOUR_USERNAME/finetuned_model")

# Example input
sample = "This cozy little spot serves delicious tacos with great service!"

# Prints a list with one dict holding the predicted label and its score
print(classifier(sample))
```