---
library_name: transformers
license: apache-2.0
base_model: distilbert-base-uncased
tags:
- text-classification
- transformers
- distilbert
- generated_from_trainer
- cmu-course
datasets:
- ecopus/pgh_restaurants
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: Cuisine Classification (Fine-Tuned DistilBERT)
  results:
  - task:
      type: text-classification
      name: Multi-class Text Classification
    dataset:
      name: ecopus/pgh_restaurants
      type: classification
      split: augmented
    metrics:
    - type: accuracy
      value: 0.969
    - type: f1
      value: 0.957
    - type: precision
      value: 0.948
    - type: recall
      value: 0.969
  - task:
      type: text-classification
      name: Multi-class Text Classification
    dataset:
      name: ecopus/pgh_restaurants
      type: classification
      split: original
    metrics:
    - type: accuracy
      value: 0.94
    - type: f1
      value: 0.92
---
# Model Card for Cuisine Classification (Fine-Tuned DistilBERT)
This model predicts the **cuisine type** of Pittsburgh restaurants based on review text.
It was fine-tuned from [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the dataset [ecopus/pgh_restaurants](https://huggingface.co/datasets/ecopus/pgh_restaurants).
It achieves the following results:
- **Evaluation (Augmented split):** Accuracy 0.969, F1 0.957, Precision 0.948, Recall 0.969
- **External Validation (Original split):** Accuracy 0.94, F1 0.92
---
## Model Details
- **Developed by:** Xinxuan Tang (CMU)
- **Dataset curated by:** Emily Copus (CMU)
- **Base model:** DistilBERT (`distilbert-base-uncased`)
- **Library:** Transformers
- **Language(s):** English
- **License:** apache-2.0 (dataset + model card)
---
## Uses
### Direct Use
- Educational practice in **text classification**.
- Experimenting with **fine-tuning compact transformers**.
### Downstream Use
- Could be adapted for **restaurant recommendation demos**.
- Teaching **NLP pipelines** for classification tasks.
### Out-of-Scope Use
- Not suitable for **production deployment**.
- Not intended for **sentiment analysis** or tasks outside cuisine prediction.
---
## Training Procedure
### Training Hyperparameters
- **learning_rate:** 2e-05
- **train_batch_size:** 8
- **eval_batch_size:** 8
- **seed:** 42
- **optimizer:** AdamW (betas=(0.9,0.999), eps=1e-08)
- **lr_scheduler_type:** linear
- **num_epochs:** 5
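
As a rough illustration of what the `linear` scheduler setting means for this run, the sketch below computes the learning rate at each epoch boundary. It assumes zero warmup steps (the card does not state a warmup value) and takes the 80 steps per epoch from the Training Results table, giving 400 optimizer steps over 5 epochs. The helper `linear_lr` is a hypothetical name written for this example, not a Transformers function.

```python
def linear_lr(step, base_lr=2e-5, total_steps=400, warmup_steps=0):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the model card does not report a warmup value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


steps_per_epoch = 80  # taken from the Step column of the training results table
num_epochs = 5
total_steps = steps_per_epoch * num_epochs  # 400

# Print the learning rate at the start of training and at each epoch boundary.
for epoch in range(num_epochs + 1):
    step = epoch * steps_per_epoch
    lr = linear_lr(step, total_steps=total_steps)
    print(f"step {step:3d}: lr = {lr:.2e}")
```

Under these assumptions the rate starts at 2e-05 and falls by the same amount each epoch until it reaches zero at step 400.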
### Training Results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
| 2.6677 | 1.0 | 80 | 2.4746 | 0.3563 | 0.2142 | 0.1662 | 0.3563 |
| 1.7201 | 2.0 | 160 | 1.5893 | 0.7750 | 0.6895 | 0.6644 | 0.7750 |
| 1.1994 | 3.0 | 240 | 1.1417 | 0.8938 | 0.8503 | 0.8180 | 0.8938 |
| 1.0890 | 4.0 | 320 | 0.9315 | 0.9250 | 0.8959 | 0.8784 | 0.9250 |
| 0.7052 | 5.0 | 400 | 0.8675 | 0.9688 | 0.9570 | 0.9480 | 0.9688 |
---
## Evaluation
### Testing Data
- **Augmented split:** 1000 reviews (synthetic augmentation)
- **Original split:** 100 reviews (external validation)
### Metrics
- Accuracy, weighted F1, Precision, Recall
- Confusion matrix used for external validation
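
The metrics listed above can be sketched in plain Python as follows. This is a minimal illustration of the usual support-weighted averaging (the same definition as `average="weighted"` in scikit-learn), assuming the card's precision and recall use that averaging too; it is not the evaluation code used for this model. The function name `weighted_metrics` and the cuisine labels in the example are hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1 for multi-class labels."""
    n = len(y_true)
    support = Counter(y_true)          # number of true examples per class
    predicted = Counter(y_pred)        # number of predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, actual in support.items():
        tp = true_pos[label]
        p_c = tp / predicted[label] if predicted[label] else 0.0
        r_c = tp / actual
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = actual / n            # each class is weighted by its share of true labels
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c

    accuracy = sum(true_pos.values()) / n
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels, for illustration only.
y_true = ["thai", "thai", "mexican", "italian", "italian", "italian"]
y_pred = ["thai", "mexican", "mexican", "italian", "italian", "thai"]
print(weighted_metrics(y_true, y_pred))
```

With support weighting, recall always equals accuracy, which is consistent with the identical Accuracy and Recall columns in the training results table.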
---
## Framework Versions
- **Transformers:** 4.56.1
- **PyTorch:** 2.8.0+cu126
- **Datasets:** 4.0.0
- **Tokenizers:** 0.22.0
---
## Bias, Risks, and Limitations
- **Small dataset**: only 100 original reviews.
- **Synthetic augmentation**: may introduce artifacts.
- **Geographic bias**: limited to Pittsburgh restaurants.
### Recommendations
Treat results as **proof-of-concept**, not production-ready.
---
## How to Get Started with the Model
You can load this fine-tuned DistilBERT model directly from the Hugging Face Hub and run inference:
```python
from transformers import pipeline

# Load the fine-tuned model
classifier = pipeline("text-classification", model="YOUR_USERNAME/finetuned_model")

# Example input
sample = "This cozy little spot serves delicious tacos with great service!"
print(classifier(sample))
```