---
library_name: transformers
license: apache-2.0
base_model: distilbert-base-uncased
tags:
- text-classification
- transformers
- distilbert
- generated_from_trainer
- cmu-course
datasets:
- ecopus/pgh_restaurants
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: Cuisine Classification (Fine-Tuned DistilBERT)
  results:
  - task:
      type: text-classification
      name: Multi-class Text Classification
    dataset:
      name: ecopus/pgh_restaurants
      type: classification
      split: augmented
    metrics:
    - type: accuracy
      value: 0.969
    - type: f1
      value: 0.957
    - type: precision
      value: 0.948
    - type: recall
      value: 0.969
  - task:
      type: text-classification
      name: Multi-class Text Classification
    dataset:
      name: ecopus/pgh_restaurants
      type: classification
      split: original
    metrics:
    - type: accuracy
      value: 0.94
    - type: f1
      value: 0.92
---

# Model Card for Cuisine Classification (Fine-Tuned DistilBERT)

This model predicts the **cuisine type** of Pittsburgh restaurants based on review text.
It was fine-tuned from [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the dataset [ecopus/pgh_restaurants](https://huggingface.co/datasets/ecopus/pgh_restaurants).

It achieves the following results:
- **Evaluation (Augmented split):** Accuracy 0.969, F1 0.957, Precision 0.948, Recall 0.969
- **External Validation (Original split):** Accuracy 0.94, F1 0.92
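
The fine-tuned checkpoint can be used for inference through the standard 🤗 Transformers API. A minimal sketch of the prediction step (the helper below is illustrative, not part of this repository; this card does not state the published model id, so no repo id is hard-coded):

```python
# Illustrative inference helper for this classifier. A published checkpoint
# would be loaded with AutoModelForSequenceClassification.from_pretrained(...)
# and AutoTokenizer.from_pretrained(...) using the model's repo id, which is
# not stated on this card and so is deliberately not hard-coded here.
import torch


def predict_cuisine(text, model, tokenizer):
    """Return the highest-scoring cuisine label for one review string."""
    inputs = tokenizer(text, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # id2label maps class indices back to cuisine names.
    return model.config.id2label[int(logits.argmax(dim=-1))]
```

With a published checkpoint, `model` and `tokenizer` would come from the corresponding `from_pretrained` calls.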

---

## Model Details

- **Developed by:** Xinxuan Tang (CMU)
- **Dataset curated by:** Emily Copus (CMU)
- **Base model:** DistilBERT (`distilbert-base-uncased`)
- **Library:** 🤗 Transformers
- **Language(s):** English
- **License:** apache-2.0 (dataset + model card)

---

## Uses

### Direct Use
- Educational practice in **text classification**.
- Experimenting with **fine-tuning compact transformers**.

### Downstream Use
- Could be adapted for **restaurant recommendation demos**.
- Teaching **NLP pipelines** for classification tasks.

### Out-of-Scope Use
- Not suitable for **production deployment**.
- Not intended for **sentiment analysis** or tasks outside cuisine prediction.

---

## Training Procedure

### Training Hyperparameters
- **learning_rate:** 2e-05
- **train_batch_size:** 8
- **eval_batch_size:** 8
- **seed:** 42
- **optimizer:** AdamW (betas=(0.9, 0.999), eps=1e-08)
- **lr_scheduler_type:** linear
- **num_epochs:** 5
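
These settings map directly onto `TrainingArguments` field names in 🤗 Transformers (the AdamW betas/eps listed are the `Trainer` defaults). The sketch below records them that way and sanity-checks them against the step counts in the results table; the 640-example figure in the comment is an inference from the table, not a number stated on this card:

```python
# The hyperparameters above, keyed by their TrainingArguments names.
hparams = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 5,
}

# Sanity check against the results table: 80 optimizer steps per epoch at
# batch size 8 implies roughly 640 training examples (assuming no gradient
# accumulation), and 5 epochs lands on the table's final step count of 400.
steps_per_epoch = 80
total_steps = steps_per_epoch * hparams["num_train_epochs"]
assert total_steps == 400
```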

### Training Results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
| 2.6677 | 1.0 | 80 | 2.4746 | 0.3563 | 0.2142 | 0.1662 | 0.3563 |
| 1.7201 | 2.0 | 160 | 1.5893 | 0.7750 | 0.6895 | 0.6644 | 0.7750 |
| 1.1994 | 3.0 | 240 | 1.1417 | 0.8938 | 0.8503 | 0.8180 | 0.8938 |
| 1.0890 | 4.0 | 320 | 0.9315 | 0.9250 | 0.8959 | 0.8784 | 0.9250 |
| 0.7052 | 5.0 | 400 | 0.8675 | 0.9688 | 0.9570 | 0.9480 | 0.9688 |

---

## Evaluation

### Testing Data
- **Augmented split:** 1000 reviews (synthetic augmentation)
- **Original split:** 100 reviews (external validation)

### Metrics
- Accuracy, weighted F1, Precision, Recall
- Confusion matrix used for external validation
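
The F1 reported on this card is support-weighted across cuisine classes. For reference, a minimal pure-Python sketch of that computation, matching the usual definition used by e.g. `sklearn.metrics.f1_score(average="weighted")`:

```python
# Support-weighted F1: per-class F1 scores averaged with each class
# weighted by its number of true examples (its support).
from collections import Counter


def weighted_f1(y_true, y_pred):
    support = Counter(y_true)  # per-class true-label counts
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == label)
        predicted = sum(1 for p in y_pred if p == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / n
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        score += (n / total) * f1  # weight each class by its support
    return score
```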

---

## Framework Versions
- **Transformers:** 4.56.1
- **PyTorch:** 2.8.0+cu126
- **Datasets:** 4.0.0
- **Tokenizers:** 0.22.0

---

## Bias, Risks, and Limitations

- **Small dataset:** only 100 original reviews.
- **Synthetic augmentation:** may introduce artifacts.
- **Geographic bias:** limited to Pittsburgh restaurants.

### Recommendations
Treat results as **proof-of-concept**, not production-ready.

---

## Citation

If you use this model, please cite the dataset and Hugging Face tools.

---

## Model Card Contact
Xinxuan Tang — xinxuant@andrew.cmu.edu