	---
	library_name: transformers
	license: apache-2.0
	base_model: distilbert-base-uncased
	tags:
	- text-classification
	- transformers
	- distilbert
	- generated_from_trainer
	- cmu-course
	datasets:
	- ecopus/pgh_restaurants
	metrics:
	- accuracy
	- f1
	- precision
	- recall
	model-index:
	- name: Cuisine Classification (Fine-Tuned DistilBERT)
	  results:
	  - task:
	      type: text-classification
	      name: Multi-class Text Classification
	    dataset:
	      name: ecopus/pgh_restaurants
	      type: classification
	      split: augmented
	    metrics:
	    - type: accuracy
	      value: 0.969
	    - type: f1
	      value: 0.957
	    - type: precision
	      value: 0.948
	    - type: recall
	      value: 0.969
	  - task:
	      type: text-classification
	      name: Multi-class Text Classification
	    dataset:
	      name: ecopus/pgh_restaurants
	      type: classification
	      split: original
	    metrics:
	    - type: accuracy
	      value: 0.94
	    - type: f1
	      value: 0.92
	---

	# Model Card for Cuisine Classification (Fine-Tuned DistilBERT)

	This model predicts the cuisine type of Pittsburgh restaurants based on review text.
	It was fine-tuned from [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the dataset [ecopus/pgh_restaurants](https://huggingface.co/datasets/ecopus/pgh_restaurants).

	It achieves the following results:
	- Evaluation (augmented split): accuracy 0.969, weighted F1 0.957, precision 0.948, recall 0.969
	- External validation (original split): accuracy 0.94, F1 0.92

	---

	## Model Details

	- Developed by: Xinxuan Tang (CMU)
	- Dataset curated by: Emily Copus (CMU)
	- Base model: DistilBERT (`distilbert-base-uncased`)
	- Library: Transformers
	- Language(s): English
	- License: apache-2.0 (dataset + model card)

	---

	## Uses

	### Direct Use
	- Educational practice in text classification.
	- Experimenting with fine-tuning compact transformers.

	### Downstream Use
	- Could be adapted for restaurant recommendation demos.
	- Teaching NLP pipelines for classification tasks.

	### Out-of-Scope Use
	- Not suitable for production deployment.
	- Not intended for sentiment analysis or tasks outside cuisine prediction.

	---

	## Training Procedure

	### Training Hyperparameters
	- learning_rate: 2e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: AdamW (betas=(0.9,0.999), eps=1e-08)
	- lr_scheduler_type: linear
	- num_epochs: 5
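
With `lr_scheduler_type: linear`, the learning rate decays from its initial value to zero over the course of training. The sketch below illustrates that decay in plain Python. It assumes zero warmup steps (the card does not state a warmup setting) and 400 total optimizer steps (read from the Step column of the training results). `linear_lr` is a hypothetical helper written for illustration, not a Transformers function.

```python
def linear_lr(step, base_lr=2e-5, total_steps=400, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup and decay.

    warmup_steps=0 is an assumption; the model card does not report it.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start of training and at the end of each epoch
# (80 steps per epoch, per the training results table).
for step in (0, 80, 160, 240, 320, 400):
    print(step, linear_lr(step))
```

Under these assumptions the rate is halved by step 200 and reaches zero at step 400, so the last epoch trains with the smallest updates.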

	### Training Results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
	| 2.6677 | 1.0 | 80 | 2.4746 | 0.3563 | 0.2142 | 0.1662 | 0.3563 |
	| 1.7201 | 2.0 | 160 | 1.5893 | 0.7750 | 0.6895 | 0.6644 | 0.7750 |
	| 1.1994 | 3.0 | 240 | 1.1417 | 0.8938 | 0.8503 | 0.8180 | 0.8938 |
	| 1.0890 | 4.0 | 320 | 0.9315 | 0.9250 | 0.8959 | 0.8784 | 0.9250 |
	| 0.7052 | 5.0 | 400 | 0.8675 | 0.9688 | 0.9570 | 0.9480 | 0.9688 |
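
The Step column advances by 80 per epoch, which together with `train_batch_size: 8` bounds the number of training examples seen per epoch. The following is a rough check, assuming no gradient accumulation, a single device, and that a final partial batch is kept (the card states none of these):

```python
train_batch_size = 8   # from the hyperparameter list
steps_per_epoch = 80   # from the Step column (80, 160, ..., 400)
num_epochs = 5

# A full final batch gives the upper bound; a final batch holding a single
# example gives the lower bound.
max_train_examples = steps_per_epoch * train_batch_size
min_train_examples = (steps_per_epoch - 1) * train_batch_size + 1
total_steps = steps_per_epoch * num_epochs

print(f"training examples per epoch: {min_train_examples} to {max_train_examples}")
print(f"total optimizer steps: {total_steps}")
```

This is an inference from the reported numbers rather than a figure stated in the card.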

	---

	## Evaluation

	### Testing Data
	- Augmented split: 1000 reviews (synthetic augmentation)
	- Original split: 100 reviews (external validation)

	### Metrics
	- Accuracy, weighted F1, Precision, Recall
	- Confusion matrix used for external validation
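
As a minimal sketch of how support-weighted metrics are computed, the pure-Python function below averages per-class precision, recall, and F1 using each class's share of the true labels as its weight. It assumes "weighted" here means the same thing as `average="weighted"` in scikit-learn. The card states this only for F1, so applying it to precision and recall is an assumption. The cuisine labels are made up for illustration and are not taken from the dataset.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    precision = recall = f1 = 0.0
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        actual = support[label]
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / actual if actual else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = actual / n  # class share of the true labels
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels, not drawn from ecopus/pgh_restaurants.
y_true = ["mexican", "mexican", "italian", "thai", "thai", "thai"]
y_pred = ["mexican", "italian", "italian", "thai", "thai", "mexican"]
metrics = weighted_metrics(y_true, y_pred)
print(metrics)
```

Support-weighted recall always equals accuracy, because the class weights cancel the per-class denominators. This is consistent with the Accuracy and Recall columns matching in every row of the training results.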

	---

	## Framework Versions
	- Transformers: 4.56.1
	- PyTorch: 2.8.0+cu126
	- Datasets: 4.0.0
	- Tokenizers: 0.22.0

	---

	## Bias, Risks, and Limitations

	- Small dataset: only 100 original reviews.
	- Synthetic augmentation: may introduce artifacts.
	- Geographic bias: limited to Pittsburgh restaurants.

	### Recommendations
	Treat the results as a proof of concept; the model is not production-ready.

	---

	## How to Get Started with the Model

	You can load this fine-tuned DistilBERT model directly from the Hugging Face Hub and run inference:

	```python
	from transformers import pipeline

	# Load the fine-tuned model
	classifier = pipeline("text-classification", model="YOUR_USERNAME/finetuned_model")

	# Example input
	sample = "This cozy little spot serves delicious tacos with great service!"
	print(classifier(sample))
	```