rhdang committed · Commit 881f6e9 · verified · 1 Parent(s): f7ac105

Create README.md

# **Model Description**

This model predicts the star rating (1 to 5) of a Yelp review from its text. Two models were fine-tuned for the task, **GPT-2** and **BERT**; BERT performed best, reaching **75% validation accuracy**. Training uses a weighted loss to counter class imbalance, and hyperparameters were tuned to improve generalization.

# **Training Details**

- **Dataset**: Yelp Reviews dataset (100,000 samples used)
- **Preprocessing**:
  - **GPT-2 Tokenizer** with **Byte-Pair Encoding (BPE)**, which splits rare words into subword units
  - Truncation to 128 tokens and padding for uniform input size
- **Models Trained**:
  - **GPT-2**: Fine-tuned with a custom classification head, achieving **67% validation accuracy**
  - **BERT**: Fine-tuned with bidirectional attention, achieving **75% validation accuracy**
- **Loss Function**: **Weighted Cross Entropy** to counteract class imbalance
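
The weighted cross-entropy loss can be illustrated with a minimal pure-Python sketch. The README does not state how the class weights were chosen, so the inverse-frequency scheme below is an assumption, and the function names (`inverse_frequency_weights`, `weighted_cross_entropy`) are hypothetical, not taken from the training code.

```python
import math
from collections import Counter

NUM_CLASSES = 5  # star ratings 1..5 mapped to class indices 0..4


def inverse_frequency_weights(labels, num_classes=NUM_CLASSES):
    """Assumed weighting scheme: total / (num_classes * count), so rarer classes weigh more."""
    counts = Counter(labels)
    total = len(labels)
    # max(..., 1) avoids division by zero for a class absent from the sample
    return [total / (num_classes * max(counts.get(c, 0), 1)) for c in range(num_classes)]


def log_softmax(logits):
    """Numerically stable log-softmax over one example's logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def weighted_cross_entropy(batch_logits, batch_labels, weights):
    """Weighted negative log-likelihood, normalized by the sum of the weights used."""
    weighted_nll = 0.0
    weight_sum = 0.0
    for logits, label in zip(batch_logits, batch_labels):
        w = weights[label]
        weighted_nll += -w * log_softmax(logits)[label]
        weight_sum += w
    return weighted_nll / weight_sum


# Toy labels, invented for illustration: 5-star reviews dominate the sample
toy_labels = [4, 4, 4, 4, 4, 4, 3, 3, 0, 1]
class_weights = inverse_frequency_weights(toy_labels)
toy_logits = [[0.1, 0.2, 0.0, 0.3, 2.0], [1.5, 0.2, 0.1, 0.0, 0.4]]
loss = weighted_cross_entropy(toy_logits, [4, 0], class_weights)
```

With this normalization, a mistake on a rare rating contributes more to the loss than the same mistake on a common one. In PyTorch, `torch.nn.CrossEntropyLoss(weight=...)` with the default mean reduction computes the same quantity.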

# **Limitations**

- Performance may degrade on **highly informal or extremely short reviews**
- **Class imbalance** still affects predictions for underrepresented ratings
- Model was trained on **English-language** reviews only
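
One way to check how class imbalance affects underrepresented ratings is to look at recall per star rating rather than overall accuracy. The sketch below is illustrative only: `per_class_recall` is a hypothetical helper, and the labels are toy values, not results from this model.

```python
from collections import defaultdict


def per_class_recall(y_true, y_pred):
    """Return {star_rating: recall} for every rating that appears in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {rating: correct[rating] / total[rating] for rating in sorted(total)}


# Toy labels, invented for illustration only (not results from this model)
y_true = [5, 5, 5, 5, 1, 1, 2, 3]
y_pred = [5, 5, 5, 4, 1, 1, 3, 3]

recall = per_class_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy data the overall accuracy is 0.75, yet the single 2-star review is misclassified, so recall for that rating is 0.0. A headline accuracy figure can hide exactly this kind of gap.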

Files changed (1)
  1. README.md +11 -0
README.md ADDED
@@ -0,0 +1,11 @@
+ ---
+ license: mit
+ language:
+ - en
+ library_name: transformers
+ tags:
+ - text-classification
+ - yelp-reviews
+ - gpt-2
+ - bert
+ ---