Commit 82b248b
Parent(s): f2f8351
Update README.md

README.md CHANGED
@@ -16,7 +16,8 @@ Model Description
 This model is based on the BERT (Bidirectional Encoder Representations from Transformers) model, specifically bert-base-uncased.
 
 Training Procedure
-The model was trained on the TripAdvisor hotel reviews dataset. Each review in the dataset is associated with a rating from 1 to 5.
+The model was trained on the TripAdvisor hotel reviews dataset. Each review in the dataset is associated with a rating from 1 to 5.
+The ratings were converted to sentiment labels as follows:
 
 Ratings of 1 and 2 were labelled as 'Negative'
 Rating of 3 was labelled as 'Neutral'
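The rating-to-label conversion described in the hunk above can be sketched in Python. Note the mapping of ratings 4 and 5 to 'Positive' is an assumption: the visible diff truncates the list after the 'Neutral' line.

```python
# Hypothetical sketch of the rating-to-label conversion described in the card.
# Assumption: ratings of 4 and 5 map to 'Positive' (the diff cuts off before
# that line).
def rating_to_label(rating: int) -> str:
    if rating <= 2:
        return "Negative"
    if rating == 3:
        return "Neutral"
    return "Positive"

print([rating_to_label(r) for r in [1, 2, 3, 4, 5]])
```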
@@ -26,17 +27,16 @@ The text of each review was preprocessed by lowercasing, removing punctuation, e
 The model was trained with a learning rate of 2e-5, an epsilon of 1e-8, and a batch size of 6 for 5 epochs.
 
 Evaluation
-The model was evaluated using a weighted F1 score.
+The model was evaluated using a weighted F1 score.
 
 Usage
 To use the model, load it and use it to classify a review. For example:
 
-
-Copy code
+
 from transformers import AutoTokenizer, AutoModelForSequenceClassification
 
-tokenizer = AutoTokenizer.from_pretrained("<model-name>")
-model = AutoModelForSequenceClassification.from_pretrained("<model-name>")
+tokenizer = AutoTokenizer.from_pretrained("<Group209>")
+model = AutoModelForSequenceClassification.from_pretrained("<Group209>")
 
 text = "The hotel was great and the staff were very friendly."
 
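The weighted F1 score named under Evaluation averages per-class F1 scores, weighting each class by its support (number of true examples). A minimal pure-Python sketch, using illustrative labels rather than results from the card (in practice `sklearn.metrics.f1_score` with `average="weighted"` computes the same quantity):

```python
def weighted_f1(y_true, y_pred):
    """Weighted F1: per-class F1, averaged with weights equal to class support."""
    total = len(y_true)
    score = 0.0
    for c in set(y_true):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        support = sum(1 for t in y_true if t == c)
        score += f1 * support / total
    return score

# Illustrative labels only, not an evaluation of this model.
y_true = ["Negative", "Negative", "Neutral", "Positive"]
y_pred = ["Negative", "Neutral", "Neutral", "Positive"]
print(weighted_f1(y_true, y_pred))
```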
@@ -45,10 +45,7 @@ output = model(**encoded_input)
 predictions = output.logits.argmax(dim=1)
 
 print(predictions)
-Replace <model-name> with the actual model name.
 
 Limitations and Bias
-The model is trained on English data, so it might not perform well on reviews in other languages.
-
-Licensing
-Please add licensing information here if applicable.
+The model is trained on English data, so it might not perform well on reviews in other languages.
+Furthermore, it might be biased towards certain phrases or words that are commonly used in the training dataset.
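The snippet in the diff prints a tensor of class indices, not label strings. Mapping indices back to sentiments depends on the model's configuration (a loaded transformers model exposes it as `model.config.id2label`); since the card does not state the label order, the `id2label` mapping below is an assumption for illustration:

```python
# Hypothetical index-to-label mapping; the card does not specify the order,
# so this dict is an assumption.
id2label = {0: "Negative", 1: "Neutral", 2: "Positive"}

def decode(prediction_indices):
    """Convert argmax class indices into human-readable sentiment labels."""
    return [id2label[i] for i in prediction_indices]

print(decode([2]))
```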