panigrah/wineberto-ner

Token Classification

panigrah committed on Nov 10, 2023

Commit c0b4486 · 1 Parent(s): d5dff4b

Update README.md

Files changed (1)

README.md +70 -0

@@ -1,3 +1,73 @@
 ---
 license: unknown
+pipeline_tag: token-classification
+tags:
+- wine
+- ner
 ---
+# Wineberto ner model
+Model trained on wine labels and descriptions for named entity recognition, using bert-base-uncased as the base model.
+## Model description
+## How to use
+You can use this model directly for named entity recognition:
+```python
+>>> from transformers import pipeline
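+>>> # aggregation_strategy='simple' groups tokens and provides the 'entity_group' key printed below
+>>> ner = pipeline('ner', model='panigrah/wineberto-ner', aggregation_strategy='simple')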
+>>> tokens = ner('"Heitz Cabernet Sauvignon California Napa Valley Napa US this tremendous 100% varietal wine hails from oakville and was aged over three years in oak. juicy red-cherry fruit and a compelling hint of caramel greet the palate, framed by elegant, fine tannins and a subtle minty tone in the background. balanced and rewarding from start to finish, it has years ahead of it to develop further nuance. enjoy 2022"')
+>>> for t in tokens:
+...     print(f"{t['word']}: {t['entity_group']}: {t['score']:.5}")
+heitz: producer: 0.99988
+cab: wine: 0.9999
+##ernet sauvignon: wine: 0.95893
+california: province: 0.99992
+napa valley: region: 0.99991
+napa: subregion: 0.99987
+us: country: 0.99996
+oak: flavor: 0.99992
+juicy: mouthfeel: 0.99992
+cherry: flavor: 0.99994
+fruit: flavor: 0.99994
+cara: flavor: 0.99993
+##mel: flavor: 0.99731
+mint: flavor: 0.99994
+balanced: mouthfeel: 0.99992
+```
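The output above still contains split WordPiece fragments such as `cab` / `##ernet sauvignon` and `cara` / `##mel`. A minimal post-processing sketch, assuming the pipeline returns a list of dicts with `word`, `entity_group`, and `score` keys, could join those fragments. The helper name `merge_wordpieces` and the choice to keep the lower score are illustrative assumptions, not part of the model card.

```python
def merge_wordpieces(entities):
    """Join WordPiece continuations ('##...') onto the preceding entity.

    A fragment is merged only when it carries the same entity_group as the
    entry before it; anything else is copied through unchanged.
    """
    merged = []
    for ent in entities:
        word = ent["word"]
        if (
            word.startswith("##")
            and merged
            and merged[-1]["entity_group"] == ent["entity_group"]
        ):
            prev = merged[-1]
            prev["word"] += word[2:]  # drop the '##' marker and append
            # Assumption: keep the lower score as a conservative confidence.
            prev["score"] = min(prev["score"], ent["score"])
        else:
            merged.append(dict(ent))  # copy so the input list is not mutated
    return merged


# Sample entries copied from the output shown above.
sample = [
    {"word": "cab", "entity_group": "wine", "score": 0.9999},
    {"word": "##ernet sauvignon", "entity_group": "wine", "score": 0.95893},
    {"word": "cara", "entity_group": "flavor", "score": 0.99993},
    {"word": "##mel", "entity_group": "flavor", "score": 0.99731},
]
merged_sample = merge_wordpieces(sample)
for ent in merged_sample:
    print(f"{ent['word']}: {ent['entity_group']}: {ent['score']:.5}")
```

Passing `aggregation_strategy='first'` to `pipeline` is another option worth trying, since it groups sub-tokens at the word level before labeling.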
+## Training data
+The BERT model was trained on 20K reviews and wine labels derived from https://huggingface.co/datasets/james-burton/wine_reviews_all_text, manually annotated with the following entity types:
+```
+"1": "classification",
+"2": "country",
+"3": "flavor",
+"4": "mouthfeel",
+"5": "producer",
+"6": "province",
+"7": "region",
+"8": "subregion",
+"9": "wine"
+```
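As an illustration, the mapping above could be turned into the `id2label` / `label2id` dictionaries that token-classification configs typically carry. This is a sketch under one loud assumption: the card lists ids 1 through 9 only, so id 0 is taken here to be the "O" (no entity) tag, which the source does not confirm.

```python
# Entity names copied from the mapping above (ids 1-9 as listed in the card).
ENTITY_LABELS = {
    1: "classification",
    2: "country",
    3: "flavor",
    4: "mouthfeel",
    5: "producer",
    6: "province",
    7: "region",
    8: "subregion",
    9: "wine",
}

# ASSUMPTION: id 0 is the "O" (outside / no entity) tag; the card does not list it.
id2label = {0: "O", **ENTITY_LABELS}
label2id = {name: idx for idx, name in id2label.items()}

print(label2id["producer"])
print(id2label[9])
```

Dictionaries of this shape can be passed as `id2label` and `label2id` when loading a model with `AutoModelForTokenClassification.from_pretrained`, so that predictions come back with readable names.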
+## Training procedure
+```python
+from transformers import TrainingArguments
+
+model_id = 'bert-base-uncased'  # base checkpoint that is fine-tuned
+arguments = TrainingArguments(
+    evaluation_strategy="epoch",
+    learning_rate=2e-5,
+    per_device_train_batch_size=8,
+    per_device_eval_batch_size=8,
+    num_train_epochs=5,
+    weight_decay=0.01,
+)
+...
+trainer.train()
+```
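The training snippet above elides the data preparation. One step that token-classification fine-tuning commonly needs is aligning word-level labels with subword tokens; whether this model was trained that way is not stated in the card. The sketch below shows the usual approach under these assumptions: `word_ids` has the shape returned by a fast tokenizer's `word_ids()` method, only the first piece of each word keeps its label, and the helper name `align_labels` and the sample tokenization are hypothetical.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def align_labels(word_ids, word_labels):
    """Map word-level label ids onto subword tokens.

    word_ids: one entry per token; None for special tokens, otherwise the
              index of the word the token belongs to.
    word_labels: one label id per word.
    """
    aligned = []
    previous = None
    for wid in word_ids:
        if wid is None:
            aligned.append(IGNORE_INDEX)      # special token such as [CLS] or [SEP]
        elif wid != previous:
            aligned.append(word_labels[wid])  # first piece of a word keeps the label
        else:
            aligned.append(IGNORE_INDEX)      # continuation piece is ignored in the loss
        previous = wid
    return aligned


# Hypothetical tokenization: [CLS] heitz cab ##ernet [SEP]
word_ids = [None, 0, 1, 1, None]
word_labels = [5, 9]  # 5 = producer, 9 = wine (ids from the training-data mapping)
aligned_example = align_labels(word_ids, word_labels)
print(aligned_example)
```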