|
|
--- |
|
|
license: unknown |
|
|
language: |
|
|
- en |
|
|
tags: |
|
|
- wine |
|
|
- ner |
|
|
widget: |
|
|
- text: 'Heitz Cabernet Sauvignon California Napa Valley Napa US' |
|
|
example_title: 'California Cab' |
|
|
|
|
|
--- |
|
|
|
|
|
# Wineberto labels |
|
|
|
|
|
A named entity recognition model pretrained exclusively on wine labels, using bert-base-uncased as the base model.
|
|
|
|
|
## Model description

Wineberto is a bert-base-uncased model fine-tuned for token classification on wine label text. Given a label string, it tags the producer, wine, region, subregion, country, vintage, and classification fields that appear on the label.
|
|
|
|
|
|
## How to use |
|
|
|
|
|
You can use this model directly for named entity recognition as follows:
|
|
|
|
|
```python |
|
|
>>> from transformers import pipeline
>>> ner = pipeline('ner', model='winberto-labels', aggregation_strategy='simple')
>>> tokens = ner("Heitz Cabernet Sauvignon California Napa Valley Napa US")
>>> for t in tokens:
...     print(f"{t['word']}: {t['entity_group']}: {t['score']:.5}")
|
|
|
|
|
heitz: producer: 0.99758 |
|
|
cabernet: wine: 0.92263 |
|
|
sauvignon: wine: 0.92472 |
|
|
california: region: 0.53502 |
|
|
napa valley: subregion: 0.79638 |
|
|
us: country: 0.93675 |
|
|
``` |
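The grouped predictions above map naturally onto a structured record with one field per label. A minimal sketch, using the scores printed above as sample data rather than a live pipeline call:

```python
# Sample predictions mirroring the pipeline output shown above
# (hard-coded here; a live call would require downloading the model).
predictions = [
    {"word": "heitz", "entity_group": "producer", "score": 0.99758},
    {"word": "cabernet", "entity_group": "wine", "score": 0.92263},
    {"word": "sauvignon", "entity_group": "wine", "score": 0.92472},
    {"word": "california", "entity_group": "region", "score": 0.53502},
    {"word": "napa valley", "entity_group": "subregion", "score": 0.79638},
    {"word": "us", "entity_group": "country", "score": 0.93675},
]

# Concatenate words that share an entity group into one field per label.
label_fields = {}
for p in predictions:
    group = p["entity_group"]
    label_fields[group] = (label_fields.get(group, "") + " " + p["word"]).strip()

print(label_fields)
```

This yields, for example, `'wine': 'cabernet sauvignon'` from the two adjacent wine tokens.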
|
|
|
|
|
## Training data |
|
|
|
|
|
The bert-base-uncased model was fine-tuned on 50K wine labels derived from the Liv-ex LWIN database (https://www.liv-ex.com/wwd/lwin/), manually annotated with the following entity labels:
|
|
|
|
|
``` |
|
|
"1": "B-classification", |
|
|
"2": "B-country", |
|
|
"3": "B-producer", |
|
|
"4": "B-region", |
|
|
"5": "B-subregion", |
|
|
"6": "B-vintage", |
|
|
"7": "B-wine" |
|
|
``` |
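For fine-tuning a token classifier, the mapping above can be expressed as `id2label`/`label2id` dictionaries. A sketch, with one assumption flagged: the fragment above starts at id 1, so id 0 is taken here to be the conventional "O" (outside) tag of BIO schemes.

```python
# Entity labels from the annotation scheme above (B- prefixes per BIO tagging).
labels = [
    "B-classification", "B-country", "B-producer",
    "B-region", "B-subregion", "B-vintage", "B-wine",
]

# Ids 1..7 as listed above; id 0 as the "O" (outside) tag is an assumption,
# since the fragment above does not show it.
id2label = {0: "O", **{i + 1: lab for i, lab in enumerate(labels)}}
label2id = {lab: i for i, lab in id2label.items()}
```

These dictionaries are what a token-classification config typically carries so that predicted ids can be decoded back to label names.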
|
|
|
|
|
## Training procedure |
|
|
```python
|
|
model_id = 'bert-base-uncased' |
|
|
arguments = TrainingArguments( |
|
|
evaluation_strategy="epoch", |
|
|
learning_rate=2e-5, |
|
|
per_device_train_batch_size=8, |
|
|
per_device_eval_batch_size=8, |
|
|
num_train_epochs=5, |
|
|
weight_decay=0.01, |
|
|
) |
|
|
...  # tokenizer, datasets, and Trainer construction elided

trainer.train()
|
|
``` |
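Given the 50K-label dataset and the hyperparameters above, the rough number of optimizer steps works out as follows. This is a back-of-the-envelope sketch assuming a single device, no gradient accumulation, and all 50K examples in the training split:

```python
num_examples = 50_000   # wine labels in the dataset (see Training data)
per_device_batch = 8    # per_device_train_batch_size above
epochs = 5              # num_train_epochs above

steps_per_epoch = num_examples // per_device_batch  # batches per pass
total_steps = steps_per_epoch * epochs              # optimizer steps overall
print(steps_per_epoch, total_steps)
```

That is 6,250 steps per epoch and 31,250 steps in total under these assumptions.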
|
|
|