	---
	license: cc-by-4.0
	language:
	- th
	base_model:
	- airesearch/wangchanberta-base-att-spm-uncased
	pipeline_tag: token-classification
	library_name: transformers
	tags: [ner, thai, food, review, token-classification]
	---

	# Model Card for wttw/modchelin_thainer-base-model

	This model performs Named Entity Recognition (NER) on Thai-language food reviews. It is designed to extract domain-specific aspects such as dish names, ingredients, restaurant service, and sentiment-related phrases from customer-written content.

	## Model Details

	### Model Description

	This is a 🤗 Transformers token-classification model, fine-tuned from `airesearch/wangchanberta-base-att-spm-uncased` and published on the Hugging Face Hub.

	- Developed by: Vitawat Kitipatthavorn
	- Finetuned from model: `airesearch/wangchanberta-base-att-spm-uncased`
	- Model type: Token Classification (NER)
	- Language(s) (NLP): Thai
	- License: cc-by-sa-4.0
	- Shared by: wttw
	- Model ID: `wttw/modchelin_thainer-base-model`

	## Uses

	### Direct Use

	This model is designed for extracting domain-specific entities from Thai-language food reviews. It identifies and classifies named entities related to:

	- Food/menu items
	- Taste
	- Service
	- Ambiance
	- Price and value
	- Other aspects relevant to customer dining experiences

	Example:

	- Input: `"ต้มยำกุ้งอร่อยมาก แต่บริการช้า"` ("The tom yum kung is very tasty, but the service is slow")
	- Output:
	  - `ต้มยำกุ้ง: FOOD`
	  - `บริการ: SERVICE`

	The model is suitable for NLP pipelines aimed at analyzing restaurant reviews, powering sentiment dashboards, or supporting aspect-based sentiment analysis (ABSA).

	### Downstream Use

	The model can be integrated into:

	- Thai ABSA pipelines
	- Restaurant feedback summarization systems
	- Chatbots or moderation tools for food delivery and review platforms
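As a rough sketch of how the extracted entities might feed an ABSA pipeline, the snippet below groups spans by their aspect label. It assumes the list-of-dicts format that the Transformers `ner` pipeline returns with `aggregation_strategy="simple"` (keys `entity_group`, `score`, `word`, `start`, `end`). The helper name `group_by_aspect` and the sample scores are hypothetical; only the `FOOD` and `SERVICE` labels come from the Direct Use example.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_aspect(entities: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Collect extracted spans under their aspect label, in order of appearance."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for ent in entities:
        grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


# Hypothetical pipeline output for the review in the Direct Use example;
# the scores are made up for illustration.
sample_entities = [
    {"entity_group": "FOOD", "score": 0.97, "word": "ต้มยำกุ้ง", "start": 0, "end": 9},
    {"entity_group": "SERVICE", "score": 0.91, "word": "บริการ", "start": 21, "end": 27},
]

aspects = group_by_aspect(sample_entities)
print(aspects)
```

A downstream sentiment step could then score each aspect's surrounding text separately, rather than assigning one polarity to the whole review.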

	### Out-of-Scope Use

	The model is not designed for:

	- Non-food-related documents (e.g., legal, clinical, political)
	- General-purpose Thai NER tasks
	- Use cases requiring high confidence on ambiguous or out-of-domain text

	## Bias, Risks, and Limitations

	The model was trained specifically on food review content and may:

	- Struggle with informal slang or regional dialects
	- Over-predict `FOOD` entities in unrelated contexts
	- Misclassify ambiguous phrases without surrounding context

	### Recommendations

	Users should:

	- Avoid applying this model outside food-related domains
	- Fine-tune further if working with reviews in specific dialects or contexts
	- Evaluate on a sample of target data before production use
	- Consider setting confidence thresholds before using predictions downstream
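The confidence-threshold recommendation could be applied as a post-processing step. The sketch below assumes the list-of-dicts output that the Transformers `ner` pipeline returns with `aggregation_strategy="simple"`. The helper name `filter_by_confidence`, the sample scores, and the 0.80 cutoff are illustrative assumptions, not values measured for this model.

```python
from typing import Any, Dict, List


def filter_by_confidence(
    entities: List[Dict[str, Any]], threshold: float = 0.80
) -> List[Dict[str, Any]]:
    """Keep only entities whose score is at or above the threshold."""
    return [ent for ent in entities if float(ent["score"]) >= threshold]


# Hypothetical pipeline output; the scores are made up for illustration.
sample_entities = [
    {"entity_group": "FOOD", "score": 0.97, "word": "ต้มยำกุ้ง", "start": 0, "end": 9},
    {"entity_group": "SERVICE", "score": 0.55, "word": "บริการ", "start": 21, "end": 27},
]

confident = filter_by_confidence(sample_entities, threshold=0.80)
print([ent["word"] for ent in confident])
```

A suitable cutoff is best chosen by checking precision and recall on a labelled sample of the target data, since a single threshold may not suit every entity type.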

	## How to Get Started with the Model

	```python
	from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

	model_name = "wttw/modchelin_thainer-base-model"

	# Download (or load from the local cache) the tokenizer and the fine-tuned weights
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForTokenClassification.from_pretrained(model_name)

	# aggregation_strategy="simple" merges sub-word tokens into whole entity spans
	ner_pipeline = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

	example = "ต้มยำกุ้งอร่อยมาก แต่บริการช้า"
	entities = ner_pipeline(example)

	# Each entity is a dict with the keys entity_group, score, word, start, and end
	print(entities)