---
license: cc-by-4.0
language:
- th
base_model:
- airesearch/wangchanberta-base-att-spm-uncased
pipeline_tag: token-classification
library_name: transformers
tags: [ner, thai, food, review, token-classification]
---

# Model Card for wttw/modchelin_thainer-base-model

This model performs Named Entity Recognition (NER) on Thai-language food reviews. It is designed to extract domain-specific aspects such as dish names, ingredients, restaurant service, and sentiment-related phrases from customer-written content.

## Model Details

### Model Description

This is the model card of a 🤗 Transformers model that has been pushed to the Hugging Face Hub.

- **Developed by:** Vitawat Kitipatthavorn
- **Finetuned from model:** `airesearch/wangchanberta-base-att-spm-uncased`
- **Model type:** Token Classification (NER)
- **Language(s) (NLP):** Thai
- **License:** cc-by-sa-4.0
- **Shared by:** wttw
- **Model ID:** `wttw/modchelin_thainer-base-model`

## Uses

### Direct Use

This model is designed for extracting domain-specific entities from Thai-language food reviews. It identifies and classifies named entities related to:

- Food/menu items
- Taste
- Service
- Ambiance
- Price and value
- Other aspects relevant to customer dining experiences

**Example:**

- **Input:** `"ต้มยำกุ้งอร่อยมาก แต่บริการช้า"`
- **Output:**
  - `ต้มยำกุ้ง: FOOD`
  - `บริการ: SERVICE`

The model is suitable for NLP pipelines aimed at analyzing restaurant reviews, powering sentiment dashboards, or supporting aspect-based sentiment analysis (ABSA).
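
The `text: LABEL` listing in the example above can be rendered from the list of dicts that a 🤗 Transformers token-classification pipeline returns when entities are grouped. The following is a minimal sketch, not part of the released model: `format_entities` and `min_score` are hypothetical names, and the sample scores are made up for illustration. Depending on the tokenizer, the `word` field may contain extra whitespace or subword markers, so it is only stripped here.

```python
from typing import Dict, List


def format_entities(entities: List[Dict], min_score: float = 0.0) -> List[str]:
    """Render grouped pipeline entities as 'text: LABEL' strings.

    `entities` is expected to look like grouped pipeline output, i.e. dicts
    with at least 'entity_group', 'score' and 'word' keys.
    Entities scoring below `min_score` are dropped.
    """
    lines = []
    for ent in entities:
        if ent["score"] < min_score:
            continue  # skip low-confidence predictions
        lines.append(f'{ent["word"].strip()}: {ent["entity_group"]}')
    return lines


# Illustrative stand-in for pipeline output on the example review.
# The scores are invented; only the keys used above are shown.
sample = [
    {"entity_group": "FOOD", "score": 0.98, "word": "ต้มยำกุ้ง"},
    {"entity_group": "SERVICE", "score": 0.55, "word": "บริการ"},
]

for line in format_entities(sample):
    print(line)
```

Passing a higher `min_score` keeps only the predictions the model is more confident about, which can be useful when the results feed a dashboard or an ABSA step.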
### Downstream Use

The model can be integrated into:

- Thai ABSA pipelines
- Restaurant feedback summarization systems
- Chatbots or moderation tools for food delivery and review platforms

### Out-of-Scope Use

The model is not designed for:

- Non-food-related documents (e.g., legal, clinical, political)
- General-purpose Thai NER tasks
- Use cases requiring high confidence on ambiguous or out-of-domain text

## Bias, Risks, and Limitations

The model is trained specifically on food review content and may:

- Struggle with informal slang or regional dialects
- Over-predict `FOOD` entities in unrelated contexts
- Misclassify ambiguous phrases without surrounding context

### Recommendations

Users should:

- Avoid applying this model outside food-related domains
- Fine-tune further if working with reviews in specific dialects or contexts
- Evaluate on a sample of target data before production use
- Consider setting confidence thresholds before using predictions downstream

## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

model_name = "wttw/modchelin_thainer-base-model"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

ner_pipeline = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

example = "ต้มยำกุ้งอร่อยมาก แต่บริการช้า"
entities = ner_pipeline(example)
print(entities)
```