---
datasets:
- mit_restaurant_search_ner
language: en
license: mit
model_name: restaurant-ner
tags:
- token-classification
- ner
- restaurant-search
pipeline_tag: token-classification
library_name: transformers
widget:
- example_title: "Italian Restaurant Example"
  text: "I want a 5 star Italian restaurant in Boston with outdoor seating"
- example_title: "Thai Food Example"
  text: "Looking for cheap Thai food near downtown with parking"
---

# Model Card for Restaurant NER

This model performs Named Entity Recognition (NER) on restaurant search queries, extracting structured information like ratings, locations, cuisines, and amenities from natural language queries.

## Model Details

### Model Description

This model performs Named Entity Recognition (NER) on restaurant search queries. It can identify entities such as Rating, Location, Cuisine, Amenity, and more. The model helps understand user intent in restaurant search queries by structuring the free-form text into actionable parameters.

The model was fine-tuned from DistilBERT on the MIT Restaurant Search NER dataset, following the BIO (Beginning, Inside, Outside) tagging scheme for entity detection.

- **Developed by:** Your Name
- **Model type:** Token Classification (NER)
- **Language(s):** English
- **License:** MIT
- **Finetuned from model:** distilbert-base-uncased

### Model Sources

- **Dataset:** MIT Restaurant Search NER dataset

## Uses

### Direct Use

This model can be used directly to extract structured information from restaurant search queries.
It is particularly useful for turning a free-form query into structured search parameters:

```python
from transformers import pipeline

nlp = pipeline('token-classification', model='niruthiha/restaurant-ner', aggregation_strategy='simple')
result = nlp("I want a 5 star Italian restaurant in Boston with outdoor seating")
print(result)
```

The expected output identifies entities such as:

- "5 star" as Rating
- "Italian" as Cuisine
- "Boston" as Location
- "outdoor seating" as Amenity

### Downstream Use

This NER model can be integrated into:

- Restaurant search applications
- Food delivery platforms
- Conversational AI assistants for restaurant recommendations
- Restaurant review analysis systems

### Out-of-Scope Use

This model is not designed for:

- General-purpose entity recognition outside the restaurant domain
- Sentiment analysis of restaurant reviews
- Understanding complex nutritional requests
- Processing non-English queries

## Bias, Risks, and Limitations

- The model is trained on the MIT Restaurant Search NER dataset and may not generalize well to significantly different restaurant query styles or dialects
- Performance may vary for cuisine types, locations, or amenities that are under-represented in the training data
- The model may struggle with spelling errors or highly colloquial language

### Recommendations

- Use in conjunction with spelling correction for a better user experience
- Consider retraining or fine-tuning on domain-specific data when targeting specialized restaurant types
- Monitor performance across different demographic groups to ensure equitable results

## How to Get Started with the Model

```python
from transformers import pipeline, AutoTokenizer, AutoModelForTokenClassification

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("niruthiha/restaurant-ner")
model = AutoModelForTokenClassification.from_pretrained("niruthiha/restaurant-ner")

# Or use a pipeline for easy inference
ner = pipeline('token-classification', model="niruthiha/restaurant-ner", aggregation_strategy='simple')

# Example usage
query = "Looking for cheap Thai food near downtown with parking"
entities = ner(query)
print(entities)
```

## Training Details

### Training Data

The model was trained on the MIT Restaurant Search NER dataset, which contains annotated restaurant search queries with the following entity types:

- Rating
- Location
- Cuisine
- Amenity
- Restaurant_Name
- Hours
- Price
- And others

### Training Procedure

#### Preprocessing

1. The data was loaded in BIO format
2. Tags were converted to numeric IDs
3. The DistilBERT tokenizer was used to tokenize the input
4. Special handling was implemented to align BIO tags with subword tokens

#### Training Hyperparameters

- **Learning rate:** 2e-5
- **Batch size:** 16
- **Training epochs:** 3
- **Weight decay:** 0.01
- **Training regime:** fp32

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated on the test split of the MIT Restaurant Search NER dataset.

#### Metrics

![image/png](https://cdn-uploads.huggingface.co/production/uploads/663cef3051664a5bcdf4a23f/G4KjMa2NgKvVcy-tG3aw1.png)

These metrics were calculated with the seqeval library, which evaluates NER performance at the entity level rather than the token level.

## Technical Specifications

### Model Architecture

The model uses DistilBERT with a token classification head. DistilBERT is a smaller, faster distillation of BERT that retains most of BERT's performance while being more efficient.

#### Software

- Hugging Face Transformers
- PyTorch
- Python 3.x

## Model Card Contact

https://niruthiha.github.io/
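
The subword alignment mentioned in the preprocessing steps is the one non-obvious part of the training pipeline: a word-level BIO tag must be expanded to subword level after tokenization. The sketch below shows one common way to do this using the `word_ids()` output of a fast tokenizer, labeling only the first subword of each word and masking the rest with `-100` so the loss ignores them. This is a hypothetical helper for illustration, not necessarily the exact code used to train this model.

```python
# Minimal sketch of BIO-tag / subword alignment (hypothetical helper).
# The first subword of each word keeps the word's tag id; special
# tokens and continuation subwords get -100, which PyTorch's
# CrossEntropyLoss ignores by default.

IGNORE_INDEX = -100

def align_labels_with_tokens(word_ids, word_labels):
    """word_ids: output of tokenizer.word_ids() (None for special tokens);
    word_labels: one numeric BIO tag id per original word."""
    aligned = []
    previous_word = None
    for word_id in word_ids:
        if word_id is None:                 # [CLS], [SEP], padding
            aligned.append(IGNORE_INDEX)
        elif word_id != previous_word:      # first subword of a word
            aligned.append(word_labels[word_id])
        else:                               # continuation subword
            aligned.append(IGNORE_INDEX)
        previous_word = word_id
    return aligned

# "5 star italian" tagged B-Rating, I-Rating, B-Cuisine (ids 1, 2, 3);
# suppose the third word splits into two subwords:
word_ids = [None, 0, 1, 2, 2, None]         # as returned by a fast tokenizer
print(align_labels_with_tokens(word_ids, [1, 2, 3]))
# [-100, 1, 2, 3, -100, -100]
```

An alternative convention labels continuation subwords with the `I-` variant of the word's tag instead of `-100`; either works, as long as evaluation decodes predictions consistently.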