---
license: mit
language:
- en
base_model:
- dslim/distilbert-NER
pipeline_tag: token-classification
tags:
- transport
- bus
---

# 🚍 MyBusModel: A Custom NER Model for Public Transport Queries

BusRouteNER is a lightweight, rule-enhanced Named Entity Recognition (NER) model fine-tuned for identifying **bus numbers** and **stops/locations** in natural language queries related to public transportation in West Bengal, India.

---

## ✨ What does this model do?

This model is trained to extract two key entity types from user queries:

- `BUS_NUMBER`: Recognizes bus numbers like `12C/1`, `S-12`, `12B`, etc.
- `LOCATION`: Identifies source and destination locations such as `Howrah`, `Barrackpore`, `Santragachi`, etc.

It also filters out irrelevant **noise words** to give a clean and accurate entity list that can be used in downstream logic such as search, recommendations, or route-finding.

---

## 🔍 Example

**Input Query:**

`I want to go from Santragachi to Barrackpore, can I take 12C/1 or S-12?`

**Model Output:**

- `Santragachi LOCATION`
- `Barrackpore LOCATION`
- `12C/1 BUS_NUMBER`
- `S-12 BUS_NUMBER`

---

## 🧠 How it works

- Built using **spaCy** (`en_core_web_sm`) and extended with `EntityRuler` for custom NER logic.
- Bus numbers and stop names are sourced from curated CSV datasets.
- Custom regex patterns identify bus numbers with formats like `12C/1`, `S-12`, etc.
- Noise words like *I, want, take, can, should* are excluded from final entity extraction.

---

## 🛠 Results

- Precision: 0.8078439964943033
- Recall: 0.6660043352601156
- F1 Score: 0.7300990099009901
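
---

## 🧪 Sketch of the rule layer

The rule-based steps listed under "How it works" (regex matching for bus numbers, lookup of known stops, and noise-word filtering) can be illustrated with a small sketch. This is not the model's actual code: it is a pure-Python stand-in that uses only the standard `re` module instead of spaCy's `EntityRuler`. The stop list, the noise-word set, the regex, and the function name `extract_entities` are all assumptions chosen to cover the examples in this card. The real model reads bus numbers and stops from curated CSV files.

```python
import re

# Hypothetical stand-ins for the curated CSV datasets (assumption).
KNOWN_STOPS = {"howrah", "barrackpore", "santragachi"}

# Noise words taken from the examples listed in this card.
NOISE_WORDS = {"i", "want", "take", "can", "should"}

# Assumed bus-number shape: optional letter prefix with a hyphen (S-12),
# 1-3 digits, optional letter suffix (12B), optional "/digits" part (12C/1).
BUS_NUMBER_RE = re.compile(
    r"(?:[A-Z]{1,3}-)?\d{1,3}[A-Z]?(?:/\d{1,2})?", re.IGNORECASE
)

# Tokenizer that keeps "-" and "/" inside a token so "12C/1" stays whole.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:[-/][A-Za-z0-9]+)*")


def extract_entities(query):
    """Return (text, label) pairs in the order they appear in the query."""
    entities = []
    for token in TOKEN_RE.findall(query):
        lowered = token.lower()
        if lowered in NOISE_WORDS:
            continue  # drop noise words before any labeling
        if lowered in KNOWN_STOPS:
            entities.append((token, "LOCATION"))
        elif BUS_NUMBER_RE.fullmatch(token):
            entities.append((token, "BUS_NUMBER"))
    return entities


if __name__ == "__main__":
    query = "I want to go from Santragachi to Barrackpore, can I take 12C/1 or S-12?"
    for text, label in extract_entities(query):
        print(text, label)
```

Run on the example query from this card, the sketch yields the same four entities shown in the model output. In a spaCy pipeline, the same patterns would typically be registered as `EntityRuler` patterns rather than applied by hand.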