---
license: mit
language:
- en
base_model:
- dslim/distilbert-NER
pipeline_tag: token-classification
tags:
- transport
- bus
---
# 🚍 MyBusModel: A Custom NER Model for Public Transport Queries
MyBusModel is a lightweight, rule-enhanced Named Entity Recognition (NER) model fine-tuned to identify **bus numbers** and **stops/locations** in natural-language queries about public transportation in West Bengal, India.
---
## ✨ What does this model do?
This model is trained to extract two key entity types from user queries:
- `BUS_NUMBER`: Bus numbers such as `12C/1`, `S-12`, and `12B`.
- `LOCATION`: Source and destination locations such as `Howrah`, `Barrackpore`, and `Santragachi`.
It also filters out irrelevant **noise words** to give a clean and accurate entity list that can be used in downstream logic such as search, recommendations, or route-finding.
---
## 🔍 Example
**Input Query:**
`I want to go from Santragachi to Barrackpore, can I take 12C/1 or S-12?`
**Model Output:**
`Santragachi LOCATION`
`Barrackpore LOCATION`
`12C/1 BUS_NUMBER`
`S-12 BUS_NUMBER`
---
## 🧠 How it works
- Built using **spaCy** (`en_core_web_sm`) and extended with `EntityRuler` for custom NER logic.
- Bus numbers and stop names are sourced from curated CSV datasets.
- Custom regex patterns identify bus numbers in formats such as `12C/1` and `S-12`.
- Noise words like *I, want, take, can, should* are excluded from final entity extraction.
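
The steps above can be approximated in a few lines of plain Python. This is a minimal sketch that uses the standard `re` module instead of spaCy's `EntityRuler`, so it is not the model's actual pipeline. The stop list, the noise-word set, the regex, and the function name `extract_entities` are all assumptions made for illustration; the real model loads its bus numbers and stop names from CSV files.

```python
import re

# Hypothetical sample data; the real model reads stops from curated CSV files.
KNOWN_STOPS = {"howrah", "barrackpore", "santragachi"}
NOISE_WORDS = {"i", "want", "take", "can", "should"}

# Assumed bus-number shapes: a letter prefix plus hyphen (S-12), or digits with
# an optional letter suffix and an optional "/digits" part (12B, 12C/1).
BUS_NUMBER_RE = re.compile(
    r"^(?:[A-Z]{1,3}-\d{1,3}[A-Z]?|\d{1,3}[A-Z]?(?:/\d{1,2})?)$",
    re.IGNORECASE,
)

# Tokens are alphanumeric runs that may be joined by "-" or "/".
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:[-/][A-Za-z0-9]+)*")


def extract_entities(query):
    """Return (text, label) pairs for known stops and bus-number-like tokens."""
    entities = []
    for token in TOKEN_RE.findall(query):
        lowered = token.lower()
        if lowered in NOISE_WORDS:
            continue  # drop noise words before any matching
        if lowered in KNOWN_STOPS:
            entities.append((token, "LOCATION"))
        elif BUS_NUMBER_RE.match(token):
            entities.append((token, "BUS_NUMBER"))
    return entities


query = "I want to go from Santragachi to Barrackpore, can I take 12C/1 or S-12?"
for text, label in extract_entities(query):
    print(text, label)
```

The sketch only handles single-word stop names and would also tag a bare number such as `12` as a bus number; a token-based matcher like `EntityRuler` is better suited to multi-word stops and context-dependent cases.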
---
## 🛠 Results
- Precision: 0.8078439964943033
- Recall: 0.6660043352601156
- F1 Score: 0.7300990099009901
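
The reported F1 score is consistent with the harmonic mean of the precision and recall listed above. A short check, assuming the standard definition F1 = 2PR / (P + R); the helper name `f1_score` is illustrative only:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


precision = 0.8078439964943033
recall = 0.6660043352601156
f1 = f1_score(precision, recall)
print(round(f1, 4))
```

Recall is noticeably lower than precision, so on the evaluation data the model misses entities more often than it labels them incorrectly.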