---
license: mit
language:
- en
- hi
tags:
- ner
- address-parsing
- indian-addresses
- bert
- crf
datasets:
- custom
metrics:
- f1
- precision
- recall
model-index:
- name: indian-address-parser-model
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition
    metrics:
    - type: f1
      value: 0.80
      name: F1 (micro)
    - type: precision
      value: 0.79
      name: Precision (micro)
    - type: recall
      value: 0.81
      name: Recall (micro)
---

# Indian Address Parser Model

A fine-tuned **IndicBERTv2-SS + CRF** model for parsing unstructured Indian addresses into structured components.

## Model Description

- **Base Model**: [ai4bharat/IndicBERTv2-SS](https://huggingface.co/ai4bharat/IndicBERTv2-SS)
- **Architecture**: BERT + Conditional Random Field (CRF) layer
- **Languages**: English, Hindi (Latin and Devanagari scripts)
- **Training Data**: 600+ annotated Delhi addresses

## Performance

| Entity Type   | Precision | Recall   | F1-Score |
|---------------|-----------|----------|----------|
| AREA          | 0.87      | 0.87     | 0.87     |
| CITY          | 1.00      | 1.00     | 1.00     |
| FLOOR         | 0.85      | 0.85     | 0.85     |
| GALI          | 0.75      | 0.67     | 0.71     |
| HOUSE_NUMBER  | 0.79      | 0.79     | 0.79     |
| KHASRA        | 0.75      | 0.82     | 0.78     |
| PINCODE       | 1.00      | 1.00     | 1.00     |
| **Overall**   | **0.79**  | **0.81** | **0.80** |

## Supported Entity Types

- `HOUSE_NUMBER` - House/Plot/Flat numbers
- `FLOOR` - Floor indicators (Ground, First, etc.)
- `BLOCK` - Block identifiers
- `SECTOR` - Sector numbers
- `GALI` - Gali (lane) numbers
- `COLONY` - Colony/Society names
- `AREA` - Area/Locality names
- `SUBAREA` - Sub-area names
- `KHASRA` - Khasra (land record) numbers
- `PINCODE` - 6-digit postal codes
- `CITY` - City names
- `STATE` - State names

## Usage

```python
from address_parser import AddressParser

# Load model
parser = AddressParser.from_pretrained("YOUR_USERNAME/indian-address-parser-model")

# Parse address
result = parser.parse("PLOT NO752 FIRST FLOOR, BLOCK H-3, NEW DELHI, 110041")

# Access structured output
print(result.house_number)  # "PLOT NO752"
print(result.floor)         # "FIRST FLOOR"
print(result.city)          # "NEW DELHI"
print(result.pincode)       # "110041"
```

## Demo

Try the live demo: [HuggingFace Space](https://huggingface.co/spaces/YOUR_USERNAME/indian-address-parser)

## License

MIT License
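
## Example: Checking the Reported F1 Scores

The F1 values in the Performance table should each be the harmonic mean of the corresponding precision and recall. The short sketch below recomputes them as a sanity check. The numbers are copied from the table; the helper name `f1_score` and the 0.01 tolerance (to absorb two-decimal rounding) are choices made for this illustration, not part of the released code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the Performance table
reported = {
    "AREA": (0.87, 0.87, 0.87),
    "CITY": (1.00, 1.00, 1.00),
    "FLOOR": (0.85, 0.85, 0.85),
    "GALI": (0.75, 0.67, 0.71),
    "HOUSE_NUMBER": (0.79, 0.79, 0.79),
    "KHASRA": (0.75, 0.82, 0.78),
    "PINCODE": (1.00, 1.00, 1.00),
    "Overall": (0.79, 0.81, 0.80),
}

# Table values are rounded to two decimals, so allow a small tolerance
TOLERANCE = 0.01

for entity, (p, r, f1) in reported.items():
    computed = f1_score(p, r)
    status = "ok" if abs(computed - f1) < TOLERANCE else "mismatch"
    print(f"{entity:<13} computed={computed:.4f} reported={f1:.2f} {status}")
```

Every row is consistent within rounding, including the overall micro-averaged figures.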
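
## Example: Grouping Token Labels into Fields

A token-classification model such as this one emits one label per token, which then has to be merged into fields like `house_number` or `pincode`. The sketch below shows one common way to do that, assuming a BIO tagging scheme (`B-`/`I-` prefixes plus `O`). The model card does not state which scheme the model uses, so treat this as an assumption. The function `group_bio_tags` is hypothetical, and the tags are hand-written for illustration rather than actual model output.

```python
from typing import List, Optional, Tuple


def group_bio_tags(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged tokens into (entity_type, text) spans."""
    spans: List[Tuple[str, List[str]]] = []
    current: Optional[str] = None  # entity type of the span being built
    for token, tag in zip(tokens, tags):
        if tag == "O":
            current = None
            continue
        prefix, _, etype = tag.partition("-")
        if prefix == "I" and current == etype:
            # Continue the open span of the same type
            spans[-1][1].append(token)
        else:
            # "B-" tag, or an "I-" tag with no matching open span: start a new span
            spans.append((etype, [token]))
            current = etype
    return [(etype, " ".join(words)) for etype, words in spans]


# Hand-written tags (assumed, not model output) for the address in the Usage section
tokens = ["PLOT", "NO752", "FIRST", "FLOOR", ",", "BLOCK", "H-3", ",",
          "NEW", "DELHI", ",", "110041"]
tags = ["B-HOUSE_NUMBER", "I-HOUSE_NUMBER", "B-FLOOR", "I-FLOOR", "O",
        "B-BLOCK", "I-BLOCK", "O", "B-CITY", "I-CITY", "O", "B-PINCODE"]

spans = group_bio_tags(tokens, tags)
for entity_type, text in spans:
    print(f"{entity_type}: {text}")
```

With a CRF layer on top of BERT, invalid transitions such as `O` followed by `I-CITY` are typically penalized or disallowed during decoding, so the fallback branch for stray `I-` tags is mainly defensive.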