Model Description
indoor-geoai is a deep learning model specialized for the geolocalisation of residential indoor images.
Technical Architecture:
- Base Model: DeiT-384
- Fine-tuning: The model was fine-tuned specifically for Deep Hashing on indoor scenes. It learns to map high-dimensional visual features into compact binary hash codes. Crucially, the training objective forces the model to converge on distinct vector representations for each country class, ensuring that images from the same country cluster tightly together in the Hamming space while remaining separable from other regions.
This is a retrieval-based system. It predicts location by finding the nearest visual neighbors in a reference database based on the learned hash codes.
- Model Type: Deep Hashing Network (DeiT-based)
- Task: Indoor Image Geolocation / Image Retrieval
- License: Apache-2.0
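The retrieval step described above can be illustrated with a minimal sketch. It is not the released inference code: the sign-threshold binarization, the k-nearest-neighbor majority vote, the function names (`binarize`, `hamming`, `predict_country`), and the 8-bit toy codes are all assumptions made for illustration. The actual code length and voting rule of the model are not stated here.

```python
from collections import Counter


def binarize(features):
    """Pack a real-valued feature vector into an int, one bit per value.

    Assumption: a bit is 1 when the value is positive (sign thresholding).
    """
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming(a, b):
    """Count the bits that differ between two packed binary codes."""
    return bin(a ^ b).count("1")


def predict_country(query_code, reference, k=3):
    """Return the majority country among the k nearest reference codes.

    `reference` is a list of (code, country) pairs. Assumption: a plain
    majority vote over the k smallest Hamming distances.
    """
    ranked = sorted(reference, key=lambda item: hamming(query_code, item[0]))
    votes = Counter(country for _, country in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy reference database with made-up 8-bit codes (not real model output).
reference = [
    (0b11110000, "Germany"),
    (0b11100000, "Germany"),
    (0b00001111, "Japan"),
    (0b00011111, "Japan"),
]

# Hypothetical feature vector standing in for the network's output.
query = binarize([0.9, 0.4, 0.7, -0.2, -0.5, -0.1, -0.3, -0.8])
print(predict_country(query, reference, k=3))
```

Packing codes into integers keeps the Hamming distance a single XOR plus a bit count, which is why binary hash codes are cheap to compare against a large reference database.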
Supported Countries
While the associated paper evaluates the methodology on a strict subset of 6 countries, this specific model release comes pre-configured to geolocate images from 14 distinct countries across three regions:
| Region | Supported Countries |
|---|---|
| Europe | Germany, Poland, Norway, France, Hungary |
| Asia | Pakistan, Kazakhstan, Japan, South Korea |
| Latin America | Bolivia, Chile, Argentina, Colombia, Peru |
Training Data
The model was trained on a proprietary dataset consisting exclusively of residential indoor images sourced from the 14 countries listed above, with each country treated as one class.
Performance
The model was evaluated on a held-out test set (separate from the training data) containing residential images from the same 14 countries.
- Accuracy: Approximately 63% (Top-1 retrieval accuracy without post-filtering).
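For readers who want to reproduce the metric on their own data, the sketch below shows one way Top-1 accuracy could be computed. It assumes that "Top-1" means the country of the single best retrieval result is compared with the true country of the query image. The function name `top1_accuracy` and the four-image example are hypothetical and unrelated to the reported 63% figure.

```python
def top1_accuracy(predicted, actual):
    """Fraction of queries whose top-ranked country matches the true country."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one query is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up labels for four query images (illustration only).
predicted = ["Germany", "Japan", "Chile", "Peru"]
actual = ["Germany", "Japan", "Peru", "Peru"]
print(top1_accuracy(predicted, actual))  # 0.75
```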
Citation
If you use this code or model in your research, please cite our IEEE Access paper:
Paper Link: Multi-Source Visual Language Model Fusion for Indoor Geolocation Reliability
@ARTICLE{Konstantinou2026MultiSource,
author={Konstantinou, Nikolaos and Semertzidis, Theodoros and Daras, Petros},
journal={IEEE Access},
title={Multi-Source Visual Language Model Fusion for Indoor Geolocation Reliability},
year={2026},
volume={14},
pages={13202-13217},
doi={10.1109/ACCESS.2026.3656619}
}