---
license: mit
language:
- en
library_name: smart-knn
pipeline_tag: tabular-classification
model_name: SmartKNN v2
metrics:
- accuracy
- f1
- r_squared
- mse
tags:
- knn
- nearest-neighbors
- tabular
- classification
- regression
- cpu
- low-latency
- ann
- distance-weighted
- interpretable
- production-ready
---

# SmartKNN v2

**SmartKNN v2** is a high-performance, CPU-first nearest-neighbors model designed for **low-latency production inference** on real-world tabular data. It delivers **accuracy competitive with gradient-boosted models** while maintaining **sub-millisecond single-prediction latency (p95)** on CPU-only systems.

SmartKNN v2 is part of the **SmartEco** ecosystem.

---

## Model Details

- **Model type:** Distance-weighted K-Nearest Neighbors
- **Tasks:** Classification, Regression
- **Backend:** Adaptive (brute-force + ANN)
- **Hardware:** CPU-only (no GPU required)
- **Focus:** Low latency, interpretability, production readiness

Unlike classical KNN, SmartKNN v2 learns per-feature importance weights, adapts its execution strategy to the dataset size, and uses optimized distance kernels for fast inference.

---

## What’s New in v2

- Full classification support restored
- ANN backend introduced for scalable neighbor search
- Automatic backend selection (small → brute-force, large → ANN)
- Distance-weighted voting for improved accuracy
- Interpretable neighbor-influence statistics
- Foundation for adaptive-K strategies

---

## Architecture Overview

- Feature Weighting
- Backend Selector
  - Brute-force Backend (small datasets)
  - ANN Backend (large datasets)
- Distance Kernel
- Weighted Voting
- Prediction

This hybrid architecture keeps latency consistently low across dataset sizes.

---

## Performance (Internal Evaluation)

> Public benchmarks will be released soon.
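Until then, single-query latency of a plain brute-force neighbor search is easy to measure as a baseline. The sketch below is generic NumPy code, not SmartKNN's implementation; the dataset shape and `k` are arbitrary choices for illustration:

```python
import time
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50_000, 20))   # synthetic training set: 50k rows, 20 features
q = rng.standard_normal(20)             # a single query point

# Time one brute-force k-NN query: full distance pass + partial sort
t0 = time.perf_counter()
d = np.linalg.norm(X - q, axis=1)       # Euclidean distance to every training row
neighbors = np.argpartition(d, 5)[:5]   # indices of the 5 nearest points (unordered)
elapsed_ms = (time.perf_counter() - t0) * 1000
print(f"brute-force query over {len(X):,} points: {elapsed_ms:.2f} ms")
```

Brute force scales linearly with the number of rows, which is why an ANN backend takes over on large datasets.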
From internal testing on real-world tabular datasets:

- Accuracy comparable to XGBoost / LightGBM / CatBoost
- Single-prediction latency:
  - Median: sub-millisecond
  - p95: consistently low on CPU
- Predictable batch-inference scaling

SmartKNN v2 has **not yet reached its performance ceiling**; future releases will further optimize speed and accuracy.

---

## Limitations

- Not designed for unstructured data (text, images)
- ANN backend targets CPU efficiency, not GPU acceleration
- Best suited to tabular datasets

---

## Future Work

- Adaptive-K accuracy optimization
- Kernel-level speed improvements
- Custom ANN backend

---

## Links

- Website: https://thatipamula-jashwanth.github.io/SmartEco/
- Source Code: https://github.com/thatipamula-jashwanth/smart-knn
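For readers unfamiliar with the distance-weighted voting mentioned in the architecture overview, here is a minimal NumPy sketch of the general technique. This is textbook weighted KNN, not SmartKNN's actual code; the function name and the inverse-distance weighting scheme are illustrative assumptions:

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x, k=5, eps=1e-9):
    """Generic distance-weighted KNN classification vote (illustrative only)."""
    # Euclidean distance from the query to every training point
    d = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(d)[:k]          # indices of the k nearest neighbors
    w = 1.0 / (d[idx] + eps)         # closer neighbors get larger weight
    # Sum the weights per class and return the class with the heaviest vote
    classes = np.unique(y_train[idx])
    scores = {c: w[y_train[idx] == c].sum() for c in classes}
    return max(scores, key=scores.get)

X = np.array([[0., 0.], [0., 1.], [5., 5.], [5., 6.]])
y = np.array([0, 0, 1, 1])
print(weighted_knn_predict(X, y, np.array([0.2, 0.3]), k=3))  # prints 0
```

Compared with majority voting, inverse-distance weights let a single very close neighbor outvote several distant ones, which tends to help near class boundaries.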