---
language:
- en
license: mit
library_name: smart-knn
pipeline_tag: tabular-classification
model_name: SmartKNN v2
metrics:
- accuracy
- f1
- r_squared
- mse
tags:
- knn
- nearest-neighbors
- tabular
- classification
- regression
- cpu
- low-latency
- ann
- distance-weighted
- interpretable
- production-ready
---

# SmartKNN v2

**SmartKNN v2** is a high-performance, CPU-first nearest-neighbors model designed for **low-latency production inference** on real-world tabular data.

It delivers **accuracy competitive with gradient-boosted models** while maintaining **sub-millisecond single-prediction latency (p95)** on CPU-only systems.

SmartKNN v2 is part of the **SmartEco** ecosystem.

---

## Model Details

- **Model type:** Distance-weighted K-Nearest Neighbors
- **Tasks:** Classification, Regression
- **Backend:** Adaptive (brute-force + ANN)
- **Hardware:** CPU-only (GPU not required)
- **Focus:** Low latency, interpretability, production readiness

Unlike classical KNN, SmartKNN v2 learns per-feature importance weights, adapts its execution strategy to the dataset size, and uses optimized distance kernels for fast inference.
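The learned feature weights plug straight into the distance computation. A minimal sketch of a weighted squared-Euclidean kernel (NumPy; the weight values here are illustrative, and SmartKNN's actual weight-learning procedure is not shown):

```python
import numpy as np

def weighted_sq_distances(X, q, w):
    """Squared Euclidean distance from query q to each row of X,
    scaling each feature's contribution by its importance weight w."""
    diff = X - q                 # shape: (n_samples, n_features)
    return (diff * diff) @ w     # weighted sum of squared differences

# Toy data: feature 2 is weighted 10x more important than feature 1.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
w = np.array([1.0, 10.0])
d = weighted_sq_distances(X, np.array([0.0, 0.0]), w)
# A mismatch on the heavily weighted feature pushes that
# neighbor much farther away (d[2] > d[1]).
```

Because the kernel is a single vectorized matrix expression, the brute-force path can evaluate it over every stored row per query, which is part of why CPU-only inference stays fast on small data.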

---

## What’s New in v2

- Full classification support restored
- ANN backend introduced for scalable neighbor search
- Automatic backend selection (small → brute force, large → ANN)
- Distance-weighted voting for improved accuracy
- Interpretable neighbor-influence statistics
- Foundation for adaptive-K strategies
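The automatic small → brute, large → ANN switch can be pictured as a simple size-threshold rule (a sketch only: the threshold value and backend names are assumptions, not the library's real API):

```python
def select_backend(n_samples: int, threshold: int = 50_000) -> str:
    """Choose a neighbor-search backend from dataset size.

    Brute force is exact and has no index-build cost, so it wins on
    small data; an ANN index amortizes its build cost on large data.
    """
    return "brute" if n_samples <= threshold else "ann"
```

In practice the crossover point depends on dimensionality and hardware, so a fixed threshold like this is only a first approximation of what an adaptive selector does.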

---

## Architecture Overview

- Feature Weighting
- Backend Selector
- Brute Backend (small datasets)
- ANN Backend (large datasets)
- Distance Kernel
- Weighted Voting
- Prediction

This hybrid architecture ensures consistently low latency across dataset sizes.
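The Weighted Voting stage above can be sketched as inverse-distance voting over the retrieved neighbors (a minimal illustration, not SmartKNN's exact weighting scheme):

```python
import numpy as np

def distance_weighted_vote(neighbor_labels, neighbor_dists, eps=1e-12):
    """Pick the class whose neighbors carry the largest total
    inverse-distance weight; closer neighbors count more."""
    weights = 1.0 / (neighbor_dists + eps)
    scores = {}
    for label, wgt in zip(neighbor_labels, weights):
        scores[label] = scores.get(label, 0.0) + wgt
    return max(scores, key=scores.get)

# Two distant class-0 neighbors vs one very close class-1 neighbor:
labels = np.array([0, 0, 1])
dists = np.array([2.0, 2.0, 0.1])
pred = distance_weighted_vote(labels, dists)  # the close neighbor outvotes the majority
```

Unweighted majority voting would return class 0 here; distance weighting lets the single nearby neighbor dominate, which is the accuracy improvement the v2 changelog refers to.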

---

## Performance (Internal Evaluation)

> Public benchmarks will be released soon.

From internal testing on real-world tabular datasets:

- Accuracy comparable to XGBoost / LightGBM / CatBoost
- Single-prediction latency:
  - Median: sub-millisecond
  - p95: consistently low on CPU
- Predictable batch-inference scaling

SmartKNN v2 has **not yet reached its performance ceiling**. Future releases will further optimize speed and accuracy.
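Median and p95 single-prediction latency figures like those above can be reproduced with a plain timing loop. A sketch using a stand-in predictor (any single-row callable can take the place of `predict`):

```python
import time
import numpy as np

def latency_percentiles_ms(predict, x, n_runs=1000, warmup=50):
    """Time repeated single-row predictions; return (p50, p95) in ms."""
    for _ in range(warmup):                  # warm caches before measuring
        predict(x)
    samples = []
    for _ in range(n_runs):
        t0 = time.perf_counter()
        predict(x)
        samples.append((time.perf_counter() - t0) * 1e3)
    return float(np.percentile(samples, 50)), float(np.percentile(samples, 95))

# Stand-in predictor; swap in a real single-row model call.
p50, p95 = latency_percentiles_ms(lambda row: row.sum(), np.zeros(32))
```

Reporting p95 alongside the median, as the internal evaluation does, captures tail latency rather than just the typical case.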

---

## Limitations

- Not designed for unstructured data (text, images)
- ANN backend focuses on CPU efficiency, not GPU acceleration
- Best suited for tabular datasets

---

## Future Work

- Adaptive-K accuracy optimization
- Kernel-level speed improvements
- Custom ANN backend

---

## Links

- Website: https://thatipamula-jashwanth.github.io/SmartEco/
- Source Code: https://github.com/thatipamula-jashwanth/smart-knn