---
license: mit
language:
- en
library_name: smart-knn
pipeline_tag: tabular-classification
model_name: SmartKNN v2
metrics:
- accuracy
- f1
- r_squared
- mse
tags:
- knn
- nearest-neighbors
- tabular
- classification
- regression
- cpu
- low-latency
- ann
- distance-weighted
- interpretable
- production-ready
---

# SmartKNN v2

**SmartKNN v2** is a high-performance, CPU-first nearest-neighbors model designed for **low-latency production inference** on real-world tabular data.

It delivers **accuracy competitive with gradient-boosted models** while maintaining **sub-millisecond median single-prediction latency** on CPU-only systems.

SmartKNN v2 is part of the **SmartEco** ecosystem.

---

## Model Details

- **Model type:** Distance-weighted K-Nearest Neighbors  
- **Tasks:** Classification, Regression  
- **Backend:** Adaptive (Brute-force + ANN)  
- **Hardware:** CPU-only (GPU not required)  
- **Focus:** Low latency, interpretability, production readiness  

Unlike classical KNN, SmartKNN v2 learns per-feature weights, selects its search backend based on dataset size, and uses optimized distance kernels for fast inference.
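The actual kernels are not published in this card, but the core idea combines a feature-weighted metric with distance-weighted voting. A minimal sketch (the function name, the `1e-12` smoothing constant, and the tie-breaking are all illustrative, not the smart-knn API) might look like:

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x, k=5, feature_weights=None):
    """Distance-weighted KNN classification with per-feature weights.

    feature_weights scales each dimension before the Euclidean distance,
    so informative features dominate the neighbor search.
    """
    w = np.ones(X_train.shape[1]) if feature_weights is None else np.asarray(feature_weights)
    d = np.sqrt(((X_train - x) ** 2 * w).sum(axis=1))  # weighted Euclidean distance
    idx = np.argsort(d)[:k]                            # indices of the k nearest rows
    votes = {}
    for i in idx:
        # Closer neighbors get larger votes; the epsilon avoids division by zero.
        votes[y_train[i]] = votes.get(y_train[i], 0.0) + 1.0 / (d[i] + 1e-12)
    return max(votes, key=votes.get)
```

For regression, the same inverse-distance weights would average the neighbors' targets instead of voting.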

---

## What’s New in v2

- Full classification support restored
- ANN backend introduced for scalable neighbor search
- Automatic backend selection (small → brute, large → ANN)
- Distance-weighted voting for improved accuracy
- Interpretable neighbor influence statistics
- Foundation for adaptive-K strategies

---

## Architecture Overview

- Feature Weighting
- Backend Selector
- Brute Backend (small datasets)
- ANN Backend (large datasets)
- Distance Kernel
- Weighted Voting
- Prediction


This hybrid architecture ensures consistent low latency across dataset sizes.
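The backend selector can be pictured as a simple size-based dispatch; the threshold below is purely illustrative, since the real cutoff used by SmartKNN v2 is not documented here:

```python
# Hypothetical cutoff: exact brute-force search stays fast enough below it,
# while approximate nearest-neighbor (ANN) search pays off above it.
BRUTE_FORCE_MAX_ROWS = 50_000

def select_backend(n_rows: int) -> str:
    """Pick the neighbor-search backend from the training-set size."""
    return "brute" if n_rows <= BRUTE_FORCE_MAX_ROWS else "ann"
```

On small data, exact search is both faster and exactly correct; the ANN backend trades a small amount of recall for scalability.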

---

## Performance (Internal Evaluation)

> Public benchmarks will be released soon.

From internal testing on real-world tabular datasets:

- Accuracy comparable to XGBoost / LightGBM / CatBoost
- Single-prediction latency:
  - Median: sub-millisecond
  - p95: consistently low on CPU
- Predictable batch inference scaling

SmartKNN v2 has **not yet reached its performance ceiling**. Future releases will further optimize speed and accuracy.
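Until public benchmarks are released, a p95 single-prediction measurement can be reproduced on your own data with a small timing helper (a hypothetical utility, not part of the smart-knn API):

```python
import time
import numpy as np

def p95_latency_ms(predict_fn, queries, warmup=10):
    """Time predict_fn on each query; return the 95th-percentile latency in ms."""
    for q in queries[:warmup]:
        predict_fn(q)  # warm caches and any lazy initialization before timing
    samples = []
    for q in queries:
        t0 = time.perf_counter()
        predict_fn(q)
        samples.append((time.perf_counter() - t0) * 1000.0)
    return float(np.percentile(samples, 95))
```

Measuring one query at a time, rather than a batch, matches the single-prediction serving pattern the latency claims refer to.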

---

## Limitations

- Not designed for unstructured data (text, images)
- ANN backend focuses on CPU efficiency, not GPU acceleration
- Best suited for tabular datasets

---

## Future Work

- Adaptive-K accuracy optimization
- Kernel-level speed improvements
- Custom ANN backend

## Links

- Website: https://thatipamula-jashwanth.github.io/SmartEco/
- Source Code: https://github.com/thatipamula-jashwanth/smart-knn