---
license: apache-2.0
library_name: numpy
tags:
  - tabular-classification
  - tiny-model
  - edge-ai
  - no-gpu
  - numpy
  - real-time
  - ecg
  - eeg
  - seizure-detection
  - activity-recognition
  - medical-ai
  - biosignal
  - analytic-gradients
datasets:
  - shayanfazeli/heartbeat
  - birdy654/eeg-brainwave-dataset-feeling-emotions
  - robikscube/eye-state-classification-eeg-dataset
  - harunshimanto/epileptic-seizure-recognition
  - uciml/human-activity-recognition-with-smartphones
metrics:
  - accuracy
  - f1
  - roc_auc
model-index:
  - name: KestrelNet / GoshawkNet Benchmark Suite
    results:
      - task:
          type: tabular-classification
          name: ECG Arrhythmia Detection
        dataset:
          type: shayanfazeli/heartbeat
          name: MIT-BIH Arrhythmia
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.972
          - name: Macro F1
            type: f1
            value: 0.853
      - task:
          type: tabular-classification
          name: EEG Emotion Recognition
        dataset:
          type: birdy654/eeg-brainwave-dataset-feeling-emotions
          name: EEG Brainwave Emotions
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.991
          - name: Macro F1
            type: f1
            value: 0.991
      - task:
          type: tabular-classification
          name: EEG Eye State Detection
        dataset:
          type: robikscube/eye-state-classification-eeg-dataset
          name: EEG Eye State (UCI)
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.942
          - name: AUC-ROC
            type: roc_auc
            value: 0.986
      - task:
          type: tabular-classification
          name: Epileptic Seizure Detection
        dataset:
          type: harunshimanto/epileptic-seizure-recognition
          name: Bonn University EEG
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.971
          - name: AUC-ROC
            type: roc_auc
            value: 0.988
      - task:
          type: tabular-classification
          name: Human Activity Recognition
        dataset:
          type: uciml/human-activity-recognition-with-smartphones
          name: UCI HAR Smartphones
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.949
          - name: Macro F1
            type: f1
            value: 0.949
pipeline_tag: tabular-classification
---

# KestrelNet / GoshawkNet — Benchmark Suite

**Here's what a tiny model can do.**

Five public datasets. Five tasks. All under 164K parameters. All CPU-only. All pure NumPy: no PyTorch, no TensorFlow, no GPU. Four of the five results are verified on Kaggle with live scoring; the seizure model is the exception.

## Results

| Dataset | Domain | Task | Accuracy | F1 / AUC | Params | Size | Latency |
|---|---|---|---|---|---|---|---|
| [MIT-BIH Arrhythmia](https://kaggle.com/datasets/shayanfazeli/heartbeat) | Cardiology | 5-class ECG | **97.2%** | F1 0.853 | 12,756 | 50 KB | 56 μs |
| [EEG Brainwave Emotions](https://kaggle.com/datasets/birdy654/eeg-brainwave-dataset-feeling-emotions) | Neuroscience | 3-class EEG | **99.1%** | F1 0.991 | 163,788 | 640 KB | 1.3 ms |
| [EEG Eye State](https://kaggle.com/datasets/robikscube/eye-state-classification-eeg-dataset) | Neuroscience | Binary EEG | **94.2%** | AUC 0.986 | 1,576 | 6 KB | 17 μs |
| [Epileptic Seizure](https://kaggle.com/datasets/harunshimanto/epileptic-seizure-recognition) | Neurology | Binary EEG | **97.1%** | AUC 0.988 | 12,072 | 47 KB | — |
| [HAR Smartphones](https://kaggle.com/datasets/uciml/human-activity-recognition-with-smartphones) | Wearables | 6-class IMU | **94.9%** | F1 0.949 | 15,416 | 60 KB | 70 μs |

Total model storage for all five: **803 KB**.

For context, a single Transformer layer of BERT-base has roughly 7 million parameters. Our five models combined have 205,608.
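
The totals above can be reproduced from the per-model parameter counts. The sketch below assumes 4 bytes per parameter (float32) and 1 KB = 1024 bytes; both are assumptions inferred from the table, and the on-disk size of the text-format `weights.txt` files may differ.

```python
# Parameter counts copied from the results table above.
PARAMS = {
    "ecg-heartbeat": 12_756,
    "eeg-emotions": 163_788,
    "eye-state": 1_576,
    "seizure-prediction": 12_072,
    "har-smartphones": 15_416,
}

BYTES_PER_PARAM = 4  # assumption: float32 storage


def size_kib(n_params: int) -> int:
    """Model size in KiB (1024 bytes), rounded to the nearest integer."""
    return round(n_params * BYTES_PER_PARAM / 1024)


total_params = sum(PARAMS.values())
total_kib = sum(size_kib(n) for n in PARAMS.values())

for name, n in PARAMS.items():
    print(f"{name:20s} {n:>8,d} params  ~{size_kib(n)} KB")
print(f"{'total':20s} {total_params:>8,d} params  ~{total_kib} KB")
```

Under these assumptions the per-model sizes round to the values in the table, and the totals come out to 205,608 parameters and 803 KB.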

## How Small Is Small?

| Dataset | Typical CNN/LSTM | Ours | How much smaller |
|---|---|---|---|
| ECG Heartbeat | 500K – 2M params | 12,756 | **40–160x** |
| EEG Emotions | 1M+ params | 163,788 | **6x** |
| EEG Eye State | 100K+ params | 1,576 | **63x** |
| Seizure Detection | 200K+ params | 12,072 | **17x** |
| HAR Smartphones | 200K – 1M params | 15,416 | **13–65x** |

## Two Model Families

We ship two architectures, named after raptors — bird size matches model size, hunting style matches classification style.

### KestrelNet (Standard FC)

The kestrel is one of the smallest falcons. It hovers perfectly still, then strikes with precision. KestrelNet is a standard fully-connected network with ReLU activations. Minimal parameters, maximum accuracy.

```
Input → Dense(hidden₁, ReLU) → Dense(hidden₂, ReLU) → Dense(classes, Softmax)
```
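
A minimal NumPy sketch of this forward pass follows. The hidden sizes (16 and 8), the He-style initialization, and the function names are illustrative assumptions, not the contents of `inference.py`.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def init_params(n_in, hidden1, hidden2, n_classes, seed=0):
    """Random (W, b) pairs for Input -> hidden1 -> hidden2 -> classes."""
    rng = np.random.default_rng(seed)
    sizes = [n_in, hidden1, hidden2, n_classes]
    return [
        (rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out)), np.zeros(fan_out))
        for fan_in, fan_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(params, x):
    """Dense+ReLU for every layer except the last, which is Dense+Softmax."""
    h = np.atleast_2d(x)
    for W, b in params[:-1]:
        h = relu(h @ W + b)
    W, b = params[-1]
    return softmax(h @ W + b)


params = init_params(n_in=187, hidden1=16, hidden2=8, n_classes=5)
proba = forward(params, np.zeros(187))
print(proba.shape)  # (1, 5)
```

With an all-zero input and zero biases every logit is zero, so this untrained network returns a uniform 0.2 per class; trained weights would replace `init_params`.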

### GoshawkNet (Multivector Products)

The goshawk is a larger raptor that hunts in complex terrain, reading patterns others miss. GoshawkNet replaces standard dot products with multivector products, so a single weight can express a rotation, reflection, or scaling of its input. This costs more parameters per connection, but it is meant to capture geometric structure in the data that FC nets would need more layers to approximate.
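
For the quaternion case, the multivector product is the Hamilton product. The sketch below is a generic NumPy version, assuming quaternions are stored as length-4 arrays in `(w, x, y, z)` order; it illustrates the operation and is not the repository's implementation. The helper names `hamilton`, `conjugate`, and `rotate` are hypothetical.

```python
import numpy as np


def hamilton(p, q):
    """Hamilton product of quaternions stored as (..., 4) arrays in (w, x, y, z) order."""
    pw, px, py, pz = np.moveaxis(np.asarray(p, dtype=float), -1, 0)
    qw, qx, qy, qz = np.moveaxis(np.asarray(q, dtype=float), -1, 0)
    return np.stack(
        [
            pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw,
        ],
        axis=-1,
    )


def conjugate(q):
    return np.asarray(q, dtype=float) * np.array([1.0, -1.0, -1.0, -1.0])


def rotate(v, axis, angle):
    """Rotate 3-vector v about `axis` by `angle` radians via q * v * conj(q)."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    q = np.concatenate([[np.cos(angle / 2)], np.sin(angle / 2) * axis])
    v_quat = np.concatenate([[0.0], np.asarray(v, dtype=float)])
    return hamilton(hamilton(q, v_quat), conjugate(q))[1:]


i = np.array([0.0, 1.0, 0.0, 0.0])
j = np.array([0.0, 0.0, 1.0, 0.0])
print(hamilton(i, j))  # i * j = k
print(np.round(rotate([1.0, 0.0, 0.0], axis=[0.0, 0.0, 1.0], angle=np.pi / 2), 6))
```

Multiplying by a unit quaternion rotates; multiplying by a non-unit quaternion also scales, because the norm of a product is the product of the norms.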

**Best model per dataset:**

| Dataset | Best Model | Architecture |
|---|---|---|
| ECG Heartbeat | GoshawkNet Cl(0,2) | Quaternion, [16, 8] hidden |
| EEG Emotions | GoshawkNet Cl(0,2) | Quaternion, [16, 8] hidden |
| EEG Eye State | GoshawkNet Cl(0,2) | Quaternion, [16, 8] hidden |
| Seizure Detection | GoshawkNet Cl(0,2) | Quaternion, [16, 8] hidden |
| HAR Smartphones | GoshawkNet Cl(0,2) | Quaternion, [16, 8] hidden |

Quaternion algebra (Cl(0,2), dimension 4) gives the best model on all five datasets.
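
The published parameter counts are consistent with a layout in which every connection and every bias is one 4-component quaternion, with one output unit per class (two for the binary tasks). This layout is inferred from the numbers in this README, not read from the repository code, and the function name below is hypothetical.

```python
def goshawk_param_count(n_features, n_classes, hidden=(16, 8), dim=4):
    """Real-valued parameter count, assuming each weight and each bias
    is one `dim`-component multivector (dim=4 for quaternions)."""
    sizes = [n_features, *hidden, n_classes]
    total = 0
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        total += dim * (fan_in * fan_out + fan_out)  # weights + biases
    return total


# (features, classes, published parameter count) from the sections of this README
PUBLISHED = {
    "ECG Heartbeat": (187, 5, 12_756),
    "EEG Emotions": (2_548, 3, 163_788),
    "EEG Eye State": (14, 2, 1_576),
    "Seizure Detection": (178, 2, 12_072),
    "HAR Smartphones": (228, 6, 15_416),
}

for name, (n_features, n_classes, published) in PUBLISHED.items():
    computed = goshawk_param_count(n_features, n_classes)
    print(f"{name:18s} computed={computed:>7,d} published={published:>7,d}")
```

Under this assumption the first layer dominates: it holds 64 real parameters per input feature, which is why the 2,548-feature EEG Emotions model is by far the largest.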

## Per-Dataset Details

### ECG Heartbeat — MIT-BIH Arrhythmia Database

- **Samples**: 87,554 train / 21,892 test
- **Features**: 187 time-series values per heartbeat
- **Classes**: Normal (N), Supraventricular (S), Ventricular (V), Fusion (F), Unknown (Q)
- **Best model**: GoshawkNet Cl(0,2) [16,8] — 97.2% accuracy, 12,756 params
- **Kaggle notebook**: [samareddy94/gnaninet-ecg-benchmark](https://www.kaggle.com/code/samareddy94/gnaninet-ecg-benchmark)

| Class | Accuracy |
|---|---|
| Normal (N) | 99.2% |
| Supraventricular (S) | 64.6% |
| Ventricular (V) | 90.9% |
| Fusion (F) | 63.0% |
| Unknown (Q) | 95.9% |

### EEG Brainwave Emotions

- **Samples**: 2,132 (1,707 train / 425 test)
- **Features**: 2,548 EEG features (channel means + FFT)
- **Classes**: Negative, Neutral, Positive
- **Best model**: GoshawkNet Cl(0,2) [16,8] — 99.1% accuracy, 163,788 params
- **Kaggle notebook**: [samareddy94/99-eeg-emotion-detection-164k-params-no-gpu](https://www.kaggle.com/code/samareddy94/99-eeg-emotion-detection-164k-params-no-gpu)

| Class | Accuracy |
|---|---|
| Negative | 99.3% |
| Neutral | 100.0% |
| Positive | 97.9% |

### EEG Eye State — UCI / Roesler

- **Samples**: 14,980 (11,985 train / 2,995 test)
- **Features**: 14 EEG channels (AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, AF4)
- **Classes**: Eyes Open, Eyes Closed
- **Best model**: GoshawkNet Cl(0,2) [16,8] — 94.2% accuracy, 1,576 params
- **Kaggle notebook**: [samareddy94/gnaninet-eeg-eyestate-benchmark](https://www.kaggle.com/code/samareddy94/gnaninet-eeg-eyestate-benchmark)

The smallest model in the suite: **1,576 parameters, 6 KB**. At 17 μs per inference, it runs at roughly 60,000 inferences/sec on CPU.

### Epileptic Seizure Recognition — Bonn University

- **Samples**: 11,500 (9,200 train / 2,300 test)
- **Features**: 178 EEG time-series values
- **Classes**: Seizure vs Non-seizure (binary)
- **Best model**: GoshawkNet Cl(0,2) [16,8] — 97.1% accuracy, AUC 0.988, 12,072 params

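An AUC of 0.988 means that, given one randomly chosen seizure sample and one randomly chosen non-seizure sample, the model scores the seizure sample higher 98.8% of the time. This ranking quality is critical for clinical screening.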
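
The ranking interpretation of AUC can be computed directly by comparing every positive/negative pair. This is a generic sketch with made-up scores, not the evaluation code used for the benchmark; ties count as half a correct ranking.

```python
import numpy as np


def auc_pairwise(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]  # every positive score minus every negative score
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


# Toy example: 2 seizure samples (label 1) and 2 non-seizure samples (label 0).
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
print(auc_pairwise(scores, labels))  # 0.75: 3 of the 4 pairs are ranked correctly
```

The pairwise form is quadratic in the number of samples, which is fine for a 2,300-sample test set; rank-based formulas give the same value more cheaply on large sets.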

### HAR Smartphones — UCI Activity Recognition

- **Samples**: 7,352 train / 2,947 test (official split)
- **Features**: 228 triaxial accelerometer + gyroscope features
- **Classes**: Walking, Walking Upstairs, Walking Downstairs, Sitting, Standing, Laying
- **Best model**: GoshawkNet Cl(0,2) [16,8] — 95.7% local / 94.9% Kaggle live, 15,416 params
- **Kaggle notebook**: [samareddy94/gnaninet-har-benchmark](https://www.kaggle.com/code/samareddy94/gnaninet-har-benchmark)

| Class | Accuracy |
|---|---|
| Walking | 99.0% |
| Walking Upstairs | 90.7% |
| Walking Downstairs | 96.4% |
| Sitting | 91.9% |
| Standing | 95.7% |
| Laying | 99.8% |

## Training Details

All models are trained with the same recipe; where a range is given, the value varies by dataset:

- **Optimizer**: Adam (lr=0.001, β₁=0.9, β₂=0.999)
- **LR Schedule**: Warmup-cosine (10-epoch warmup)
- **Early stopping**: Patience 30–40 on validation loss
- **Batch size**: 64–128
- **L2 regularization**: λ = 1e-4 to 1e-5
- **Gradient clipping**: 5.0
- **Normalization**: Z-score, fit on training set only
- **Backpropagation**: Analytic (hand-derived gradients, no autograd)
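
The optimizer-related items in this list can be sketched in NumPy as follows. Several details are assumptions because the list does not specify them: linear warmup, cosine decay to zero, clipping by global L2 norm (rather than by value), L2 added to the gradient (rather than decoupled weight decay), and `eps=1e-8`. All function names are hypothetical.

```python
import math
import numpy as np


def fit_zscore(X_train, eps=1e-8):
    """Mean and std from the training set only; reuse them for validation and test."""
    return X_train.mean(axis=0), X_train.std(axis=0) + eps


def warmup_cosine_lr(epoch, total_epochs, base_lr=1e-3, warmup_epochs=10):
    """Linear warmup to base_lr, then cosine decay towards zero."""
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


def clip_by_norm(grad, max_norm=5.0):
    norm = np.linalg.norm(grad)
    return grad * (max_norm / norm) if norm > max_norm else grad


def init_adam_state(w):
    return {"t": 0, "m": np.zeros_like(w), "v": np.zeros_like(w)}


def adam_step(w, grad, state, lr, beta1=0.9, beta2=0.999, eps=1e-8, l2=1e-4):
    """One Adam update with L2 regularization and gradient clipping."""
    grad = clip_by_norm(grad + l2 * w)
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad**2
    m_hat = state["m"] / (1 - beta1 ** state["t"])  # bias correction
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return w - lr * m_hat / (np.sqrt(v_hat) + eps)


# Usage on a toy quadratic loss 0.5 * ||w||^2, whose gradient is w itself.
w = np.array([1.0, -2.0])
state = init_adam_state(w)
for epoch in range(200):
    w = adam_step(w, grad=w, state=state, lr=warmup_cosine_lr(epoch, 200))
print(w)
```

The learning rate is looked up once per epoch here for brevity; a per-batch schedule would follow the same shape.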

Training is fast — all five models train in under 10 minutes total on a laptop CPU.
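
Hand-derived gradients are usually validated against finite differences. The sketch below does this for a single softmax layer with cross-entropy loss, where the gradient with respect to the logits is `p - onehot(y)`. It is a generic illustration of the technique on random data, not the repository's backpropagation code, and it covers only one layer.

```python
import numpy as np


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def loss(W, b, X, y):
    """Mean cross-entropy of a single softmax layer."""
    p = softmax(X @ W + b)
    return -np.log(p[np.arange(len(y)), y]).mean()


def analytic_grads(W, b, X, y):
    """Hand-derived gradients: dL/dlogits = (p - onehot(y)) / N."""
    d_logits = softmax(X @ W + b)
    d_logits[np.arange(len(y)), y] -= 1.0
    d_logits /= len(y)
    return X.T @ d_logits, d_logits.sum(axis=0)


def numeric_grad(f, theta, h=1e-6):
    """Central finite differences; perturbs theta in place and restores it."""
    g = np.zeros_like(theta)
    for idx in np.ndindex(theta.shape):
        old = theta[idx]
        theta[idx] = old + h
        f_plus = f()
        theta[idx] = old - h
        f_minus = f()
        theta[idx] = old
        g[idx] = (f_plus - f_minus) / (2 * h)
    return g


rng = np.random.default_rng(0)
X = rng.normal(size=(32, 6))
y = rng.integers(0, 3, size=32)
W = rng.normal(scale=0.1, size=(6, 3))
b = np.zeros(3)

gW, gb = analytic_grads(W, b, X, y)
nW = numeric_grad(lambda: loss(W, b, X, y), W)
nb = numeric_grad(lambda: loss(W, b, X, y), b)
max_err = max(np.abs(gW - nW).max(), np.abs(gb - nb).max())
print(f"max abs difference: {max_err:.2e}")
```

A difference many orders of magnitude below the gradient values indicates that the analytic and numeric gradients agree.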

## Repository Structure

```
├── ecg-heartbeat/
│   ├── weights.txt        # GoshawkNet Cl(0,2) [16,8] — 97.2% accuracy
│   └── results.json       # Full benchmark comparison (4 models)
├── eeg-emotions/
│   ├── weights.txt        # GoshawkNet Cl(0,2) [16,8] — 99.1% accuracy
│   └── results.json
├── eye-state/
│   ├── weights.txt        # GoshawkNet Cl(0,2) [16,8] — 94.2% accuracy
│   └── results.json
├── seizure-prediction/
│   ├── weights.txt        # GoshawkNet Cl(0,2) [16,8] — 97.1% accuracy
│   └── results.json
├── har-smartphones/
│   ├── weights.txt        # GoshawkNet Cl(0,2) [16,8] — 94.9% accuracy
│   └── results.json
└── inference.py           # Self-contained inference loader (no dependencies beyond NumPy)
```

## Quick Start

```python
import numpy as np
from inference import load_model

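# Load any model by its directory name
model = load_model("ecg-heartbeat")
# Random input only checks the shape: one heartbeat is 187 values
proba = model.predict_proba(np.random.randn(187))
print(proba)  # 5-class probabilities, e.g. [0.92, 0.01, 0.05, 0.01, 0.01] (illustrative values)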
```

## Intended Use

- **Clinical screening**: Pre-filter for ECG/EEG analysis before specialist review
- **Edge deployment**: Wearables, IoT sensors, embedded devices — no GPU, no cloud
- **Ensemble first stage**: Fast, tiny model screens easy cases; complex model handles the rest
- **Research baseline**: Reproducible benchmarks on public datasets with minimal compute
- **Education**: Complete from-scratch neural network with analytic gradients
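
The ensemble first-stage use can be sketched as a confidence-thresholded cascade. Everything here is hypothetical: `tiny_proba` and `big_predict` are stand-in callables, and the 0.95 threshold is an arbitrary example that would need tuning on validation data.

```python
import numpy as np


def cascade_predict(x_batch, tiny_proba, big_predict, threshold=0.95):
    """Keep the tiny model's label when it is confident; defer the rest.

    tiny_proba(x)  -> (n, n_classes) class probabilities from the small model
    big_predict(x) -> (n,) integer labels from the larger second-stage model
    """
    proba = tiny_proba(x_batch)
    labels = proba.argmax(axis=1)
    uncertain = proba.max(axis=1) < threshold
    if uncertain.any():
        labels[uncertain] = big_predict(x_batch[uncertain])
    return labels, uncertain


# Stand-in models for illustration only.
def demo_tiny(x):
    return np.array([[0.99, 0.01], [0.60, 0.40], [0.03, 0.97]])


def demo_big(x):
    return np.ones(len(x), dtype=int)


labels, deferred = cascade_predict(np.zeros((3, 4)), demo_tiny, demo_big)
print(labels, deferred)
```

The fraction of deferred samples determines the average cost: the larger model only runs on the rows flagged in `deferred`.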

## Limitations

- Models are trained on tabular/flattened features, not raw waveforms
- Per-class accuracy varies: rare classes (ECG Fusion at 63.0%, ECG Supraventricular at 64.6%) score far below the 97.2% overall accuracy
- No sequence modeling — each sample is classified independently
- Medical models are NOT validated for clinical use — research benchmarks only

## Kaggle Verification

All results except the epileptic seizure model have been verified with live Kaggle notebook scoring:

| Dataset | Kaggle Notebook |
|---|---|
| ECG Heartbeat | [samareddy94/gnaninet-ecg-benchmark](https://www.kaggle.com/code/samareddy94/gnaninet-ecg-benchmark) |
| EEG Emotions | [samareddy94/99-eeg-emotion-detection-164k-params-no-gpu](https://www.kaggle.com/code/samareddy94/99-eeg-emotion-detection-164k-params-no-gpu) |
| EEG Eye State | [samareddy94/gnaninet-eeg-eyestate-benchmark](https://www.kaggle.com/code/samareddy94/gnaninet-eeg-eyestate-benchmark) |
| HAR Smartphones | [samareddy94/gnaninet-har-benchmark](https://www.kaggle.com/code/samareddy94/gnaninet-har-benchmark) |

## Citation

```bibtex
@misc{kestrelnet-benchmarks-2026,
  title={KestrelNet/GoshawkNet: Tiny Neural Classifiers for Biosignal and Sensor Data},
  author={Sama Reddy},
  year={2026},
  url={https://huggingface.co/reddysama/kestrelnet-benchmarks}
}
```

---

<p align="center">
<em>No PyTorch. No TensorFlow. No GPU. Just NumPy and math.</em><br>
<a href="https://huggingface.co/reddysama/gnaninet-fraud-classifier">Fraud Classifier</a> · 
<a href="https://huggingface.co/spaces/reddysama/gnaninet-fraud-classifier">Live Demo</a> · 
<a href="https://naninet.ai">Website</a>
</p>