---
license: mit
library_name: pytorch
tags:
- keystroke-dynamics
- lstm
- cybersecurity
- hid
---

# LSTM-5win-Keystrokes

## Summary
This repository contains a PyTorch **LSTM** classifier that detects **Human vs HID** keystroke control from **windowed** timing features.
Each window is assigned the label of its **last keystroke**.

## Training setup (as implemented)
- **Window size:** 5
- **Stride:** 1
- **Label mapping:** {"human": 0, "hid": 1}
- **Window label:** last-keystroke in the window
- **Features (F=10):**
  - `duration`
  - `DD_time`
  - `DU_time`
  - `UD_time`
  - `UU_time`
  - `run_avg_duration`
  - `run_avg_DD`
  - `run_avg_DU`
  - `run_avg_UD`
  - `run_avg_UU`
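
The windowing and labeling rules above can be sketched as follows. This is a minimal illustration, not the repository's training code: the helper name `make_windows` and the toy arrays are hypothetical, and it assumes the per-keystroke features are already in an `(N, F)` array in chronological order.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

WINDOW_SIZE = 5
LABEL_MAP = {"human": 0, "hid": 1}

def make_windows(features, labels, window_size=WINDOW_SIZE):
    """Hypothetical helper: build stride-1 windows and last-keystroke labels.

    features: (N, F) array of per-keystroke features
    labels:   (N,) array of integer labels
    returns:  X of shape (N - window_size + 1, window_size, F) and y of the same length
    """
    # sliding_window_view appends the window axis last: (N - W + 1, F, W)
    windows = sliding_window_view(features, window_shape=window_size, axis=0)
    # Reorder to (num_windows, W, F) and copy, since the view is read-only
    X = np.swapaxes(windows, 1, 2).copy()
    # Window i covers keystrokes i .. i + W - 1, so its label is labels[i + W - 1]
    y = labels[window_size - 1:]
    return X, y

# Toy data: 8 keystrokes, 10 features each (values are placeholders)
features = np.arange(80, dtype=np.float64).reshape(8, 10)
labels = np.array([LABEL_MAP[name] for name in
                   ["human", "human", "human", "human", "human", "hid", "hid", "human"]])
X, y = make_windows(features, labels)
```

With stride 1, a sequence of `N` keystrokes yields `N - 4` windows, and the first four keystrokes never act as a window's labeling keystroke.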

### Preprocessing
- Windows built with NumPy `sliding_window_view`.
- Standardization via `StandardScaler` fitted on **training windows only**; each feature's mean and variance are pooled across all samples and timesteps.
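
One way to fit a scaler "across all timesteps and samples" is to flatten the window axis before fitting and restore it afterwards. The sketch below assumes that approach; the helper names and the random data are illustrative, and the repository's actual code may differ.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

def fit_window_scaler(X_train):
    """Hypothetical helper: fit per-feature statistics on (N, T, F) training windows."""
    n, t, f = X_train.shape
    scaler = StandardScaler()
    # Treat every timestep of every window as one row: (N * T, F)
    scaler.fit(X_train.reshape(n * t, f))
    return scaler

def apply_window_scaler(scaler, X):
    """Hypothetical helper: scale (N, T, F) windows with an already fitted scaler."""
    n, t, f = X.shape
    return scaler.transform(X.reshape(n * t, f)).reshape(n, t, f)

# Placeholder data standing in for real training and test windows
rng = np.random.default_rng(0)
X_train = rng.normal(loc=100.0, scale=20.0, size=(64, 5, 10))
X_test = rng.normal(loc=100.0, scale=20.0, size=(16, 5, 10))

scaler = fit_window_scaler(X_train)          # statistics come from training data only
X_train_std = apply_window_scaler(scaler, X_train)
X_test_std = apply_window_scaler(scaler, X_test)
```

Fitting on training windows only keeps test-set statistics out of the preprocessing step; the same fitted scaler is then reused at inference time.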

### Model
- `torch.nn.LSTM` (unidirectional, batch_first)
- Hidden size: 64
- Num layers: 1
- Dropout: 0.0
- Head: Linear(hidden_size → 2)
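
A module matching the hyperparameters listed above might look like the sketch below. The class name `KeystrokeLSTM` is hypothetical, and feeding the final timestep's output to the head is an assumption; the card does not say which hidden state the head consumes.

```python
import torch
from torch import nn

class KeystrokeLSTM(nn.Module):  # hypothetical name, not taken from the repository
    def __init__(self, num_features=10, hidden_size=64, num_layers=1, num_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=num_features,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True,   # inputs are (batch, window, features)
            dropout=0.0,
        )
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out, _ = self.lstm(x)             # (batch, window, hidden_size)
        # Assumption: classify from the output at the last timestep
        return self.head(out[:, -1, :])   # (batch, num_classes) logits

model = KeystrokeLSTM()
logits = model(torch.randn(4, 5, 10))     # 4 windows of 5 keystrokes, 10 features
predicted = logits.argmax(dim=1)          # 0 = human, 1 = hid
```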

### Optimization
- Optimizer: Adam
- LR: 0.001
- Batch size: 256
- Epochs: 30
- Seed: 42
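
The settings above correspond to a training loop along these lines. This is a sketch on synthetic data: the cross-entropy loss, the shuffling, the class name `LSTMClassifier`, and the last-timestep readout are all assumptions not stated in this card.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(42)  # Seed: 42

# Synthetic stand-in for scaled training windows: (samples, window=5, features=10)
X_train = torch.randn(512, 5, 10)
y_train = torch.randint(0, 2, (512,))  # 0 = human, 1 = hid
# shuffle=True is an assumption
loader = DataLoader(TensorDataset(X_train, y_train), batch_size=256, shuffle=True)

class LSTMClassifier(nn.Module):  # hypothetical name
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(10, 64, num_layers=1, batch_first=True)
        self.head = nn.Linear(64, 2)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])  # assumption: last-timestep readout

model = LSTMClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()  # assumption: the card does not name the loss

epoch_losses = []
for epoch in range(30):  # Epochs: 30
    model.train()
    total = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
        total += loss.item() * xb.size(0)
    # Mean loss per sample over the epoch
    epoch_losses.append(total / len(loader.dataset))
```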

## Files
- `model.safetensors`: model weights
- `config.json`: architecture + feature metadata
- `scaler.joblib`: fitted StandardScaler
- `metrics.json`: classification report + confusion matrix
- `inference.py`: minimal loading + prediction example

## Usage (minimal)
```python
from inference import load_model_and_scaler, predict_df

model, scaler, cfg = load_model_and_scaler("NourFakih/LSTM-5win-Keystrokes")
y_pred = predict_df(df, model, scaler, cfg)  # df must contain cfg["feature_cols"]
```