---
license: mit
library_name: pytorch
tags:
- keystroke-dynamics
- lstm
- cybersecurity
- hid
---

# LSTM-5win-Keystrokes

## Summary
This repository contains a PyTorch **LSTM** classifier for **Human vs HID** keystroke control detection using **windowed** timing features.
Each window takes the label of its **last keystroke**.

## Training setup (as implemented)
- **Window size:** 5
- **Stride:** 1
- **Label mapping:** {"human": 0, "hid": 1}
- **Window label:** last-keystroke in the window
- **Features (F=10):**
  - `duration`
  - `DD_time`
  - `DU_time`
  - `UD_time`
  - `UU_time`
  - `run_avg_duration`
  - `run_avg_DD`
  - `run_avg_DU`
  - `run_avg_UD`
  - `run_avg_UU`

### Preprocessing
- Windows built with NumPy `sliding_window_view`.
- Standardization via `StandardScaler` fitted on **training windows only**, across all timesteps and samples.
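
The two preprocessing steps can be sketched as follows. This is a minimal illustration, not the repository's actual code: the helper names (`make_windows`, `fit_scaler`, `apply_scaler`) are hypothetical, the data is synthetic, and it assumes that "across all timesteps and samples" means flattening windows to shape `(N * window, F)` so that one mean and variance is computed per feature.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view
from sklearn.preprocessing import StandardScaler

WINDOW, STRIDE = 5, 1  # values from the training setup above


def make_windows(features, labels, window=WINDOW, stride=STRIDE):
    """Return (N, window, F) windows and the last-keystroke label of each."""
    # sliding_window_view appends the window axis last: (T - window + 1, F, window)
    view = sliding_window_view(features, window_shape=window, axis=0)
    X = view.transpose(0, 2, 1)[::stride]  # -> (N, window, F)
    y = labels[window - 1 :: stride]       # label of the last keystroke per window
    return X, y


def fit_scaler(X_train):
    """Fit per-feature statistics, pooling every timestep of every training window."""
    n, w, f = X_train.shape
    return StandardScaler().fit(X_train.reshape(n * w, f))


def apply_scaler(scaler, X):
    """Scale windows with an already fitted scaler, keeping the (N, window, F) shape."""
    n, w, f = X.shape
    return scaler.transform(X.reshape(n * w, f)).reshape(n, w, f)


# Synthetic stand-in data: 20 keystrokes with F=10 features (not real measurements).
rng = np.random.default_rng(42)
feats = rng.normal(size=(20, 10))
labs = rng.integers(0, 2, size=20)

X, y = make_windows(feats, labs)
scaler = fit_scaler(X)            # in practice: fit on training windows only
X_scaled = apply_scaler(scaler, X)
print(X.shape, y.shape)
```

At inference time the same fitted scaler would be reused rather than refitted, which is why `scaler.joblib` ships alongside the weights.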

### Model
- `torch.nn.LSTM` (unidirectional, batch_first)
- Hidden size: 64
- Num layers: 1
- Dropout: 0.0
- Head: Linear(hidden_size → 2)
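
A module matching the hyperparameters above might look like the sketch below. The class name `KeystrokeLSTM` is hypothetical, and feeding the final hidden state of the last layer into the linear head is an assumption; the card does not say which LSTM output the head consumes.

```python
import torch
from torch import nn


class KeystrokeLSTM(nn.Module):  # hypothetical name, not taken from the repository
    def __init__(self, n_features=10, hidden_size=64, num_layers=1, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True,
            dropout=0.0,
            bidirectional=False,
        )
        self.head = nn.Linear(hidden_size, n_classes)

    def forward(self, x):
        # x: (batch, window, n_features)
        _, (h_n, _) = self.lstm(x)
        # Assumption: classify from the last layer's final hidden state.
        return self.head(h_n[-1])  # (batch, n_classes) logits


model = KeystrokeLSTM()
logits = model(torch.randn(8, 5, 10))  # 8 windows of 5 keystrokes, 10 features each
print(logits.shape)
```

With the label mapping above, `logits.argmax(dim=1)` would yield 0 for human and 1 for HID.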

### Optimization
- Optimizer: Adam
- LR: 0.001
- Batch size: 256
- Epochs: 30
- Seed: 42
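
For orientation, a training loop using these settings could be sketched as below. The loss function is not stated in this card, so `CrossEntropyLoss` is an assumption, as is shuffling the batches; the tensors are random placeholders rather than real keystroke data.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(42)  # Seed: 42

# Random stand-in for scaled training windows: (N, window=5, F=10).
X = torch.randn(512, 5, 10)
y = torch.randint(0, 2, (512,))
loader = DataLoader(TensorDataset(X, y), batch_size=256, shuffle=True)

lstm = nn.LSTM(input_size=10, hidden_size=64, num_layers=1, batch_first=True)
head = nn.Linear(64, 2)
params = list(lstm.parameters()) + list(head.parameters())

optimizer = torch.optim.Adam(params, lr=1e-3)
criterion = nn.CrossEntropyLoss()  # assumption: the loss is not stated in the card

losses = []
for epoch in range(30):
    total = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        _, (h_n, _) = lstm(xb)               # h_n: (num_layers, batch, hidden)
        loss = criterion(head(h_n[-1]), yb)  # logits from the final hidden state
        loss.backward()
        optimizer.step()
        total += loss.item() * xb.size(0)
    losses.append(total / len(X))            # mean training loss for this epoch
```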

## Files
- `model.safetensors`: model weights
- `config.json`: architecture + feature metadata
- `scaler.joblib`: fitted StandardScaler
- `metrics.json`: classification report + confusion matrix
- `inference.py`: minimal loading + prediction example

## Usage (minimal)
```python
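# inference.py is one of the files in this repository (see Files).
from inference import load_model_and_scaler, predict_df

model, scaler, cfg = load_model_and_scaler("NourFakih/LSTM-5win-Keystrokes")
# df is supplied by the caller and must contain the columns in cfg["feature_cols"].
y_pred = predict_df(df, model, scaler, cfg)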
```