---
license: mit
library_name: pytorch
tags:
- keystroke-dynamics
- lstm
- cybersecurity
- hid
---

# LSTM-16win-Keystrokes

## Summary
This repository contains a PyTorch **LSTM** classifier for **Human vs HID** keystroke control detection using **windowed** timing features. The label for each window is the **last keystroke** label in that window.

## Training setup (as implemented)
- **Window size:** 16
- **Stride:** 1
- **Label mapping:** {"human": 0, "hid": 1}
- **Window label:** last-keystroke in the window
- **Features (F=10):**
  - `duration`
  - `DD_time`
  - `DU_time`
  - `UD_time`
  - `UU_time`
  - `run_avg_duration`
  - `run_avg_DD`
  - `run_avg_DU`
  - `run_avg_UD`
  - `run_avg_UU`

### Preprocessing
- Windows built with NumPy `sliding_window_view`.
- Standardization via `StandardScaler` fitted on **training windows only**, across all timesteps and samples.

### Model
- `torch.nn.LSTM` (unidirectional, batch_first)
- Hidden size: 64
- Num layers: 1
- Dropout: 0.0
- Head: Linear(hidden_size → 2)

### Optimization
- Optimizer: Adam
- LR: 0.001
- Batch size: 256
- Epochs: 30
- Seed: 42

## Files
- `model.safetensors`: model weights
- `config.json`: architecture + feature metadata
- `scaler.joblib`: fitted StandardScaler
- `metrics.json`: classification report + confusion matrix
- `inference.py`: minimal loading + prediction example

## Usage (minimal)
```python
from inference import load_model_and_scaler, predict_df

model, scaler, cfg = load_model_and_scaler("NourFakih/LSTM-16win-Keystrokes")
y_pred = predict_df(df, model, scaler, cfg)  # df must contain cfg["feature_cols"]
```
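
## Preprocessing sketch (illustrative)
The windowing and standardization described under the training setup can be sketched roughly as follows. This is an illustration on synthetic data, not the repository's training code. The helper names `make_windows`, `fit_standardizer`, and `standardize` are hypothetical, the 60-window train split is arbitrary, and the `(windows, timesteps, features)` axis order is assumed to match the `batch_first` LSTM input. The repository uses `StandardScaler`; here the same per-feature mean and population standard deviation are computed with plain NumPy to keep the example dependency-light.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

WINDOW = 16      # window size from the training setup
N_FEATURES = 10  # F=10 timing features


def make_windows(features, labels, window=WINDOW):
    """Build stride-1 windows; each window takes its last keystroke's label."""
    # Sliding over axis 0 yields shape (N - window + 1, F, window);
    # swap the last two axes to get (num_windows, window, F).
    X = sliding_window_view(features, window_shape=window, axis=0)
    X = np.swapaxes(X, 1, 2)
    # The window starting at index i ends at keystroke i + window - 1.
    y = labels[window - 1:]
    return X, y


def fit_standardizer(X_train):
    """Per-feature mean/std pooled over all training windows and timesteps."""
    flat = X_train.reshape(-1, X_train.shape[-1])
    mean = flat.mean(axis=0)
    std = flat.std(axis=0)  # population std (ddof=0), as StandardScaler uses
    std[std == 0.0] = 1.0   # leave constant features unscaled
    return mean, std


def standardize(X, mean, std):
    """Apply the fitted statistics; broadcasts over windows and timesteps."""
    return (X - mean) / std


# Synthetic stand-in data (assumption): 100 keystrokes, binary labels.
rng = np.random.default_rng(42)
features = rng.normal(size=(100, N_FEATURES))
labels = rng.integers(0, 2, size=100)

X, y = make_windows(features, labels)
X_train = X[:60]                         # arbitrary illustrative split
mean, std = fit_standardizer(X_train)    # statistics from training windows only
X_scaled = standardize(X, mean, std)
print(X.shape, y.shape)
```

With 100 keystrokes and a window of 16 at stride 1, this yields 85 windows, and the first window's label is the label of keystroke index 15.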