---
title: Hand Gesture Recognition
emoji: 🖐️
colorFrom: blue
colorTo: green
library_name: tensorflow
license: mit
tags:
- computer-vision
- gesture-recognition
- lstm
- mediapipe
- hand-tracking
- video-classification
- tensorflow
- keras
- deep-learning
---
# Model Card: Hand Gesture Recognition LSTM
## Model Description
This model performs real-time hand gesture recognition using LSTM neural networks and MediaPipe hand pose estimation.
### Model Details
- **Developed by:** Abdul Ahad
- **Model type:** LSTM Sequential Neural Network
- **Framework:** TensorFlow/Keras
- **License:** MIT
- **Model Architecture:** 3-layer LSTM with dense output layers
## Intended Use
### Primary Use Cases
- Real-time hand gesture recognition from webcam feeds
- Human-computer interaction applications
- Sign language recognition systems
- Gesture-controlled interfaces
### Out-of-Scope Uses
- Medical diagnosis
- Security/authentication systems (not designed for this purpose)
- Applications requiring 100% accuracy in critical scenarios
## Training Data
- **Dataset:** LeapGestRecog (gti-upm/leapgestrecog from Kaggle)
- **Structure:** 10 subjects × 10 gestures × multiple video sequences
- **Format:** 100 frames per gesture sequence (PNG images)
- **Preprocessing:** MediaPipe hand landmark extraction (21 landmarks × 3 coordinates = 63 features)
- **Augmentation:** Random noise, occlusion, scaling, and translation (3× data size)
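The preprocessing and augmentation steps listed above can be sketched with NumPy alone. This is an illustration under assumptions, not the author's pipeline: the function names (`flatten_landmarks`, `augment_sequence`, `expand_dataset`), the noise, scale, shift, and occlusion magnitudes, and the reading of "3× data size" as "original plus two augmented copies" are all hypothetical.

```python
import numpy as np

NUM_LANDMARKS = 21  # hand landmarks per frame (from the card)
COORDS = 3          # x, y, z per landmark


def flatten_landmarks(landmarks):
    """Flatten one frame of (21, 3) landmarks into a 63-value feature vector."""
    arr = np.asarray(landmarks, dtype=np.float32)
    if arr.shape != (NUM_LANDMARKS, COORDS):
        raise ValueError(f"expected shape (21, 3), got {arr.shape}")
    return arr.reshape(-1)


def augment_sequence(seq, rng, noise_std=0.01, scale_range=(0.9, 1.1),
                     shift_range=0.05, occlusion_prob=0.1):
    """Return an augmented copy of a (T, 63) landmark sequence.

    All magnitudes are illustrative defaults, not values from the card.
    """
    seq = np.asarray(seq, dtype=np.float32)
    frames = seq.reshape(len(seq), NUM_LANDMARKS, COORDS).copy()
    # Scaling: one random factor for the whole sequence
    frames *= rng.uniform(*scale_range)
    # Translation: one random offset per coordinate axis
    frames += rng.uniform(-shift_range, shift_range, size=(1, 1, COORDS))
    # Random noise: small Gaussian jitter on every coordinate
    frames += rng.normal(0.0, noise_std, size=frames.shape)
    # Occlusion (assumed form): zero out randomly chosen landmarks in all frames
    occluded = rng.random(NUM_LANDMARKS) < occlusion_prob
    frames[:, occluded, :] = 0.0
    return frames.reshape(len(seq), -1)


def expand_dataset(sequences, rng, copies=2):
    """Stack the originals with `copies` augmented versions (3x when copies=2)."""
    sequences = np.asarray(sequences, dtype=np.float32)
    augmented = [
        np.stack([augment_sequence(s, rng) for s in sequences])
        for _ in range(copies)
    ]
    return np.concatenate([sequences] + augmented, axis=0)
```

Passing a seeded generator such as `np.random.default_rng(0)` keeps the augmentation reproducible between runs.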
## Model Architecture
```
Input Shape: (30, 63) - 30 frames × 63 features
Layer 1: LSTM(128, return_sequences=True)
BatchNormalization + Dropout(0.3)
Layer 2: LSTM(128, return_sequences=True)
BatchNormalization + Dropout(0.3)
Layer 3: LSTM(64)
BatchNormalization + Dropout(0.3)
Layer 4: Dense(256, activation='relu')
BatchNormalization + Dropout(0.3)
Layer 5: Dense(128, activation='relu')
BatchNormalization + Dropout(0.3)
Output: Dense(10, activation='softmax')
```
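The layer listing above maps fairly directly onto a Keras `Sequential` model. The following is a minimal sketch that follows the listed layers and sizes; the function name `build_model` is hypothetical, and details the card does not state (weight initialisation, recurrent dropout, regularisation) are left at Keras defaults, so it may differ from the released model.

```python
import tensorflow as tf


def build_model(sequence_length=30, num_features=63, num_classes=10, dropout=0.3):
    """Build an LSTM classifier following the layer listing in this card."""
    layers = tf.keras.layers
    return tf.keras.Sequential([
        tf.keras.Input(shape=(sequence_length, num_features)),
        # Layers 1-2: stacked LSTMs that pass the full sequence onward
        layers.LSTM(128, return_sequences=True),
        layers.BatchNormalization(),
        layers.Dropout(dropout),
        layers.LSTM(128, return_sequences=True),
        layers.BatchNormalization(),
        layers.Dropout(dropout),
        # Layer 3: final LSTM returns only its last output
        layers.LSTM(64),
        layers.BatchNormalization(),
        layers.Dropout(dropout),
        # Layers 4-5: dense classification head
        layers.Dense(256, activation="relu"),
        layers.BatchNormalization(),
        layers.Dropout(dropout),
        layers.Dense(128, activation="relu"),
        layers.BatchNormalization(),
        layers.Dropout(dropout),
        # Output: one probability per gesture class
        layers.Dense(num_classes, activation="softmax"),
    ])
```

Calling `build_model().summary()` prints the per-layer output shapes, which can be compared against the listing.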
## Training Procedure
### Hyperparameters
- **Sequence Length:** 30 frames
- **LSTM Units:** 128 → 128 → 64
- **Dense Units:** 256 → 128
- **Dropout Rate:** 0.3
- **Batch Size:** 32
- **Initial Learning Rate:** 0.001
- **Optimizer:** Adam, with a ReduceLROnPlateau learning-rate callback
- **Loss Function:** Categorical Crossentropy
- **Epochs:** Up to 100 (with EarlyStopping)
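As a rough illustration, the hyperparameters above could be wired into Keras as follows. The learning rate, batch size, and epoch cap come from the list; the helper name `build_training_config`, the monitored quantity (`val_loss`), and the `factor`, `patience`, and `min_lr` values are assumptions, since the card does not state them.

```python
import tensorflow as tf


def build_training_config(initial_lr=0.001):
    """Return an Adam optimizer and fit() arguments matching the listed settings."""
    optimizer = tf.keras.optimizers.Adam(learning_rate=initial_lr)
    callbacks = [
        # Lower the learning rate when validation loss stops improving
        # (factor, patience, and min_lr are assumed values)
        tf.keras.callbacks.ReduceLROnPlateau(
            monitor="val_loss", factor=0.5, patience=5, min_lr=1e-6
        ),
        # Stop before the 100-epoch cap once validation loss plateaus
        # (patience is an assumed value)
        tf.keras.callbacks.EarlyStopping(
            monitor="val_loss", patience=10, restore_best_weights=True
        ),
    ]
    fit_kwargs = {"epochs": 100, "batch_size": 32, "callbacks": callbacks}
    return optimizer, fit_kwargs


# Typical use, given a Keras `model` and one-hot encoded labels:
#   optimizer, fit_kwargs = build_training_config()
#   model.compile(optimizer=optimizer, loss="categorical_crossentropy",
#                 metrics=["accuracy"])
#   model.fit(X_train, y_train, validation_data=(X_val, y_val), **fit_kwargs)
```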
### Data Split
- **Training:** 64%
- **Validation:** 16%
- **Test:** 20%
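A 64/16/20 split is what results from holding out 20% for testing and then 20% of the remainder for validation (0.8 × 0.8 = 0.64 and 0.8 × 0.2 = 0.16). The sketch below reproduces those proportions with a shuffled index split; whether the original training code used this two-stage procedure, and the helper name `split_indices` and seed, are assumptions.

```python
import numpy as np


def split_indices(n_samples, test_frac=0.20, val_frac_of_rest=0.20, seed=42):
    """Shuffle sample indices and split them into train/validation/test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    # Stage 1: hold out the test set
    n_test = int(round(n_samples * test_frac))
    # Stage 2: carve the validation set out of what remains
    n_val = int(round((n_samples - n_test) * val_frac_of_rest))
    test_idx = order[:n_test]
    val_idx = order[n_test:n_test + n_val]
    train_idx = order[n_test + n_val:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 640 160 200
```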
## Performance
The model is reported to achieve high accuracy on the held-out LeapGestRecog test split (20% of the data); exact figures are not listed in this card. The evaluation covers:
- Overall accuracy
- Per-gesture precision, recall, and F1-score
- Confusion matrix analysis
See the technical report for detailed performance metrics.
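For readers who want to reproduce these metrics on their own predictions, the sketch below computes overall accuracy and per-class precision, recall, and F1 from a confusion matrix using NumPy. The function names are hypothetical and this is not the evaluation script behind the technical report.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count samples per (true class, predicted class) pair."""
    y_true = np.asarray(y_true, dtype=np.int64)
    y_pred = np.asarray(y_pred, dtype=np.int64)
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)  # rows: true class, columns: predicted
    return cm


def per_class_metrics(cm):
    """Return overall accuracy plus per-class precision, recall, and F1."""
    tp = np.diag(cm).astype(np.float64)
    predicted = cm.sum(axis=0)  # samples predicted as each class
    actual = cm.sum(axis=1)     # samples that truly belong to each class
    # Classes with no predictions or no samples get a score of 0
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(tp), where=denom > 0)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision, recall, f1
```

With a trained model, `y_pred` would typically be `np.argmax(model.predict(X_test), axis=1)` and `y_true` the integer class labels of the test set.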
## Limitations
1. **Lighting Conditions:** Performance may degrade in poor lighting
2. **Hand Visibility:** Requires clear view of hand landmarks
3. **Background Complexity:** May struggle with cluttered backgrounds
4. **Single Hand:** Designed for single-hand gestures
5. **Dataset Bias:** Trained on specific gesture types from LeapGestRecog
## How to Use
### Installation
```bash
uv pip install tensorflow mediapipe opencv-python numpy huggingface_hub
```
### Inference
```bash
# Download the model from the Hub and run inference
uv run python inference.py --repo a-01a/hand-gesture-recognition
```
Or programmatically:
```python
import json

import tensorflow as tf
from huggingface_hub import hf_hub_download

# Download the trained model and the label mapping from the Hub
model_path = hf_hub_download(
    repo_id="a-01a/hand-gesture-recognition",
    filename="hand_gesture_lstm_model.h5",
)
mapping_path = hf_hub_download(
    repo_id="a-01a/hand-gesture-recognition",
    filename="gesture_mapping.json",
)

# Load the Keras model and the gesture label mapping
model = tf.keras.models.load_model(model_path)
with open(mapping_path, "r") as f:
    gesture_mapping = json.load(f)
```
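Because the model expects 30 frames of 63 features, real-time use needs a rolling window over the most recent frames. The sketch below shows one way to buffer per-frame feature vectors and decode a softmax output into a label. `FrameBuffer` and `decode_prediction` are hypothetical helpers, and the layout of `gesture_mapping.json` is assumed here to map class indices (as strings) to gesture names; check the actual file before relying on it.

```python
from collections import deque

import numpy as np

SEQUENCE_LENGTH = 30  # frames per input window (from the card)
NUM_FEATURES = 63     # 21 landmarks x 3 coordinates


class FrameBuffer:
    """Keep the most recent frames and expose them as a model-ready batch."""

    def __init__(self, length=SEQUENCE_LENGTH):
        self.frames = deque(maxlen=length)  # oldest frame drops out automatically

    def push(self, features):
        features = np.asarray(features, dtype=np.float32)
        if features.shape != (NUM_FEATURES,):
            raise ValueError(f"expected {NUM_FEATURES} features, got {features.shape}")
        self.frames.append(features)

    def ready(self):
        return len(self.frames) == self.frames.maxlen

    def as_batch(self):
        # Shape (1, 30, 63): a batch containing a single sequence
        return np.stack(list(self.frames))[np.newaxis, ...]


def decode_prediction(probabilities, index_to_label):
    """Return (label, confidence) for one softmax output vector.

    Assumes `index_to_label` maps string indices such as "0" to gesture names.
    """
    idx = int(np.argmax(probabilities))
    label = index_to_label.get(str(idx), f"class_{idx}")
    return label, float(probabilities[idx])


# Typical use with the `model` and `gesture_mapping` loaded above:
#   buffer.push(frame_features)
#   if buffer.ready():
#       probs = model.predict(buffer.as_batch(), verbose=0)[0]
#       label, confidence = decode_prediction(probs, gesture_mapping)
```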
## Citation
```bibtex
@misc{hand_gesture_lstm_2025,
title={Hand Gesture Recognition using LSTM and MediaPipe},
author={Abdul Ahad},
year={2025},
howpublished={https://huggingface.co/a-01a/hand-gesture-recognition},
note={Real-time hand gesture recognition system using MediaPipe and LSTM networks}
}
```