---
title: Hand Gesture Recognition
emoji: 🖐️
colorFrom: blue
colorTo: green
library_name: tensorflow
license: mit
tags:
- computer-vision
- gesture-recognition
- lstm
- mediapipe
- hand-tracking
- video-classification
- tensorflow
- keras
- deep-learning
---

# Model Card: Hand Gesture Recognition LSTM

## Model Description

This model performs real-time hand gesture recognition using LSTM neural networks and MediaPipe hand pose estimation.

### Model Details

- **Developed by:** Abdul Ahad
- **Model type:** LSTM Sequential Neural Network
- **Framework:** TensorFlow/Keras
- **License:** MIT
- **Model Architecture:** 3-layer LSTM with dense output layers

## Intended Use

### Primary Use Cases

- Real-time hand gesture recognition from webcam feeds
- Human-computer interaction applications
- Sign language recognition systems
- Gesture-controlled interfaces

### Out-of-Scope Uses

- Medical diagnosis
- Security/authentication systems (not designed for this purpose)
- Applications requiring 100% accuracy in critical scenarios

## Training Data

- **Dataset:** LeapGestRecog (gti-upm/leapgestrecog from Kaggle)
- **Structure:** 10 subjects × 10 gestures × multiple video sequences
- **Format:** 100 frames per gesture sequence (PNG images)
- **Preprocessing:** MediaPipe hand landmark extraction (21 landmarks × 3 coordinates = 63 features); see the sketch after this list
- **Augmentation:** Random noise, occlusion, scaling, and translation (3× data size)
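
The per-frame feature vector could be extracted roughly as follows. This is a minimal sketch using MediaPipe's `solutions.hands` API; the helper function name and the image path are illustrative, not taken from the repository.

```python
import cv2
import mediapipe as mp
import numpy as np

mp_hands = mp.solutions.hands

def extract_landmarks(frame_bgr, hands):
    """Return a flat (63,) vector of x, y, z for 21 hand landmarks (zeros if no hand is detected)."""
    result = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if result.multi_hand_landmarks:
        lm = result.multi_hand_landmarks[0].landmark
        return np.array([[p.x, p.y, p.z] for p in lm], dtype=np.float32).flatten()
    return np.zeros(63, dtype=np.float32)

with mp_hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
    frame = cv2.imread("frame_0001.png")        # one frame of a gesture sequence (illustrative path)
    features = extract_landmarks(frame, hands)  # shape: (63,)
```

Thirty consecutive feature vectors are then stacked into the (30, 63) sequences the model consumes.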

## Model Architecture

```
Input Shape: (30, 63) - 30 frames × 63 features

Layer 1: LSTM(128, return_sequences=True)
         BatchNormalization + Dropout(0.3)

Layer 2: LSTM(128, return_sequences=True)
         BatchNormalization + Dropout(0.3)

Layer 3: LSTM(64)
         BatchNormalization + Dropout(0.3)

Layer 4: Dense(256, activation='relu')
         BatchNormalization + Dropout(0.3)

Layer 5: Dense(128, activation='relu')
         BatchNormalization + Dropout(0.3)

Output:  Dense(10, activation='softmax')
```
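
For reference, the stack above could be expressed in Keras roughly as follows. This is a sketch reconstructed from the diagram, not the exact training script; any arguments beyond those listed are assumptions.

```python
from tensorflow.keras import layers, models

def build_model(seq_len=30, n_features=63, n_classes=10):
    """Sketch of the LSTM classifier described in the diagram above."""
    return models.Sequential([
        layers.Input(shape=(seq_len, n_features)),
        layers.LSTM(128, return_sequences=True),
        layers.BatchNormalization(), layers.Dropout(0.3),
        layers.LSTM(128, return_sequences=True),
        layers.BatchNormalization(), layers.Dropout(0.3),
        layers.LSTM(64),
        layers.BatchNormalization(), layers.Dropout(0.3),
        layers.Dense(256, activation='relu'),
        layers.BatchNormalization(), layers.Dropout(0.3),
        layers.Dense(128, activation='relu'),
        layers.BatchNormalization(), layers.Dropout(0.3),
        layers.Dense(n_classes, activation='softmax'),
    ])
```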

## Training Procedure

### Hyperparameters

- **Sequence Length:** 30 frames
- **LSTM Units:** 128 → 128 → 64
- **Dense Units:** 256 → 128
- **Dropout Rate:** 0.3
- **Batch Size:** 32
- **Initial Learning Rate:** 0.001
- **Optimizer:** Adam with ReduceLROnPlateau
- **Loss Function:** Categorical Crossentropy
- **Epochs:** Up to 100 (with EarlyStopping)
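
Assuming standard Keras callbacks, these settings translate roughly into the training call below; the patience/factor values and the variable names are assumptions, not taken from the original script.

```python
import tensorflow as tf

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss='categorical_crossentropy',
    metrics=['accuracy'],
)

callbacks = [
    # Patience and factor values are assumptions; only the callback types are documented above.
    tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5),
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True),
]

history = model.fit(
    X_train, y_train,                  # (N, 30, 63) sequences and one-hot (N, 10) labels
    validation_data=(X_val, y_val),
    batch_size=32,
    epochs=100,
    callbacks=callbacks,
)
```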

### Data Split

- **Training:** 64%
- **Validation:** 16%
- **Test:** 20%
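
One way to arrive at these proportions is a two-stage split: hold out 20% for test, then take 20% of the remainder (16% of the total) for validation. A sketch with scikit-learn, assuming `X` and `y` hold the sequences and one-hot labels; the stratification and seed are assumptions.

```python
from sklearn.model_selection import train_test_split

# 80/20 split for test, then 80/20 of the remainder for validation -> 64/16/20 overall.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y.argmax(axis=1), random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.20, stratify=y_trainval.argmax(axis=1), random_state=42)
```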

## Performance

The model achieves high accuracy on the held-out LeapGestRecog test split. Reported performance metrics include:

- Overall accuracy
- Per-gesture precision, recall, and F1-score
- Confusion matrix analysis

See the technical report for detailed performance metrics.
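
These metrics can be recomputed on the test split with scikit-learn; the snippet below is a sketch that assumes `X_test`/`y_test` from the split above and one-hot labels.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

y_prob = model.predict(X_test)                 # (N, 10) softmax probabilities
y_pred = y_prob.argmax(axis=1)
y_true = y_test.argmax(axis=1)                 # labels assumed one-hot encoded

print("Accuracy:", accuracy_score(y_true, y_pred))
print(classification_report(y_true, y_pred))   # per-gesture precision, recall, F1
print(confusion_matrix(y_true, y_pred))
```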

## Limitations

1. **Lighting Conditions:** Performance may degrade in poor lighting
2. **Hand Visibility:** Requires clear view of hand landmarks
3. **Background Complexity:** May struggle with cluttered backgrounds
4. **Single Hand:** Designed for single-hand gestures
5. **Dataset Bias:** Trained on specific gesture types from LeapGestRecog

## How to Use

### Installation

```bash
uv pip install tensorflow mediapipe opencv-python numpy huggingface_hub
```

### Inference

```bash
# Download and run inference
uv run python inference.py --repo a-01a/hand-gesture-recognition
```

Or programmatically:

```python
from huggingface_hub import hf_hub_download
import tensorflow as tf
import json

model_path = hf_hub_download(repo_id="a-01a/hand-gesture-recognition",
                             filename="hand_gesture_lstm_model.h5")
mapping_path = hf_hub_download(repo_id="a-01a/hand-gesture-recognition",
                               filename="gesture_mapping.json")

model = tf.keras.models.load_model(model_path)

with open(mapping_path, 'r') as f:
    gesture_mapping = json.load(f)
```
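
Once the model and mapping are loaded, a single preprocessed sequence can be classified as sketched below. The dummy input and the assumption that `gesture_mapping` maps class indices (as strings) to gesture names are illustrative.

```python
import numpy as np

# A real input would be 30 frames of 63 MediaPipe landmark features, batched to (1, 30, 63).
sequence = np.zeros((1, 30, 63), dtype=np.float32)   # placeholder; substitute real landmark features

probs = model.predict(sequence)[0]
predicted_idx = int(np.argmax(probs))
label = gesture_mapping.get(str(predicted_idx), predicted_idx)   # key format is an assumption
print(f"Predicted gesture: {label} ({probs[predicted_idx]:.2%} confidence)")
```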

## Citation

```bibtex
@misc{hand_gesture_lstm_2025,
  title={Hand Gesture Recognition using LSTM and MediaPipe},
  author={Abdul Ahad},
  year={2025},
  howpublished={https://huggingface.co/a-01a/hand-gesture-recognition},
  note={Real-time hand gesture recognition system using MediaPipe and LSTM networks}
}
```