---
title: Hand2Voice
emoji: 🤟
colorFrom: gray
colorTo: indigo
sdk: streamlit
pinned: false
short_description: Converting Hand Gestures into Speech using Computer Vision
---
# 🤟 Hand2Voice: AI Sign Language Assistant
Hand2Voice is an accessibility tool designed to bridge the communication gap for people with speech impairments. It uses computer vision to translate hand gestures into spoken audio in real time.

Unlike basic classifiers that rely on static screen coordinates, this project uses Euclidean geometry to calculate relative finger positions, ensuring accurate detection even when the hand is rotated or tilted.
## 🚀 Key Features
- 📷 Dual Input Modes: Supports real-time camera capture and image uploads.
- 🧠 Robust Recognition Logic: Uses 3D Euclidean distance calculations relative to the wrist, making detection rotation-invariant.
- 🦴 Skeletal Visualization: Real-time feedback overlay showing exactly what the computer vision model "sees."
- 🗣️ Text-to-Speech (TTS): Instantly vocalizes the detected gesture using Google TTS.
- 📜 JSON-Based Rule Engine: Gestures are defined in an external `gesture_rules.json` file, making it easy to add new signs without changing code (see the sketch below).
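The exact schema of `gesture_rules.json` isn't reproduced in this README, but a rule file of this kind typically maps each gesture name to a finger-state pattern. A minimal sketch (the layout here is illustrative, not necessarily the project's actual schema):

```json
{
  "HELLO": [1, 1, 1, 1, 1],
  "PEACE": [0, 1, 1, 0, 0]
}
```

Each array lists finger states from thumb to pinky (1 = extended), matching the binary arrays described in the technical deep dive below.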
## 🛠️ Tech Stack
- Frontend: Streamlit (Web Interface)
- Computer Vision: MediaPipe Hands (Google)
- Image Processing: OpenCV & NumPy
- Audio: gTTS (Google Text-to-Speech)
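To illustrate the audio layer, here is a minimal sketch of what a gTTS-based `tts.py` helper could look like. The function name `speak` and its signature are assumptions for illustration, not the project's confirmed API:

```python
from gtts import gTTS  # Google Text-to-Speech


def speak(text: str, path: str = "speech.mp3") -> str:
    """Render `text` to an MP3 file and return its path."""
    gTTS(text=text, lang="en").save(path)  # synthesize speech and write MP3
    return path
```

In a Streamlit app, the returned file can then be played back with `st.audio(path)`.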
## 📁 Project Structure
```
Hand2Voice/
├── app.py                 # Main Streamlit application (UI & logic)
├── gesture_classifier.py  # Advanced logic using Euclidean distance
├── gesture_rules.json     # Database of supported gestures
├── tts.py                 # Text-to-speech helper function
├── requirements.txt       # Python dependencies
├── NIELIT-LOGO.png        # Institution logo
└── README.md              # Project documentation
```
## 💿 Installation & Setup
- Clone the Repository
  ```bash
  git clone https://github.com/imarshbir/Hand2Voice.git
  cd Hand2Voice
  ```
- Create a Virtual Environment (Optional but Recommended)
  ```bash
  python -m venv venv
  # Windows
  venv\Scripts\activate
  # Mac/Linux
  source venv/bin/activate
  ```
- Install Dependencies
  ```bash
  pip install -r requirements.txt
  ```
- Run the Application
  ```bash
  streamlit run app.py
  ```
## 🤟 Supported Gestures
The current version supports the following gestures (defined in `gesture_rules.json`):
| Gesture | Description |
|---|---|
| HELLO | Open palm (All fingers extended) |
| YES / POINT | Index finger raised |
| NO | Closed fist |
| PEACE | Index & Middle fingers raised (V-sign) |
| OK | Thumb & Index touching, others extended |
| ROCK ON | Index & Pinky extended |
| THUMBS UP | Thumb extended only |
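For readers wondering how these descriptions translate into machine-checkable rules, the table implies finger-state patterns like the following. This is a hypothetical Python rendering; the authoritative definitions live in `gesture_rules.json` and may differ:

```python
# Order: thumb, index, middle, ring, pinky; 1 = extended, 0 = closed.
PATTERNS = {
    "HELLO":     [1, 1, 1, 1, 1],  # open palm
    "YES/POINT": [0, 1, 0, 0, 0],  # index raised
    "NO":        [0, 0, 0, 0, 0],  # closed fist
    "PEACE":     [0, 1, 1, 0, 0],  # V-sign
    "ROCK ON":   [0, 1, 0, 0, 1],  # index & pinky
    "THUMBS UP": [1, 0, 0, 0, 0],  # thumb only
}
# "OK" presumably needs an extra thumb-to-index-tip distance check,
# since "touching" cannot be expressed as a pure open/closed pattern.
```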
## 🔬 How It Works (Technical Deep Dive)
Most basic hand-gesture tutorials use simple `y_tip < y_knuckle` logic. This fails if the hand is tilted sideways.

Hand2Voice solves this by using vector math:
- Landmark Extraction: MediaPipe extracts 21 3D landmarks `(x, y, z)` for the hand.
- Distance Calculation: We calculate the Euclidean distance $d(a, b) = \sqrt{(a_x - b_x)^2 + (a_y - b_y)^2 + (a_z - b_z)^2}$ between each fingertip and the wrist (Landmark 0).
- State Determination:
  - If `dist(tip, wrist) > dist(joint, wrist)` (the tip sits farther from the wrist than the finger's middle joint), the finger is considered OPEN.
  - Otherwise, it is CLOSED.
- Pattern Matching: The resulting binary array (e.g., `[0, 1, 1, 0, 0]`) is compared against the definitions in `gesture_rules.json`, as shown in the sketch below.
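Putting the steps together, here is a minimal, self-contained sketch of this pipeline. It illustrates the technique rather than reproducing the project's actual `gesture_classifier.py`: landmark indices follow MediaPipe's hand model (wrist = 0, fingertips = 4, 8, 12, 16, 20), and the choice of reference joints is an assumption consistent with the logic above.

```python
import math

import cv2
import mediapipe as mp

TIPS = [4, 8, 12, 16, 20]    # fingertip landmarks (thumb..pinky)
JOINTS = [2, 6, 10, 14, 18]  # reference joints: thumb MCP, finger PIPs


def dist(a, b):
    """3D Euclidean distance between two MediaPipe landmarks."""
    return math.sqrt((a.x - b.x) ** 2 + (a.y - b.y) ** 2 + (a.z - b.z) ** 2)


def finger_states(image_bgr):
    """Return a [thumb..pinky] binary open/closed array, or None if no hand."""
    with mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
        # MediaPipe expects RGB input; OpenCV loads images as BGR.
        results = hands.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if not results.multi_hand_landmarks:
        return None
    lm = results.multi_hand_landmarks[0].landmark
    wrist = lm[0]
    # A finger is OPEN when its tip is farther from the wrist than its joint.
    return [
        1 if dist(lm[t], wrist) > dist(lm[j], wrist) else 0
        for t, j in zip(TIPS, JOINTS)
    ]
```

The resulting array can then be matched against the patterns loaded from `gesture_rules.json` with a simple equality check.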
## 🔮 Future Scope
- Real-time Video Stream: Integration with `streamlit-webrtc` for continuous streaming without clicking "capture."
- Dynamic Gestures: Support for moving gestures (like waving) using LSTM networks.
- Multi-Language Support: Adding Hindi/Punjabi TTS output.
## 👨‍💻 Author
Arshbir Singh
@imarshbir
Expertise: AI, Computer Vision, IoT
B.Tech (CSE) Researcher