---
license: mit
library_name: pytorch
tags:
- sign-language-recognition
- transformer
- mediapipe
- skeleton-based-action-recognition
datasets:
- luciayen/CASL-W60-Landmarks
metrics:
- accuracy
pipeline_tag: video-classification
---

# 🤟 CASL-TransSLR: Robust Sign Language Transformer

**SignVLM-v4 Champion Model**

This repository contains the state-of-the-art Transformer architecture for the **CASL (Chinese-American Sign Language) Research Project**. This version (v4) is optimized for **signer independence**: it is designed to recognize signs performed by people the model has never seen during training.

## 📊 Performance Metrics (Unseen Signers)

Evaluated on **862 files** from independent signers:

| Metric | Value |
| :--- | :--- |
| **Overall Accuracy** | **80.39%** |
| **Weighted F1-Score** | **78.33%** |
| **Classes** | 60 signs |

## 🏗️ Architecture Insight

The model uses a hybrid **Feature Extractor + Transformer Encoder** approach:

* **Feature Extractor:** a linear layer (225 → 512) followed by **temporal BatchNorm** (over the 64-frame axis) to normalize motion across time.
* **Transformer:** 4 layers of multi-head attention ($d_{model}=512$, $n_{head}=8$, $ff_{dim}=1024$).
* **Classifier:** a 2-layer MLP with dropout (0.5) for robust generalization.

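The stated dimensions can be assembled into a minimal sketch of such a network. This is an illustration of the description above, not the repository's actual `model.py`; the internal layer layout, pooling strategy, and classifier widths are assumptions.

```python
import torch
import torch.nn as nn

class SignTransformerSketch(nn.Module):
    """Illustrative sketch of the described architecture (assumed layout)."""
    def __init__(self, in_dim=225, d_model=512, n_head=8, ff_dim=1024,
                 n_layers=4, n_classes=60, dropout=0.5):
        super().__init__()
        self.proj = nn.Linear(in_dim, d_model)           # 225 -> 512
        # Temporal BatchNorm: normalizes each feature channel across time
        self.norm = nn.BatchNorm1d(d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_head, ff_dim,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # 2-layer MLP head with dropout for regularization
        self.head = nn.Sequential(
            nn.Linear(d_model, d_model // 2), nn.ReLU(),
            nn.Dropout(dropout), nn.Linear(d_model // 2, n_classes),
        )

    def forward(self, x):                  # x: (Batch, 64, 225)
        x = self.proj(x)                   # (B, 64, 512)
        x = self.norm(x.transpose(1, 2)).transpose(1, 2)  # norm over frames
        x = self.encoder(x)                # (B, 64, 512)
        return self.head(x.mean(dim=1))    # mean-pool frames -> (B, 60) logits

logits = SignTransformerSketch()(torch.randn(2, 64, 225))
print(logits.shape)  # torch.Size([2, 60])
```

Mean-pooling over the frame axis before the classifier is one common choice for clip-level recognition; the released model may aggregate differently.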
## ⚙️ Pre-processing Requirements

**IMPORTANT:** This model expects normalized landmarks. If you pass raw MediaPipe coordinates, accuracy will drop significantly.

1. **Centering:** Translate all points relative to the **mid-hip** (point 0).
2. **Scaling:** Normalize by the **shoulder-to-shoulder** distance to account for different body types.
3. **Shape:** The input must be a tensor of shape `(Batch, 64, 225)`.

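The three steps above can be sketched as follows. The shoulder indices (`11`, `12`) are hypothetical placeholders for this 75-point layout; only the mid-hip at point 0 is stated by the card, so check the repository's preprocessing for the exact indices.

```python
import numpy as np

# Mid-hip index is given by the card; shoulder indices are assumptions.
MID_HIP, L_SHOULDER, R_SHOULDER = 0, 11, 12

def normalize_clip(landmarks: np.ndarray) -> np.ndarray:
    """landmarks: (64, 75, 3) raw coordinates -> (64, 225) normalized features."""
    # 1. Centering: translate every point relative to the mid-hip
    centered = landmarks - landmarks[:, MID_HIP:MID_HIP + 1, :]
    # 2. Scaling: divide by the shoulder-to-shoulder distance per frame
    scale = np.linalg.norm(
        centered[:, L_SHOULDER, :] - centered[:, R_SHOULDER, :],
        axis=-1, keepdims=True,
    )
    scaled = centered / np.maximum(scale[:, :, None], 1e-6)  # guard zero scale
    # 3. Shape: flatten 75 points x 3 coords into 225 features per frame
    return scaled.reshape(64, 225)

clip = np.random.rand(64, 75, 3).astype(np.float32)
features = normalize_clip(clip)
print(features.shape)  # (64, 225)
```

Stack 64-frame clips along a leading batch axis to obtain the required `(Batch, 64, 225)` input tensor.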
## 🚀 How to Load and Use

```python
import importlib.util

import torch
from huggingface_hub import hf_hub_download

# 1. Download the weights and the architecture definition
repo_id = "luciayen/CASL-TransSLR"
model_bin = hf_hub_download(repo_id=repo_id, filename="pytorch_model.bin")
model_script = hf_hub_download(repo_id=repo_id, filename="model.py")

# 2. Import the architecture from the downloaded model.py
spec = importlib.util.spec_from_file_location("model_arch", model_script)
model_arch = importlib.util.module_from_spec(spec)
spec.loader.exec_module(model_arch)

# 3. Initialize the model, load the weights, and switch to inference mode
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model_arch.SignVLM().to(device)
model.load_state_dict(torch.load(model_bin, map_location=device))
model.eval()
```