---
license: mit
library_name: pytorch
tags:
- sign-language-recognition
- transformer
- mediapipe
- skeleton-based-action-recognition
datasets:
- luciayen/CASL-W60-Landmarks
metrics:
- accuracy
pipeline_tag: video-classification
---

# 🤟 CASL-TransSLR: Robust Sign Language Transformer

**SignVLM-v4 Champion Model**

This repository contains the state-of-the-art Transformer architecture for the **CASL (Chinese-American Sign Language) Research Project**. This specific version (v4) is optimized for **Signer Independence**, meaning it is designed to recognize signs from people the model has never seen before.

## 📊 Performance Metrics (Unseen Signers)

Evaluated on **862 files** from independent signers:

| Metric | Value |
| :--- | :--- |
| **Overall Accuracy** | **80.39%** |
| **Weighted F1-Score** | **78.33%** |
| **Classes** | 60 Signs |

## 🏗️ Architecture Insight

The model uses a hybrid **Feature Extractor + Transformer Encoder** approach:

* **Feature Extractor:** A Linear layer (225 → 512) followed by **Temporal BatchNorm** (64 frames) to normalize motion across time.
* **Transformer:** 4 layers of Multi-Head Attention ($d_{model}=512$, $n_{head}=8$, $ff_{dim}=1024$).
* **Classifier:** A 2-layer MLP with Dropout (0.5) for robust generalization.

## ⚙️ Pre-processing Requirements

**IMPORTANT:** This model expects normalized landmarks. If you pass raw MediaPipe coordinates, accuracy will drop significantly.

1. **Centering:** Translate all points relative to the **Mid-Hip** (Point 0).
2. **Scaling:** Normalize by the **Shoulder-to-Shoulder** distance to account for different body types.
3. **Shape:** Input must be a tensor of shape `(Batch, 64, 225)`.

## 🚀 How to Load and Use

```python
import torch
from huggingface_hub import hf_hub_download
import importlib.util
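
# The three pre-processing steps above can be sketched as a helper.
# Minimal sketch only: the 75-landmark layout (75 x 3 = 225) and the
# shoulder indices (11, 12) are assumptions -- adapt them to the exact
# landmark ordering produced by your MediaPipe pipeline.
def normalize_landmarks(frames: torch.Tensor) -> torch.Tensor:
    """frames: (64, 75, 3) raw landmarks -> (64, 225) model-ready features."""
    mid_hip = frames[:, 0:1, :]                  # Point 0 (Mid-Hip), per frame
    centered = frames - mid_hip                  # 1. centering
    shoulders = centered[:, 11, :] - centered[:, 12, :]
    scale = torch.norm(shoulders, dim=-1).clamp(min=1e-6)
    scaled = centered / scale.view(-1, 1, 1)     # 2. scaling by shoulder width
    return scaled.reshape(frames.shape[0], -1)   # 3. flatten to (64, 225)
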
# 1. Download the weights and the architecture script
repo_id = "luciayen/CASL-TransSLR"
model_bin = hf_hub_download(repo_id=repo_id, filename="pytorch_model.bin")
model_script = hf_hub_download(repo_id=repo_id, filename="model.py")

# 2. Import the architecture from the downloaded script
spec = importlib.util.spec_from_file_location("model_arch", model_script)
model_arch = importlib.util.module_from_spec(spec)
spec.loader.exec_module(model_arch)

# 3. Initialize & load the weights
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model_arch.SignVLM().to(device)
model.load_state_dict(torch.load(model_bin, map_location=device))
model.eval()
```
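Once loaded, a forward pass on a `(Batch, 64, 225)` tensor yields logits over the 60 classes. A minimal inference sketch follows; the `predict_sign` helper name and the softmax/argmax readout are illustrative assumptions, not part of this repo:

```python
import torch

@torch.no_grad()
def predict_sign(model: torch.nn.Module, clip: torch.Tensor) -> tuple[int, float]:
    """clip: (64, 225) normalized landmarks -> (class index, confidence)."""
    x = clip.unsqueeze(0)                 # add batch dim -> (1, 64, 225)
    logits = model(x)                     # (1, 60) class scores
    probs = torch.softmax(logits, dim=-1)
    conf, idx = probs.max(dim=-1)
    return idx.item(), conf.item()
```

Remember to move `clip` to the same device as the model (`clip.to(device)`) before calling.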