---
license: mit
library_name: pytorch
tags:
- sign-language-recognition
- transformer
- mediapipe
- skeleton-based-action-recognition
datasets:
- luciayen/CASL-W60-Landmarks
metrics:
- accuracy
pipeline_tag: video-classification
---

# 🤟 CASL-TransSLR: Robust Sign Language Transformer

**SignVLM-v4 Champion Model**

This repository contains the Transformer model for the **CASL (Chinese-American Sign Language) Research Project**. This version (v4) is optimized for **signer independence**: it is designed to recognize signs performed by people who do not appear in the training data.

## 📊 Performance Metrics (Unseen Signers)
Evaluated on **862 files** from independent signers:

| Metric | Value |
| :--- | :--- |
| **Overall Accuracy** | **80.39%** |
| **Weighted F1-Score** | **78.33%** |
| **Classes** | 60 Signs |

## 🏗️ Architecture Insight
The model uses a hybrid **Feature Extractor + Transformer Encoder** approach:
* **Feature Extractor:** A Linear layer (225 → 512) followed by **Temporal BatchNorm** (64 frames) to normalize motion across time.
* **Transformer:** 4 encoder layers with Multi-Head Attention ($d_{model}=512$, $n_{head}=8$, $ff_{dim}=1024$).
* **Classifier:** A 2-layer MLP with Dropout (0.5) for robust generalization.
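
The sketch below illustrates how the three components listed above could fit together and how the tensor shapes flow through them. It is not the repository's `model.py`. The class name `SketchSLR`, the classifier hidden size (256), the ReLU activation, mean pooling over frames, and the absence of a positional encoding are all assumptions made for illustration; only the dimensions stated above come from this card.

```python
import torch
import torch.nn as nn


class SketchSLR(nn.Module):
    """Illustrative stand-in (hypothetical), NOT the repository's SignVLM class."""

    def __init__(self, in_dim=225, d_model=512, n_head=8, ff_dim=1024,
                 n_layers=4, n_frames=64, n_classes=60, hidden=256):
        super().__init__()
        # Feature extractor: per-frame projection 225 -> 512.
        self.proj = nn.Linear(in_dim, d_model)
        # "Temporal BatchNorm": BatchNorm1d treats dim 1 (the 64 frames) as channels.
        self.temporal_bn = nn.BatchNorm1d(n_frames)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_head, dim_feedforward=ff_dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # 2-layer MLP head with Dropout(0.5); hidden size and ReLU are assumptions.
        self.classifier = nn.Sequential(
            nn.Linear(d_model, hidden),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x):
        # x: (batch, 64, 225)
        h = self.temporal_bn(self.proj(x))   # (batch, 64, 512)
        h = self.encoder(h)                  # (batch, 64, 512)
        # Mean pooling over frames is an assumption; the real model may pool differently.
        return self.classifier(h.mean(dim=1))  # (batch, 60)
```

Because the layer names and pooling here are guesses, the published `pytorch_model.bin` weights should not be expected to load into this sketch; use the `SignVLM` class from `model.py` for that.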

## ⚙️ Pre-processing Requirements
**IMPORTANT:** This model expects landmarks to be normalized. If you pass raw MediaPipe coordinates, the accuracy will drop significantly.
1. **Centering:** Translate all points relative to the **Mid-Hip** (Point 0).
2. **Scaling:** Normalize by the **Shoulder-to-Shoulder** distance to account for different body types.
3. **Shape:** Input must be a tensor of shape `(Batch, 64, 225)`: 64 frames with 225 features per frame.
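
The three steps above might be sketched as follows for a single clip. This is a minimal illustration, not the project's actual pre-processing script. It assumes the 225 features are 75 landmarks with 3 coordinates each, that the reference point sits at index 0 as stated above, that the shoulders are at indices 11 and 12 (the MediaPipe Pose convention), and that scaling is applied per frame. The function name `normalize_clip` is hypothetical; adjust the indices to match your own landmark layout.

```python
import numpy as np

NUM_FRAMES = 64                   # from this card
NUM_POINTS = 75                   # assumption: 225 features = 75 landmarks x 3 coords
CENTER_IDX = 0                    # "Point 0" as stated in this card
L_SHOULDER, R_SHOULDER = 11, 12   # assumption: MediaPipe Pose shoulder indices


def normalize_clip(landmarks, eps=1e-6):
    """Center and scale one clip of shape (64, 75, 3); returns (64, 225) float32."""
    pts = np.asarray(landmarks, dtype=np.float32)
    if pts.shape != (NUM_FRAMES, NUM_POINTS, 3):
        raise ValueError(f"expected {(NUM_FRAMES, NUM_POINTS, 3)}, got {pts.shape}")

    # 1. Centering: subtract the reference point in every frame.
    centered = pts - pts[:, CENTER_IDX:CENTER_IDX + 1, :]

    # 2. Scaling: divide by the per-frame shoulder-to-shoulder distance
    #    (per-frame rather than per-clip scaling is an assumption).
    width = np.linalg.norm(pts[:, L_SHOULDER] - pts[:, R_SHOULDER], axis=-1)
    scaled = centered / np.maximum(width, eps)[:, None, None]

    # 3. Shape: flatten the landmarks of each frame into 225 features.
    return scaled.reshape(NUM_FRAMES, NUM_POINTS * 3)
```

Stacking several normalized clips with `np.stack` gives an array of shape `(Batch, 64, 225)`, which can then be converted with `torch.from_numpy` before being passed to the model.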

## 🚀 How to Load and Use

```python
import torch
from huggingface_hub import hf_hub_download
import importlib.util

# 1. Download files
repo_id = "luciayen/CASL-TransSLR"
model_bin = hf_hub_download(repo_id=repo_id, filename="pytorch_model.bin")
model_script = hf_hub_download(repo_id=repo_id, filename="model.py")

# 2. Import architecture
spec = importlib.util.spec_from_file_location("model_arch", model_script)
model_arch = importlib.util.module_from_spec(spec)
spec.loader.exec_module(model_arch)

# 3. Initialize & Load
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model_arch.SignVLM().to(device)
model.load_state_dict(torch.load(model_bin, map_location=device))
model.eval()  # inference mode: disables Dropout and uses BatchNorm running statistics
```