luciayen committed on
Commit d5d5443 · verified · 1 Parent(s): e7e94b2

Create README.md

Files changed (1)
  1. README.md +64 -0
README.md ADDED
@@ -0,0 +1,64 @@
---
license: mit
library_name: pytorch
tags:
- sign-language-recognition
- transformer
- mediapipe
- skeleton-based-action-recognition
datasets:
- luciayen/CASL-W60-Landmarks
metrics:
- accuracy
pipeline_tag: video-classification
---

# 🤟 CASL-TransSLR: Robust Sign Language Transformer

**SignVLM-v4 Champion Model**

This repository contains the Transformer architecture for the **CASL (Chinese-American Sign Language) Research Project**. This version (v4) is optimized for **signer independence**: it is designed to recognize signs performed by people who do not appear in the training data.

## 📊 Performance Metrics (Unseen Signers)
Evaluated on **862 files** from independent signers:

| Metric | Value |
| :--- | :--- |
| **Overall Accuracy** | **80.39%** |
| **Weighted F1-Score** | **78.33%** |
| **Classes** | 60 Signs |

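
For readers unfamiliar with the two metrics in the table, here is a minimal sketch of how overall accuracy and a support-weighted F1 score are commonly computed. It is not the project's evaluation script, and the sign labels in the toy data are hypothetical. Weighted F1 averages the per-class F1 scores using each class's number of true samples as the weight, which is why it can come out lower than accuracy when some classes are rarely or never predicted correctly.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with each class's true-sample count as weight."""
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    total = 0.0
    for label, n in support.items():
        precision = true_pos[label] / predicted[label] if predicted[label] else 0.0
        recall = true_pos[label] / n
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        total += f1 * n
    return total / len(y_true)


# Toy data with hypothetical sign labels (not taken from the CASL dataset).
y_true = ["hello", "hello", "thanks", "thanks", "thanks", "please"]
y_pred = ["hello", "thanks", "thanks", "thanks", "thanks", "hello"]

print(f"accuracy    = {accuracy(y_true, y_pred):.4f}")
print(f"weighted F1 = {weighted_f1(y_true, y_pred):.4f}")
```

In the toy data the class `please` is never predicted, so its F1 is zero and pulls the weighted score below the accuracy.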
## 🏗️ Architecture Insight
The model combines a **Feature Extractor** with a **Transformer Encoder**:
* **Feature Extractor:** A Linear layer (225 → 512) followed by **Temporal BatchNorm** (64 frames) to normalize motion across time.
* **Transformer:** 4 encoder layers with Multi-Head Attention ($d_{model}=512$, $n_{head}=8$, $ff_{dim}=1024$).
* **Classifier:** A 2-layer MLP with Dropout (0.5) to improve generalization.

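
The three components listed above could be wired together roughly as follows. This is a hypothetical sketch, not the repository's `model.py`: the class name `SketchSLR`, the mean pooling over time, the hidden width of the classifier, and the reading of "Temporal BatchNorm (64 frames)" as `nn.BatchNorm1d(64)` over the frame axis are all assumptions. Positional encoding is omitted because the description does not mention it.

```python
import torch
from torch import nn


class SketchSLR(nn.Module):
    """Hypothetical re-creation of the described layout (not the released model)."""

    def __init__(self, in_dim=225, d_model=512, n_head=8, ff_dim=1024,
                 n_layers=4, n_frames=64, n_classes=60, dropout=0.5):
        super().__init__()
        # Feature extractor: per-frame linear projection 225 -> 512.
        self.embed = nn.Linear(in_dim, d_model)
        # Assumption: BatchNorm1d treats dim 1 as channels, so for input of
        # shape (batch, 64, 512) it keeps one set of statistics per frame.
        self.temporal_bn = nn.BatchNorm1d(n_frames)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_head, dim_feedforward=ff_dim,
            batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Classifier: 2-layer MLP with dropout; the hidden width is a guess.
        self.classifier = nn.Sequential(
            nn.Linear(d_model, d_model // 2),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(d_model // 2, n_classes),
        )

    def forward(self, x):                      # x: (batch, 64, 225)
        x = self.temporal_bn(self.embed(x))    # (batch, 64, 512)
        x = self.encoder(x)                    # (batch, 64, 512)
        x = x.mean(dim=1)                      # assumption: average over frames
        return self.classifier(x)              # (batch, 60) logits


sketch = SketchSLR().eval()
with torch.no_grad():
    logits = sketch(torch.randn(2, 64, 225))
print(logits.shape)
```

Because the layer names and sizes here are guesses, the released `pytorch_model.bin` is not expected to load into this class; use the repository's own `model.py` for that.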
## ⚙️ Pre-processing Requirements
**IMPORTANT:** This model expects normalized landmarks. Passing raw MediaPipe coordinates will significantly reduce accuracy.
1. **Centering:** Translate all points so that they are relative to the **Mid-Hip** (Point 0).
2. **Scaling:** Divide by the **Shoulder-to-Shoulder** distance to account for different body sizes.
3. **Shape:** Input must be a tensor of shape `(Batch, 64, 225)`.

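
The centering and scaling steps above can be sketched with NumPy as shown below. Several details are assumptions rather than facts from this repository: that the 225 features are 75 points with 3 coordinates each, that the shoulders sit at indices 11 and 12 (their positions in MediaPipe Pose, which may differ from this dataset's layout), and that scaling is applied per frame rather than per clip. The function name `normalize_landmarks` is hypothetical.

```python
import numpy as np


def normalize_landmarks(frames, mid_hip=0, left_shoulder=11,
                        right_shoulder=12, eps=1e-6):
    """Center on the mid-hip point and scale by shoulder width.

    frames: array of shape (T, 75, 3); returns an array of shape (T, 225).
    The landmark indices are assumptions, adjust them to your data layout.
    """
    frames = np.asarray(frames, dtype=np.float32)
    # 1. Centering: subtract the mid-hip point from every point in the frame.
    centered = frames - frames[:, mid_hip:mid_hip + 1, :]
    # 2. Scaling: divide by the per-frame shoulder-to-shoulder distance.
    shoulder_width = np.linalg.norm(
        centered[:, left_shoulder] - centered[:, right_shoulder], axis=-1
    )
    scaled = centered / np.maximum(shoulder_width, eps)[:, None, None]
    # 3. Shape: flatten (75, 3) into 225 features per frame.
    return scaled.reshape(len(scaled), -1)


# Random stand-in for one 64-frame clip of raw landmarks.
raw_clip = np.random.default_rng(0).uniform(size=(64, 75, 3))
features = normalize_landmarks(raw_clip)   # (64, 225)
batch = features[None]                     # (1, 64, 225), ready for torch.from_numpy
print(batch.shape)
```

After this normalization, shifting or uniformly rescaling the raw coordinates leaves the features unchanged, which is the property the two steps are meant to provide.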
## 🚀 How to Load and Use

```python
import importlib.util

import torch
from huggingface_hub import hf_hub_download

# 1. Download the weights and the architecture definition
repo_id = "luciayen/CASL-TransSLR"
model_bin = hf_hub_download(repo_id=repo_id, filename="pytorch_model.bin")
model_script = hf_hub_download(repo_id=repo_id, filename="model.py")

# 2. Import the architecture from the downloaded model.py
spec = importlib.util.spec_from_file_location("model_arch", model_script)
model_arch = importlib.util.module_from_spec(spec)
spec.loader.exec_module(model_arch)

# 3. Initialize the model, load the weights, and switch to inference mode
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model_arch.SignVLM().to(device)
model.load_state_dict(torch.load(model_bin, map_location=device))
model.eval()
```
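
Once a model is loaded, a forward pass might look like the sketch below. It assumes the model maps a `(Batch, 64, 225)` tensor to unnormalized logits of shape `(Batch, 60)`, which is not confirmed by this README. The helper name `predict` is hypothetical, and `stand_in` is a placeholder module used only so the example runs without downloading the weights.

```python
import torch
from torch import nn


@torch.no_grad()
def predict(model, batch, top_k=3):
    """Return the top-k class probabilities and indices for each clip."""
    if batch.ndim != 3 or tuple(batch.shape[1:]) != (64, 225):
        raise ValueError(f"expected (Batch, 64, 225), got {tuple(batch.shape)}")
    model.eval()
    device = next(model.parameters()).device
    logits = model(batch.to(device))        # assumed shape: (Batch, 60)
    probs = logits.softmax(dim=-1)
    return probs.topk(top_k, dim=-1)        # (values, indices)


# Placeholder with the assumed input/output shapes; replace with the real model.
stand_in = nn.Sequential(nn.Flatten(), nn.Linear(64 * 225, 60))
values, indices = predict(stand_in, torch.zeros(2, 64, 225))
print(indices.shape)
```

The returned indices are class IDs; mapping them to sign names requires the label list used during training, which this README does not include.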