	---
	license: mit
	library_name: pytorch
	tags:
	- sign-language-recognition
	- transformer
	- mediapipe
	- skeleton-based-action-recognition
	datasets:
	- luciayen/CASL-W60-Landmarks
	metrics:
	- accuracy
	pipeline_tag: video-classification
	---

	# 🤟 CASL-TransSLR: Robust Sign Language Transformer

	SignVLM-v4 Champion Model

	This repository contains the Transformer model for the CASL (Chinese-American Sign Language) Research Project. This version (v4) is optimized for signer independence: it is designed to recognize signs performed by people who do not appear in the training data.

	## 📊 Performance Metrics (Unseen Signers)
	Evaluated on 862 files from independent signers:

	| Metric | Value |
	| :--- | :--- |
	| Overall Accuracy | 80.39% |
	| Weighted F1-Score | 78.33% |
	| Classes | 60 Signs |

	## 🏗️ Architecture Insight
	The model uses a hybrid Feature Extractor + Transformer Encoder approach:
	* Feature Extractor: A Linear layer (225 → 512) followed by Temporal BatchNorm (64 frames) to normalize motion across time.
	* Transformer: 4 encoder layers with multi-head self-attention ($d_{model}=512$, $n_{head}=8$, $ff_{dim}=1024$).
	* Classifier: A 2-layer MLP with Dropout (0.5) for robust generalization.
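
The three components above can be sketched in PyTorch as follows. This is only an illustration of the described layout, not the `SignVLM` class shipped in `model.py`: the class name `SketchSignTransformer`, the mean pooling over frames, the hidden width of 256 in the classifier, and the ReLU activation are all assumptions that the card does not specify.

```python
import torch
import torch.nn as nn


class SketchSignTransformer(nn.Module):
    """Hypothetical re-creation of the architecture described in this card."""

    def __init__(self, in_dim=225, d_model=512, n_frames=64, n_classes=60):
        super().__init__()
        # Feature extractor: Linear (225 -> 512) followed by a BatchNorm
        # whose "channel" axis is the 64-frame time axis.
        self.proj = nn.Linear(in_dim, d_model)
        self.temporal_bn = nn.BatchNorm1d(n_frames)
        # Transformer: 4 encoder layers, 8 heads, feed-forward width 1024.
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=8, dim_feedforward=1024, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        # Classifier: 2-layer MLP with Dropout(0.5).
        # The hidden width (256) and ReLU are assumptions.
        self.classifier = nn.Sequential(
            nn.Linear(d_model, 256),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(256, n_classes),
        )

    def forward(self, x):
        # x: (Batch, 64, 225)
        x = self.temporal_bn(self.proj(x))  # (Batch, 64, 512)
        x = self.encoder(x)                 # (Batch, 64, 512)
        x = x.mean(dim=1)                   # assumed pooling: average over frames
        return self.classifier(x)           # (Batch, 60) logits
```

Passing a tensor of shape `(Batch, 64, 225)` through this sketch yields one row of 60 logits per sample. For real predictions, load the released weights with the `SignVLM` class from `model.py`, since the sketch's parameter names are unlikely to match the checkpoint.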

	## ⚙️ Pre-processing Requirements
	IMPORTANT: This model expects landmarks to be normalized. If you pass raw MediaPipe coordinates, the accuracy will drop significantly.
	1. Centering: Translate all points relative to the Mid-Hip (Point 0).
	2. Scaling: Normalize by the Shoulder-to-Shoulder distance to account for different body types.
	3. Shape: Input must be a tensor of shape `(Batch, 64, 225)`.
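
A minimal NumPy sketch of these three steps is shown below. The card does not state how the 225 features are ordered or how the mid-hip is derived, so the sketch makes assumptions: each frame holds 75 landmarks with 3 coordinates (75 × 3 = 225), the first 33 are MediaPipe Pose landmarks, the mid-hip is the average of the left and right hip (Pose indices 23 and 24), and the shoulders are Pose indices 11 and 12. The function name `normalize_landmarks` is hypothetical; check it against the project's own pre-processing code before relying on it.

```python
import numpy as np

# MediaPipe Pose landmark indices.
LEFT_SHOULDER, RIGHT_SHOULDER = 11, 12
LEFT_HIP, RIGHT_HIP = 23, 24


def normalize_landmarks(frames, eps=1e-6):
    """Center on the mid-hip and scale by shoulder width.

    frames: array of shape (T, 75, 3); assumed layout, pose landmarks first.
    Returns an array of shape (T, 225).
    """
    frames = np.asarray(frames, dtype=np.float32)
    # 1. Centering: subtract the per-frame mid-hip from every landmark.
    mid_hip = (frames[:, LEFT_HIP] + frames[:, RIGHT_HIP]) / 2.0
    centered = frames - mid_hip[:, None, :]
    # 2. Scaling: divide by the per-frame shoulder-to-shoulder distance.
    shoulder_dist = np.linalg.norm(
        frames[:, LEFT_SHOULDER] - frames[:, RIGHT_SHOULDER], axis=-1
    )
    scaled = centered / np.maximum(shoulder_dist, eps)[:, None, None]
    # 3. Shape: flatten each frame to 225 features.
    return scaled.reshape(frames.shape[0], -1)
```

For a 64-frame clip, `normalize_landmarks(clip)[None, ...]` gives an array of shape `(1, 64, 225)` that can be converted with `torch.from_numpy` before being passed to the model.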

	## 🚀 How to Load and Use

	```python
	import torch
	from huggingface_hub import hf_hub_download
	import importlib.util

	# 1. Download files
	repo_id = "luciayen/CASL-TransSLR"
	model_bin = hf_hub_download(repo_id=repo_id, filename="pytorch_model.bin")
	model_script = hf_hub_download(repo_id=repo_id, filename="model.py")

	# 2. Import architecture
	spec = importlib.util.spec_from_file_location("model_arch", model_script)
	model_arch = importlib.util.module_from_spec(spec)
	spec.loader.exec_module(model_arch)

	# 3. Initialize & Load
	device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
	model = model_arch.SignVLM().to(device)
	model.load_state_dict(torch.load(model_bin, map_location=device))
	model.eval()