fiskenai committed 60f1f64 (verified, parent: 4aef6ba)

Update README.md

Files changed (1): README.md (+0, −9)
```diff
@@ -32,15 +32,6 @@ Download the weights and data labels from the files section of this repo and add
 Geo-Sign projects pose-based sign-language features into a learnable **Poincaré ball** and aligns them with text embeddings via a geometric contrastive loss.
 Compared with the strong Uni-Sign pose baseline, Geo-Sign boosts BLEU-4 by **+1.81** and ROUGE-L by **+3.03** on the CSL-Daily benchmark while keeping privacy-friendly skeletal inputs only.
 
-## Model Details
-| | |
-| **Backbone** | Four part-specific **ST-GCNs** (body / L-hand / R-hand / face) feeding an mT5-Base decoder |
-| **Hyperbolic branch** | • Learnable curvature \(c\) (init 1.5) • 256-D Poincaré embeddings • Weighted Fréchet-mean pooling (global) or Token-level hyperbolic attention (this checkpoint = **Token**) |
-| **Train data** | Pre-trained pose encoder on **CSL-News** (1 985 h) then fine-tuned 40 epochs on **CSL-Daily** (20 k videos) |
-| **Objective** | Cross-entropy translation + hyperbolic InfoNCE (α = 0.7) with Riemannian Adam optimisation |
-| **Params** | 589 M (adds < 0.25 % over Uni-Sign) |
-| **Frameworks** | PyTorch 2 · Hugging Face Transformers · Geoopt |
-
 ## Intended Uses & Scope
 * **Primary** – Sign-language-to-text translation research, especially for resource-constrained or privacy-sensitive settings where RGB video is unavailable.
 * **Out-of-scope** – Real-time production deployments without reliable pose estimation, medical or legal interpretations, or languages beyond datasets the model was trained on.
```
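For context on the removed Model Details table: the hyperbolic branch it described (a Poincaré ball with learnable curvature, init 1.5) rests on a few standard operations. Below is a minimal NumPy sketch of those operations — the exponential map at the origin, Möbius addition, and the geodesic distance — using the textbook formulas from hyperbolic representation learning. This is an illustrative sketch only, not Geo-Sign's actual code (which, per the table, uses PyTorch and Geoopt); the feature vectors are made up.

```python
import numpy as np

def expmap0(v, c):
    """Exponential map at the origin: lifts a tangent vector into the
    Poincaré ball of curvature c (ball radius 1/sqrt(c))."""
    sqrt_c = np.sqrt(c)
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    return np.tanh(sqrt_c * norm) * v / (sqrt_c * norm)

def mobius_add(x, y, c):
    """Möbius addition, the ball's analogue of vector addition."""
    xy = np.dot(x, y)
    x2 = np.dot(x, x)
    y2 = np.dot(y, y)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + c**2 * x2 * y2
    return num / den

def poincare_dist(x, y, c):
    """Geodesic distance between two points in the ball; this is the
    distance a hyperbolic contrastive loss would compare."""
    sqrt_c = np.sqrt(c)
    diff = mobius_add(-x, y, c)
    return (2 / sqrt_c) * np.arctanh(sqrt_c * np.linalg.norm(diff))

c = 1.5  # initial curvature reported in the deleted table
pose_feat = np.array([0.3, -0.2, 0.5])   # hypothetical pose feature
text_feat = np.array([0.1, 0.4, -0.1])   # hypothetical text feature
z_pose = expmap0(pose_feat, c)
z_text = expmap0(text_feat, c)
print(poincare_dist(z_pose, z_text, c))
```

In the model described above, curvature `c` would be a learnable parameter updated by Riemannian Adam rather than a fixed constant.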