---
datasets:
- CSL-Daily
- CSL-News
language:
- zh
library_name: transformers
license: mit
model_name: Geo-Sign (Hyperbolic-Token)
tags:
- sign-language-translation
- skeleton-based
- hyperbolic-geometry
- mT5
paperswithcode_id: geo-sign-hyperbolic-contrastive-regularisation
task:
- sign-language-translation
pipeline_tag: video-text-to-text
---

# Geo-Sign 🌐✋ → 📝

**Hyperbolic Contrastive Regularisation for Geometrically-Aware Sign-Language Translation**

**Paper**: *Geo-Sign: Hyperbolic Contrastive Regularisation for Geometrically Aware Sign-Language Translation*
Edward Fish, Richard Bowden, CVSSP – University of Surrey (arXiv:2506.00129, May 2025)

**Code**: **Paper** **NeurIPS 2025**

## Code Use

Download the weights and data labels from the files section of this repo and add them to the GitHub repository. You will also need the base-mt5 model; put it in the `pretrained_weight` folder.

* `Data -> ./Data`
* `best.pth -> ./checkpoints/best.pth`
* `pretraining.pth -> ./checkpoints/pretraining.pth`
* ` -> ./pretrained_weight`

## TL;DR

Geo-Sign projects pose-based sign-language features into a learnable **Poincaré ball** and aligns them with text embeddings via a geometric contrastive loss. Compared with the strong Uni-Sign pose baseline, Geo-Sign improves BLEU-4 by **+1.81** and ROUGE-L by **+3.03** on the CSL-Daily benchmark while using only privacy-friendly skeletal inputs.

## Intended Uses & Scope

* **Primary** – Sign-language-to-text translation research, especially for resource-constrained or privacy-sensitive settings where RGB video is unavailable.
* **Out-of-scope** – Real-time production deployments without reliable pose estimation, medical or legal interpretations, or languages not covered by the datasets the model was trained on.
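
The projection and alignment described in the TL;DR can be illustrated with a minimal plain-Python sketch. This is not the repository's implementation: the function names (`expmap0`, `poincare_distance`, `contrastive_loss`), the fixed curvature `c` (the model card describes it as learnable), and the InfoNCE-style form with a `temperature` parameter are all assumptions made for illustration. Only the exponential map at the origin and the Poincaré distance are the standard textbook formulas.

```python
import math

def expmap0(v, c=1.0):
    """Map a Euclidean (tangent) vector at the origin into the Poincare ball
    of curvature -c: exp_0(v) = tanh(sqrt(c)*|v|) * v / (sqrt(c)*|v|)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    sqrt_c = math.sqrt(c)
    scale = math.tanh(sqrt_c * norm) / (sqrt_c * norm)
    return [scale * x for x in v]

def poincare_distance(x, y, c=1.0):
    """Geodesic distance between two points inside the Poincare ball."""
    sq_x = sum(a * a for a in x)
    sq_y = sum(b * b for b in y)
    diff_sq = sum((a - b) ** 2 for a, b in zip(x, y))
    # Points must lie strictly inside the ball, otherwise this denominator
    # reaches zero; practical code usually clips norms first.
    denom = (1.0 - c * sq_x) * (1.0 - c * sq_y)
    arg = 1.0 + 2.0 * c * diff_sq / denom
    return math.acosh(max(arg, 1.0)) / math.sqrt(c)

def contrastive_loss(pose_pts, text_pts, c=1.0, temperature=0.1):
    """Assumed InfoNCE-style loss: pose i should be nearest to text i,
    using negative hyperbolic distance as the similarity score."""
    total = 0.0
    for i, p in enumerate(pose_pts):
        logits = [-poincare_distance(p, t, c) / temperature for t in text_pts]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]
    return total / len(pose_pts)

# Hypothetical 2-D features standing in for pooled pose and text embeddings.
pose = [expmap0([1.0, 0.0]), expmap0([0.0, 1.0])]
text = [expmap0([0.9, 0.1]), expmap0([0.1, 0.9])]
aligned = contrastive_loss(pose, text)
shuffled = contrastive_loss(pose, text[::-1])
```

In this toy setup `aligned` comes out smaller than `shuffled`, because each pose point sits closer (in hyperbolic distance) to its own text point than to the other one. A real training loop would use batched tensors and a differentiable framework rather than Python lists.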

## Evaluation

| Dataset          | Modality  | BLEU-4 ↑  | ROUGE-L ↑ |
|------------------|-----------|-----------|-----------|
| CSL-Daily (test) | Pose-only | **27.42** | **57.95** |

Geo-Sign outperforms all previous gloss-free pose-only methods and rivals many RGB- or gloss-based systems.

---

## Limitations & Ethical Considerations

* **Pose-estimation dependency** – Errors in upstream key-points propagate to the translation.
* **Training latency** – Hyperbolic operations slow training (~4–6×) but add **no** cost at inference.
* **Generalisation** – Evaluated only on Chinese Sign Language; other sign languages are not guaranteed.
* **Mis-translation risk** – Automatic SLT can mis-communicate; keep a human in the loop for critical use cases.
* **Biases** – CSL-Daily is domain-specific (news/TV); outputs may reflect that linguistic style.

---

## Citation

```bibtex
@article{fish2025geo,
  title={Geo-Sign: Hyperbolic Contrastive Regularisation for Geometrically Aware Sign Language Translation},
  author={Fish, Edward and Bowden, Richard},
  journal={arXiv preprint arXiv:2506.00129},
  year={2025}
}
```