---
datasets:
- CSL-Daily
- CSL-News
language:
- zh
library_name: transformers
license: mit
model_name: Geo-Sign (Hyperbolic-Token)
tags:
- sign-language-translation
- skeleton-based
- hyperbolic-geometry
- mT5
paperswithcode_id: geo-sign-hyperbolic-contrastive-regularisation
task:
- sign-language-translation
pipeline_tag: video-text-to-text
---

# Geo-Sign

**Hyperbolic Contrastive Regularisation for Geometrically-Aware Sign-Language Translation**

**Paper**: *Geo-Sign: Hyperbolic Contrastive Regularisation for Geometrically Aware Sign-Language Translation*
Edward Fish, Richard Bowden, CVSSP, University of Surrey (arXiv:2506.00129, May 2025)

**Code**: <https://github.com/ed-fish/geo-sign>

**Paper (PDF)**: <https://arxiv.org/pdf/2506.00129v1>

**NeurIPS 2025**

## Code Use

Download the weights and data labels from the Files section of this repo and place them in a local clone of the GitHub repository <https://github.com/ed-fish/geo-sign>.

You will also need the mT5-base model from <https://huggingface.co/google/mt5-base>; put it in the `pretrained_weight` folder.

Each download maps to the following location in the cloned repository:

* `Data -> ./Data`
* `best.pth -> ./checkpoints/best.pth`
* `pretraining.pth -> ./checkpoints/pretraining.pth`
* <https://huggingface.co/google/mt5-base> `-> ./pretrained_weight`

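Before launching training or evaluation, it can help to confirm that everything landed where the mapping above says it should. The following is a minimal sketch of such a check; `EXPECTED_PATHS` and `missing_paths` are illustrative names and are not part of the Geo-Sign repository, and the script assumes it is run from the root of the cloned repository.

```python
from pathlib import Path

# Locations taken from the mapping listed in this section.
EXPECTED_PATHS = (
    "Data",
    "checkpoints/best.pth",
    "checkpoints/pretraining.pth",
    "pretrained_weight",
)


def missing_paths(repo_root="."):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected files and folders are in place.")
```

The check only tests for existence; it does not verify that the checkpoints or the mT5 files are complete or uncorrupted.
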
## TL;DR

Geo-Sign projects pose-based sign-language features into a learnable **Poincaré ball** and aligns them with text embeddings via a geometric contrastive loss.
Compared with the strong Uni-Sign pose baseline, Geo-Sign improves BLEU-4 by **+1.81** and ROUGE-L by **+3.03** on the CSL-Daily benchmark while using only privacy-friendly skeletal inputs.

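To make the idea of projecting features into a Poincaré ball and aligning them contrastively more concrete, here is a small pure-Python sketch. It is not the authors' implementation: the fixed curvature parameter `c`, the `temperature`, the boundary clip, and the InfoNCE-style form of the loss are all assumptions chosen for illustration, and the loss used in the paper may differ. The sketch uses the standard exponential map at the origin and the standard Poincaré distance.

```python
import math


def exp_map_origin(v, c=1.0):
    """Map a Euclidean (tangent) vector v into the Poincaré ball of curvature -c."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    sqrt_c = math.sqrt(c)
    # Clip so the point stays strictly inside the ball (assumed safeguard).
    radius = min(math.tanh(sqrt_c * norm), 1.0 - 1e-5)
    scale = radius / (sqrt_c * norm)
    return [scale * x for x in v]


def poincare_distance(x, y, c=1.0):
    """Geodesic distance between two points inside the Poincaré ball."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    sq_x = sum(a * a for a in x)
    sq_y = sum(b * b for b in y)
    arg = 1.0 + 2.0 * c * sq_diff / ((1.0 - c * sq_x) * (1.0 - c * sq_y))
    return math.acosh(max(arg, 1.0)) / math.sqrt(c)


def contrastive_loss(pose_points, text_points, temperature=0.1, c=1.0):
    """InfoNCE-style loss: pose i should be closer to text i than to any other text."""
    total = 0.0
    for i, p in enumerate(pose_points):
        # Negative distance acts as the similarity score.
        logits = [-poincare_distance(p, t, c) / temperature for t in text_points]
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_norm - logits[i]
    return total / len(pose_points)


# Toy 2-D "pose" and "text" features, projected into the ball.
pose = [exp_map_origin(v) for v in ([0.3, 0.4], [-0.5, 0.1])]
text = [exp_map_origin(v) for v in ([0.28, 0.42], [-0.45, 0.15])]
print(contrastive_loss(pose, text))        # matched pairs
print(contrastive_loss(pose, text[::-1]))  # swapped pairs give a larger loss
```

In a real model the features would be high-dimensional tensors and the loss would be computed with an autograd framework; the plain lists here only keep the geometry easy to follow.
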
## Intended Uses & Scope

* **Primary**: Sign-language-to-text translation research, especially for resource-constrained or privacy-sensitive settings where RGB video is unavailable.
* **Out-of-scope**: Real-time production deployments without reliable pose estimation, medical or legal interpretation, and sign languages not covered by the training datasets.

## Evaluation

| Dataset          | Modality  | BLEU-4 ↑  | ROUGE-L ↑ |
|------------------|-----------|-----------|-----------|
| CSL-Daily (test) | Pose-only | **27.42** | **57.95** |

Geo-Sign outperforms all previous gloss-free pose-only methods and rivals many RGB- or gloss-based systems.

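For readers unfamiliar with the ROUGE-L column, the metric is an F-score built on the longest common subsequence (LCS) between a hypothesis and a reference. The sketch below shows that computation only. It is not the evaluation script behind the reported numbers: treating a string as a sequence of characters and using `beta=1.0` are assumptions, and the table reports scores scaled to percentages.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(hypothesis, reference, beta=1.0):
    """LCS-based F-score in [0, 1]; strings are compared character by character."""
    if not hypothesis or not reference:
        return 0.0
    lcs = lcs_length(hypothesis, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(hypothesis)
    recall = lcs / len(reference)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


# Example: score a hypothesis against a reference at character level.
print(rouge_l("今天天气很好", "今天天气不错"))
```
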

---

## Limitations & Ethical Considerations

* **Pose-estimation dependency**: Errors in upstream key-points propagate to the translation.
* **Training latency**: Hyperbolic operations slow training (~4–6×) but add **no** cost at inference.
* **Generalisation**: Evaluated only on Chinese Sign Language; performance on other sign languages is not guaranteed.
* **Mis-translation risk**: Automatic SLT can mis-communicate; keep a human in the loop for critical use cases.
* **Biases**: CSL-Daily is domain-specific (news/TV); outputs may reflect that linguistic style.

---

## Citation

```bibtex
@article{fish2025geo,
  title={Geo-Sign: Hyperbolic Contrastive Regularisation for Geometrically Aware Sign Language Translation},
  author={Fish, Edward and Bowden, Richard},
  journal={arXiv preprint arXiv:2506.00129},
  year={2025}
}
```