nielsr HF Staff committed on
Commit 4659bf4 · verified · 1 Parent(s): 3059c07

Add pipeline tag: video-text-to-text


This PR adds `pipeline_tag: video-text-to-text` to the model card metadata. The tag categorizes the model's function, translating sign language (from visual input, specifically pose data) into text, and improves its discoverability on the Hugging Face Hub.

The existing `library_name: transformers` and `license: mit` fields are retained as they are validated by the GitHub repository's content. No sample usage section is added, adhering to the guidelines that require explicit code snippets from the original repository's README for inference.
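As a quick check of the metadata described above, the following is a minimal sketch that reads the front matter of the updated README and looks up the new tag. The `parse_front_matter` helper is hypothetical and written only for this illustration; it assumes the flat `key: value` and `- item` layout used in this model card and is not a general YAML parser.

```python
# Minimal sketch: read the YAML front matter of a model card and check the tag.
# Handles only the flat "key: value" / "- item" subset used in this README;
# a real YAML parser would be the safer choice for anything more complex.

README = """\
---
datasets:
- CSL-Daily
- CSL-News
language:
- zh
library_name: transformers
license: mit
model_name: Geo-Sign (Hyperbolic-Token)
tags:
- sign-language-translation
- skeleton-based
- hyperbolic-geometry
- mT5
paperswithcode_id: geo-sign-hyperbolic-contrastive-regularisation
task:
- sign-language-translation
pipeline_tag: video-text-to-text
---

# Geo-Sign
"""


def parse_front_matter(text):
    """Return the front-matter block as a dict of strings and lists of strings."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the block
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and isinstance(meta.get(current_key), list):
            # List item belonging to the most recent key without an inline value.
            meta[current_key].append(stripped[2:].strip())
        else:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            # A key with no inline value introduces a list on the following lines.
            meta[current_key] = value if value else []
    return meta


metadata = parse_front_matter(README)
print(metadata.get("pipeline_tag"))  # prints "video-text-to-text"
```

The same lookup on the pre-change README would return `None`, since the old metadata had no `pipeline_tag` key.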

Files changed (1)
  1. README.md +20 -20
README.md CHANGED
@@ -1,20 +1,21 @@
  ---
  library_name: transformers
  license: mit
  model_name: Geo-Sign (Hyperbolic-Token)
- paperswithcode_id: geo-sign-hyperbolic-contrastive-regularisation
  tags:
- - sign-language-translation
- - skeleton-based
- - hyperbolic-geometry
- - mT5
- datasets:
- - CSL-Daily
- - CSL-News
- language:
- - zh
  task:
- - sign-language-translation
  ---
  
  # Geo-Sign 🌐✋ → 📝
@@ -43,8 +44,8 @@ Geo-Sign projects pose-based sign-language features into a learnable **Poincaré
  Compared with the strong Uni-Sign pose baseline, Geo-Sign boosts BLEU-4 by **+1.81** and ROUGE-L by **+3.03** on the CSL-Daily benchmark while keeping privacy-friendly skeletal inputs only.
  
  ## Intended Uses & Scope
- * **Primary** – Sign-language-to-text translation research, especially for resource-constrained or privacy-sensitive settings where RGB video is unavailable.
- * **Out-of-scope** – Real-time production deployments without reliable pose estimation, medical or legal interpretations, or languages beyond datasets the model was trained on.
  
  ## Evaluation
  
@@ -58,11 +59,11 @@ Geo-Sign outperforms all previous gloss-free pose-only methods and rivals many R
  
  ## Limitations & Ethical Considerations
  
- * **Pose-estimation dependency** – Errors in upstream key-points propagate to the translation.
- * **Training latency** – Hyperbolic operations slow training (~4–6 ×) but add **no** cost at inference.
- * **Generalisation** – Evaluated only on Chinese Sign Language; other sign languages are not guaranteed.
- * **Mis-translation risk** – Automatic SLT can mis-communicate; keep a human in the loop for critical use cases.
- * **Biases** – CSL-Daily is domain-specific (news/TV); outputs may reflect that linguistic style.
  
  ---
  
@@ -74,5 +75,4 @@ Geo-Sign outperforms all previous gloss-free pose-only methods and rivals many R
  author={Fish, Edward and Bowden, Richard},
  journal={arXiv preprint arXiv:2506.00129},
  year={2025}
- }```
-
 
  ---
+ datasets:
+ - CSL-Daily
+ - CSL-News
+ language:
+ - zh
  library_name: transformers
  license: mit
  model_name: Geo-Sign (Hyperbolic-Token)
  tags:
+ - sign-language-translation
+ - skeleton-based
+ - hyperbolic-geometry
+ - mT5
+ paperswithcode_id: geo-sign-hyperbolic-contrastive-regularisation
  task:
+ - sign-language-translation
+ pipeline_tag: video-text-to-text
  ---
  
  # Geo-Sign 🌐✋ → 📝
 
  Compared with the strong Uni-Sign pose baseline, Geo-Sign boosts BLEU-4 by **+1.81** and ROUGE-L by **+3.03** on the CSL-Daily benchmark while keeping privacy-friendly skeletal inputs only.
  
  ## Intended Uses & Scope
+ * **Primary** – Sign-language-to-text translation research, especially for resource-constrained or privacy-sensitive settings where RGB video is unavailable.
+ * **Out-of-scope** – Real-time production deployments without reliable pose estimation, medical or legal interpretations, or languages beyond datasets the model was trained on.
  
  ## Evaluation
  
 
  
  ## Limitations & Ethical Considerations
  
+ * **Pose-estimation dependency** – Errors in upstream key-points propagate to the translation.
+ * **Training latency** – Hyperbolic operations slow training (~4–6 ×) but add **no** cost at inference.
+ * **Generalisation** – Evaluated only on Chinese Sign Language; other sign languages are not guaranteed.
+ * **Mis-translation risk** – Automatic SLT can mis-communicate; keep a human in the loop for critical use cases.
+ * **Biases** – CSL-Daily is domain-specific (news/TV); outputs may reflect that linguistic style.
  
  ---
  
 
  author={Fish, Edward and Bowden, Richard},
  journal={arXiv preprint arXiv:2506.00129},
  year={2025}
+ }```
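
Because the front-matter hunk mostly reorders existing keys, it can be hard to see at a glance what actually changed. The sketch below transcribes the old and new metadata from the diff as plain Python dicts and compares them by key. The `diff_metadata` helper is a hypothetical name introduced for this illustration, not part of any Hub tooling.

```python
# Sketch: compare the README metadata before and after this commit.
# Both dicts are transcribed by hand from the diff above; key order differs
# between them, but dict comparison by key ignores ordering.

old_meta = {
    "library_name": "transformers",
    "license": "mit",
    "model_name": "Geo-Sign (Hyperbolic-Token)",
    "paperswithcode_id": "geo-sign-hyperbolic-contrastive-regularisation",
    "tags": ["sign-language-translation", "skeleton-based", "hyperbolic-geometry", "mT5"],
    "datasets": ["CSL-Daily", "CSL-News"],
    "language": ["zh"],
    "task": ["sign-language-translation"],
}

new_meta = {
    "datasets": ["CSL-Daily", "CSL-News"],
    "language": ["zh"],
    "library_name": "transformers",
    "license": "mit",
    "model_name": "Geo-Sign (Hyperbolic-Token)",
    "tags": ["sign-language-translation", "skeleton-based", "hyperbolic-geometry", "mT5"],
    "paperswithcode_id": "geo-sign-hyperbolic-contrastive-regularisation",
    "task": ["sign-language-translation"],
    "pipeline_tag": "video-text-to-text",
}


def diff_metadata(old, new):
    """Return (added, removed, changed) between two metadata dicts."""
    added = {k: new[k] for k in new if k not in old}
    removed = {k: old[k] for k in old if k not in new}
    changed = {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}
    return added, removed, changed


added, removed, changed = diff_metadata(old_meta, new_meta)
print("added:", added)      # added: {'pipeline_tag': 'video-text-to-text'}
print("removed:", removed)  # removed: {}
print("changed:", changed)  # changed: {}
```

Under this comparison, the only semantic change to the metadata is the new `pipeline_tag` entry; every other key keeps its value and only moves position.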