Commit 906dedd
1 Parent(s): 788178f
updates
app.py CHANGED
@@ -77,11 +77,14 @@ def translate(video_file, true_caption=None):
 title = "American Sign Language Translation: An Approach Combining MoViNets and T5"
 
 description = """
-This application
+This application hosts a model for translation of American Sign Language (ASL).
 The model comprises of a fine-tuned MoViNet CNN model to generate video embeddings and a T5 encoder-decoder model
 to generate translations from the video embeddings. This model architecture achieves a BLEU score of 1.98
 and an average cosine similarity score of 0.21 when trained and evaluated on the YouTube-ASL dataset.
-More information about the model training and instructions to download the models
+More information about the model training and instructions to download the models
+can be found in our <a href=https://github.com/deanna-emery/ASL-Translator>GitHub repository</a>.
+You can also find an overview of the project approach
+<a href=https://www.ischool.berkeley.edu/projects/2023/signsense-american-sign-language-translation>here</a>.
 
 A limitation of this architecture is the size of the MoViNets model, making it especially slow during inference on a CPU.
 We do not recommend uploading videos longer than 4 seconds as the video embedding generation may take some time.
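
The description reports an average cosine similarity score without saying how it is computed. As a minimal sketch, assuming the score is the mean cosine similarity between embedding vectors of predicted and reference captions, the arithmetic could look like the following. The function names and the use of plain Python lists are hypothetical, and the commit does not state how the caption embeddings are produced.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Convention chosen for this sketch: a zero vector has similarity 0.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def average_cosine_similarity(predicted, reference):
    """Mean pairwise cosine similarity over aligned embedding lists.

    `predicted` and `reference` are hypothetical lists of embedding
    vectors, one per caption, in the same order.
    """
    if not predicted or len(predicted) != len(reference):
        raise ValueError("need two non-empty lists of equal length")
    scores = [cosine_similarity(p, r) for p, r in zip(predicted, reference)]
    return sum(scores) / len(scores)
```

Averaging per-pair scores weights every caption equally regardless of its length, which is one common choice; whether the reported 0.21 was computed this way is an assumption.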