Update README.md
README.md
CHANGED
@@ -1,3 +1,18 @@
+---
+title: Bpe Tokenizer
+emoji: 🔥
+colorFrom: blue
+colorTo: yellow
+sdk: gradio
+sdk_version: 5.12.0
+app_file: app.py
+pinned: false
+license: apache-2.0
+short_description: Telugu BPE tokenizer with vocabulary of 4800 words.
+---
+
+Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
 # Telugu Text Tokenizer
 
 A Gradio web interface for encoding and decoding Telugu text using a trained BPE tokenizer.
@@ -34,17 +49,3 @@ The tokenizer is trained on a diverse corpus of Telugu text with:
 - Target compression ratio: ≥ 3.2x
 - Perfect reconstruction guarantee
 
----
-- title: Bpe Tokenizer
-- emoji: 🔥
-- colorFrom: blue
-- colorTo: yellow
-- sdk: gradio
-- sdk_version: 5.12.0
-- app_file: app.py
-- pinned: false
-- license: apache-2.0
-- short_description: Telugu BPE tokenizer with vocabulary of 4800 words.
----
-
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference