Navya-Sree committed on
Commit 30883a3 · verified · 1 Parent(s): a0e4a5b

Update README.md

  1. README.md +31 -45
README.md CHANGED
@@ -9,57 +9,43 @@ app_file: app.py
  pinned: false
  ---
  ---
- language:
- - multilingual
- - endangered-languages
- tags:
- - translation
- - unesco
- - m2m100
+ ---
+ title: UNESCO Language Translator 🌍
+ emoji: 🗣️
+ colorFrom: blue
+ colorTo: green
+ sdk: gradio
+ sdk_version: 3.39.0
+ app_file: app.py
+ pinned: true
  license: mit
- datasets:
- - UNESCO language vitality data
- metrics:
- - BLEU
- - chrF++
  ---

- # UNESCO Language Translator 🌍
-
- **A specialized translation model for UNESCO's endangered languages** powered by Meta's M2M100 and Hugging Face.
-
- ## Key Features
- - 🔍 **Endangered Language Focus**: 35+ UNESCO-protected languages
- - ⚡️ **Context-Aware Translation**: Preserves cultural context
- - 📊 **Language Vitality Tags**: Shows preservation status
- - 🤝 **Community Feedback**: Crowdsourced quality improvement
+ # UNESCO Language Translator

- ## Supported Languages
- | Language | ISO Code | Vitality Level |
- |----------|----------|----------------|
- | Aymara | ay | Vulnerable |
- | Cherokee | chr | Definitely Endangered |
- | Quechua | qu | Vulnerable |
- | ... | ... | ... |
+ ## Preserving Linguistic Heritage with AI

- [See full list](https://unesco.org/languages)
+ This space provides translations for endangered languages identified by UNESCO, using Meta's M2M100 model with cultural preservation enhancements.

- ## Usage
- ```python
- from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
+ ### Key Features:
+ - Cultural context preservation for endangered languages
+ - Special handling of linguistic nuances
+ - Ethical translation guidelines
+ - Support for 7+ endangered languages

- model = M2M100ForConditionalGeneration.from_pretrained("unesco/translator")
- tokenizer = M2M100Tokenizer.from_pretrained("unesco/translator")
+ ### Supported Endangered Languages:
+ - Quechua (Vulnerable)
+ - Aymara (Vulnerable)
+ - Cherokee (Endangered)
+ - Navajo (Vulnerable)
+ - Inuktitut (Vulnerable)
+ - Sami (Endangered)
+ - Welsh (Vulnerable)

- def translate(text, target_lang):
-     tokenizer.src_lang = "auto"
-     encoded = tokenizer(text, return_tensors="pt")
-     generated_tokens = model.generate(
-         **encoded,
-         forced_bos_token_id=tokenizer.get_lang_id(target_lang),
-         cultural_preservation=True # Unique feature!
-     )
-     return tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
+ ## Ethical Use
+ Please follow UNESCO's guidelines when using translations:
+ 1. Always credit original knowledge sources
+ 2. Verify translations with native speakers
+ 3. Use for cultural preservation, not exploitation

- translate("Traditional knowledge matters", "qu")
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ [UNESCO Language Preservation Guidelines](https://en.unesco.org/themes/endangered-languages)
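
For comparison with the usage snippet removed in this commit, here is a minimal sketch of an M2M100 call through `transformers`. It assumes the public `facebook/m2m100_418M` checkpoint rather than `unesco/translator`, whose existence is not confirmed here. As far as the M2M100 documentation shows, the tokenizer expects an explicit source language code, so the sketch takes `src_lang` as an argument instead of `"auto"`, and it leaves out `cultural_preservation`, which does not appear to be a documented `generate` argument. The function signature and the up-front check are illustrative assumptions, not the Space's actual code.

```python
def translate(text, src_lang, tgt_lang, model_name="facebook/m2m100_418M"):
    """Translate `text` between two explicit M2M100 language codes.

    `model_name` defaults to a public checkpoint; this is an assumption,
    not the model the Space necessarily uses.
    """
    # Hypothetical guard: reject the "auto" placeholder early, since the
    # tokenizer needs a concrete source language code.
    if src_lang == "auto":
        raise ValueError("src_lang must be an explicit language code, not 'auto'")

    # Imported lazily so the argument check above runs without loading the library.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_name)
    model = M2M100ForConditionalGeneration.from_pretrained(model_name)

    # Tell the tokenizer which language the input is written in.
    tokenizer.src_lang = src_lang
    encoded = tokenizer(text, return_tensors="pt")

    # Force the decoder to start with the target language token.
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id(tgt_lang),
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]


# Example call (downloads the checkpoint on first use):
# translate("Traditional knowledge matters", "en", "cy")
```

The example call targets Welsh (`cy`), one of the languages listed in the new README, because it is among the language codes M2M100 ships with; whether the other listed languages are covered by a given checkpoint would need to be checked against that checkpoint's language list.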