KalanaPabasara committed on
Commit 4a1077b · 1 Parent(s): 9fe0b67

Make README ASCII-safe to avoid mojibake on web renderers
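
As an illustration of the problem this commit works around, here is a minimal sketch of how mojibake can arise and how a string can be checked for ASCII safety. The commit does not say which decoder the affected renderers used; decoding UTF-8 bytes as Windows-1253 (Python codec `cp1253`) is an assumption chosen for the example, and `to_mojibake` and `non_ascii_chars` are hypothetical helper names, not part of the repository.

```python
def to_mojibake(text, wrong_codec="cp1253"):
    """Encode text as UTF-8, then decode the bytes with the wrong single-byte codec."""
    return text.encode("utf-8").decode(wrong_codec)


def non_ascii_chars(text):
    """Return the distinct non-ASCII characters in text, in first-seen order."""
    seen = []
    for ch in text:
        if not ch.isascii() and ch not in seen:
            seen.append(ch)
    return seen


if __name__ == "__main__":
    arrow = "\u2192"  # RIGHTWARDS ARROW, three bytes in UTF-8
    # Each UTF-8 byte is misread as one character, so one arrow becomes three.
    print(len(to_mojibake(arrow)))  # 3
    # An ASCII-only replacement has nothing left to garble.
    print(non_ascii_chars("Singlish -> Sinhala"))  # []
```

A check like `non_ascii_chars` could be run over a README before committing; an empty result means every character survives any ASCII-compatible decoder unchanged.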

Files changed (1)
  1. README.md +19 -19
README.md CHANGED
@@ -10,7 +10,7 @@ pinned: false
  license: mit
  ---
 
- # සිංCode — Singlish to Sinhala Transliterator
 
  A model-driven, context-aware back-transliteration system that converts Romanised Sinhala (Singlish) to native Sinhala script.
 
@@ -18,25 +18,25 @@ A model-driven, context-aware back-transliteration system that converts Romanise
 
  ```
  Input sentence
- │
- ▼
  Word Tokenizer
- │
- ├─ Sinhala script? ──────────────────────────► Pass through unchanged
- │
- ├─ English vocab (len ≥ 3)? ─────────────────► Pass through unchanged
- │
- └─ Singlish word?
- │
- ▼
  ByT5-small seq2seq
  (top-5 candidates)
- │
- ▼
  XLM-RoBERTa MLM reranker
  (contextual scoring)
- │
- ▼
  Best candidate
  ```
 
@@ -44,18 +44,18 @@ Word Tokenizer
 
  | Model | Role | Hub ID |
  |-------|------|--------|
- | ByT5-small | Singlish → Sinhala candidate generation | `Kalana001/byt5-small-singlish-sinhala` |
  | XLM-RoBERTa | Contextual MLM reranking | `Kalana001/xlm-roberta-base-finetuned-sinhala` |
  | mBart50 | Full-sentence Sinhala output mode | `Kalana001/mbart50-large-singlish-sinhala` |
 
  ## Modes
 
- - **Code-Mixed Output** — Retains English words where contextually appropriate; Singlish words are transliterated using ByT5 + XLM-RoBERTa reranking.
- - **Full Sinhala Output** — Transliterates the entire sentence to Sinhala script using mBart50.
 
  ## Environment Variables (optional)
 
- Set these in HF Spaces → Settings → Repository secrets to enable Supabase feedback storage:
 
  | Variable | Description |
  |----------|-------------|
 
  license: mit
  ---
 
+ # SinCode - Singlish to Sinhala Transliterator
 
  A model-driven, context-aware back-transliteration system that converts Romanised Sinhala (Singlish) to native Sinhala script.
 
 
 
  ```
  Input sentence
+ |
+ v
  Word Tokenizer
+ |
+ +-- Sinhala script? -------------------------> Pass through unchanged
+ |
+ +-- English vocab (len >= 3)? --------------> Pass through unchanged
+ |
+ `-- Singlish word?
+ |
+ v
  ByT5-small seq2seq
  (top-5 candidates)
+ |
+ v
  XLM-RoBERTa MLM reranker
  (contextual scoring)
+ |
+ v
  Best candidate
  ```
 
 
 
  | Model | Role | Hub ID |
  |-------|------|--------|
+ | ByT5-small | Singlish -> Sinhala candidate generation | `Kalana001/byt5-small-singlish-sinhala` |
  | XLM-RoBERTa | Contextual MLM reranking | `Kalana001/xlm-roberta-base-finetuned-sinhala` |
  | mBart50 | Full-sentence Sinhala output mode | `Kalana001/mbart50-large-singlish-sinhala` |
 
  ## Modes
 
+ - **Code-Mixed Output** - Retains English words where contextually appropriate; Singlish words are transliterated using ByT5 + XLM-RoBERTa reranking.
+ - **Full Sinhala Output** - Transliterates the entire sentence to Sinhala script using mBart50.
 
  ## Environment Variables (optional)
 
+ Set these in HF Spaces -> Settings -> Repository secrets to enable Supabase feedback storage:
 
  | Variable | Description |
  |----------|-------------|