Kalana committed on
Commit a9209c7 · 1 Parent(s): 9906dbd

Update README: evaluation guide, HF Spaces env vars; exclude AI tooling from .gitignore

Files changed (2):
  1. .gitignore +4 -0
  2. README.md +54 -0
.gitignore CHANGED
@@ -7,6 +7,10 @@ misc/
 *.pkl
 !dictionary.pkl
 
+# AI tooling (Copilot/Claude internal)
+.claude/
+SKILL.md
+
 # Training artifacts
 train_mlm.py
 train_log.txt
README.md CHANGED
@@ -31,6 +31,60 @@ Welcome to the interim prototype of **SinCode**, a final-year research project d
 3. View the **Result**.
 4. (Optional) Expand the **"See How It Works"** section to view the real-time scoring logic used by the system.
 
+## 📏 Baseline Evaluation (New)
+
+Use the evaluation script to measure current model quality before making tuning changes.
+
+### 1) Prepare dataset
+
+Create a CSV file with columns:
+
+- `input` (Singlish / code-mixed input)
+- `reference` (expected Sinhala output)
+
+You can start from `eval_dataset_template.csv`.
+
+### 2) Run evaluation
+
+```bash
+python evaluation.py --dataset eval_dataset_template.csv
+```
+
+Optional:
+
+```bash
+python evaluation.py --dataset your_dataset.csv --beam-width 5 --predictions-out eval_predictions.csv --diagnostics-out eval_diagnostics.json
+```
+
+### 3) Outputs
+
+- `eval_predictions.csv`: per-sample predictions and metrics
+- `eval_diagnostics.json`: per-word candidate scoring breakdown for error analysis
+
+Reported aggregate metrics:
+
+- Exact match
+- Average Character Error Rate (CER)
+- Average token accuracy
+- Average English code-mix preservation
+
+## 🤗 Hugging Face Spaces Notes
+
+This project is compatible with Spaces. You can configure runtime paths with environment variables:
+
+- `SICODE_DICTIONARY_PATH` (default: `dictionary.pkl`)
+- `SICODE_MODEL_NAME` (default: `FacebookAI/xlm-roberta-base`)
+- `SICODE_ENGLISH_CACHE` (optional path for the `english_20k.txt` cache)
+
+Example:
+
+```bash
+SICODE_DICTIONARY_PATH=dictionary.pkl
+SICODE_MODEL_NAME=FacebookAI/xlm-roberta-base
+```
+
+The engine now auto-selects a writable cache path for English corpus downloads when running in restricted environments.
+
 ## 🏗️ System Architecture
 
 This prototype utilizes a **Tiered Decoding Strategy**:
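One of the aggregate metrics the README adds is average Character Error Rate. The repository's own implementation lives in `evaluation.py` and is not shown here; as an illustration only, assuming the standard definition (edit distance normalized by reference length), CER can be sketched as:

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def cer(reference: str, prediction: str) -> float:
    """Character Error Rate: edit distance over reference length."""
    if not reference:
        return 0.0 if not prediction else 1.0
    return levenshtein(reference, prediction) / len(reference)
```

A CER of 0.0 means an exact character-level match; values above 1.0 are possible when the prediction is much longer than the reference.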
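The Spaces environment variables documented above can be read with plain `os.environ` lookups. The snippet below is a hypothetical sketch that mirrors the documented names and defaults; the actual engine code may resolve these paths differently:

```python
import os

# Names and defaults match the README; the real engine may resolve
# these differently (e.g. relative to the app root on Spaces).
DICTIONARY_PATH = os.environ.get("SICODE_DICTIONARY_PATH", "dictionary.pkl")
MODEL_NAME = os.environ.get("SICODE_MODEL_NAME", "FacebookAI/xlm-roberta-base")
# Optional: writable cache location for the english_20k.txt download.
ENGLISH_CACHE = os.environ.get("SICODE_ENGLISH_CACHE")  # None if unset
```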