Update README: evaluation guide, HF Spaces env vars; exclude AI tooling from .gitignore
- .gitignore +4 -0
- README.md +54 -0
.gitignore CHANGED

```diff
@@ -7,6 +7,10 @@ misc/
 *.pkl
 !dictionary.pkl
 
+# AI tooling (Copilot/Claude internal)
+.claude/
+SKILL.md
+
 # Training artifacts
 train_mlm.py
 train_log.txt
```
README.md CHANGED

````diff
@@ -31,6 +31,60 @@ Welcome to the interim prototype of **SinCode**, a final-year research project d
 3. View the **Result**.
 4. (Optional) Expand the **"See How It Works"** section to view the real-time scoring logic used by the system.
 
+## 📏 Baseline Evaluation (New)
+
+Use the evaluation script to measure current model quality before making tuning changes.
+
+### 1) Prepare dataset
+
+Create a CSV file with columns:
+
+- `input` (Singlish / code-mixed input)
+- `reference` (expected Sinhala output)
+
+You can start from `eval_dataset_template.csv`.
+
+### 2) Run evaluation
+
+```bash
+python evaluation.py --dataset eval_dataset_template.csv
+```
+
+Optional:
+
+```bash
+python evaluation.py --dataset your_dataset.csv --beam-width 5 --predictions-out eval_predictions.csv --diagnostics-out eval_diagnostics.json
+```
+
+### 3) Outputs
+
+- `eval_predictions.csv`: per-sample prediction + metrics
+- `eval_diagnostics.json`: per-word candidate scoring breakdown for error analysis
+
+Reported aggregate metrics:
+
+- Exact match
+- Average Character Error Rate (CER)
+- Average token accuracy
+- Average English code-mix preservation
+
+## 🤗 Hugging Face Spaces Notes
+
+This project is compatible with Spaces. You can configure runtime paths with environment variables:
+
+- `SICODE_DICTIONARY_PATH` (default: `dictionary.pkl`)
+- `SICODE_MODEL_NAME` (default: `FacebookAI/xlm-roberta-base`)
+- `SICODE_ENGLISH_CACHE` (optional path for `english_20k.txt` cache)
+
+Example:
+
+```bash
+SICODE_DICTIONARY_PATH=dictionary.pkl
+SICODE_MODEL_NAME=FacebookAI/xlm-roberta-base
+```
+
+The engine now auto-selects a writable cache path for English corpus downloads when running in restricted environments.
+
 ## 🏗️ System Architecture
 
 This prototype utilizes a **Tiered Decoding Strategy**:
````
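The commit names the aggregate metrics but `evaluation.py` itself is not part of this diff, so the exact formulas are not visible here. A minimal sketch under the common definitions (CER = Levenshtein distance over reference length, token accuracy = fraction of position-wise matching tokens) — the function names are illustrative, not taken from the project:

```python
# Illustrative sketch of the metrics named in the README; the real
# formulas in evaluation.py may differ (these are common definitions).

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def cer(prediction: str, reference: str) -> float:
    """Character Error Rate: edits needed over reference length."""
    return edit_distance(prediction, reference) / max(len(reference), 1)

def token_accuracy(prediction: str, reference: str) -> float:
    """Fraction of reference tokens matched position-wise."""
    pred, ref = prediction.split(), reference.split()
    if not ref:
        return 1.0
    return sum(p == r for p, r in zip(pred, ref)) / len(ref)

print(cer("kitten", "sitting"))          # 3 edits over 7 reference chars
print(token_accuracy("a b c", "a x c"))  # 2 of 3 tokens match
```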
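The Spaces section documents three `SICODE_*` environment variables and their defaults. A plausible sketch of how an engine might read them — the helper name and the writable-cache fallback are assumptions for illustration, not the project's actual code:

```python
import os
import tempfile

# Illustrative only: reads the SICODE_* variables documented in the
# README, falling back to the documented defaults. The function name
# and tempdir fallback are assumptions, not SinCode's real logic.

def load_runtime_config() -> dict:
    cache = os.environ.get("SICODE_ENGLISH_CACHE")
    if cache is None:
        # Mirrors the README's note about auto-selecting a writable
        # cache path in restricted environments (e.g. Spaces).
        cache = os.path.join(tempfile.gettempdir(), "english_20k.txt")
    return {
        "dictionary_path": os.environ.get("SICODE_DICTIONARY_PATH",
                                          "dictionary.pkl"),
        "model_name": os.environ.get("SICODE_MODEL_NAME",
                                     "FacebookAI/xlm-roberta-base"),
        "english_cache": cache,
    }

print(load_runtime_config())
```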
|