SRP-base-model-training/eval


aimabai committed on Jul 29, 2025

Commit 0fd78cb (verified) · 1 Parent(s): 9f9a082

Create README.md

Files changed (1)

README.md +109 -0


	@@ -0,0 +1,109 @@

+# Bilingual Translation Evaluation Script (EN → KK)
+This repository provides an evaluation pipeline for English↔Kazakh and Russian↔Kazakh translation models based on the `Gemma3ForCausalLM` architecture from Hugging Face Transformers.
+## 🚀 Overview
+The script:
+- Loads a fine-tuned model and tokenizer
+- Performs inference on a FLORES-style test set (`.jsonl`)
+- Computes BLEU score using NLTK
+- Saves predictions and evaluation results into a JSON file
+## 📁 File Structure
+```
+.
+├── eval_sync_KKEN.py   # Main evaluation script
+├── eval_sync_KKEN_data_en_to_kk.json  # Output file (generated)
+```
+## 📥 Input Format
+The test file should be a `.jsonl` file where each line is a JSON object with the following fields:
+```json
+{
+  "system": "System prompt text",
+  "user": "<src=en><tgt=kk> Some English input",
+  "assistant": "Expected Kazakh translation"
+}
+```
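
As a minimal sketch of how such a file could be read (not necessarily how `eval_sync_KKEN.py` does it), the helper below parses JSONL lines and checks that the three fields shown above are present. The names `REQUIRED_FIELDS` and `parse_test_lines` are hypothetical and do not come from the script.

```python
import json

# Field names taken from the input format above; the constant name is hypothetical.
REQUIRED_FIELDS = ("system", "user", "assistant")


def parse_test_lines(lines):
    """Parse JSONL lines into dicts, skipping blank lines and checking required fields."""
    records = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline at end of file
        record = json.loads(line)
        missing = [field for field in REQUIRED_FIELDS if field not in record]
        if missing:
            raise ValueError(f"line {line_no}: missing fields {missing}")
        records.append(record)
    return records


sample_lines = [
    '{"system": "System prompt text", '
    '"user": "<src=en><tgt=kk> Some English input", '
    '"assistant": "Expected Kazakh translation"}',
    "",
]
records = parse_test_lines(sample_lines)
print(len(records), records[0]["user"])
```

For a real test set, the same function can be fed an open file handle, for example `parse_test_lines(open(TEST_FILE, encoding="utf-8"))`, since iterating a file yields its lines.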
+## 📤 Output
+The script writes its results to a JSON file named, for example, `eval_sync_KKEN_data_en_to_kk.json`. The file contains:
+- Model path
+- Final BLEU score
+- A list of examples with system prompt, cleaned user input, model prediction (hypothesis), and reference translation
+Example output entry:
+```json
+{
+  "model": "/path/to/model",
+  "bleu": 27.53,
+  "examples": [
+    {
+      "system": "Translate this.",
+      "user": "Hello, how are you?",
+      "reference": "Сәлем, қалайсың?",
+      "hypothesis": "Сәлеметсіз бе, жағдайыңыз қалай?"
+    }
+  ]
+}
+```
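
A report with this shape could be assembled along the following lines. This is a sketch under assumptions: `build_report` and `dump_report` are hypothetical helper names, and rounding the score to two decimals is inferred only from the `27.53` in the example above. Passing `ensure_ascii=False` to `json.dumps` keeps the Kazakh text readable instead of escaping it as `\uXXXX` sequences.

```python
import json


def build_report(model_path, bleu, examples):
    """Collect the fields shown in the example output into one dict."""
    return {
        "model": model_path,
        "bleu": round(bleu, 2),  # assumption: two decimals, as in the example
        "examples": examples,
    }


def dump_report(report):
    """Serialize the report; ensure_ascii=False keeps Cyrillic characters as-is."""
    return json.dumps(report, ensure_ascii=False, indent=2)


report = build_report(
    "/path/to/model",
    27.5349,
    [
        {
            "system": "Translate this.",
            "user": "Hello, how are you?",
            "reference": "Сәлем, қалайсың?",
            "hypothesis": "Сәлеметсіз бе, жағдайыңыз қалай?",
        }
    ],
)
text = dump_report(report)
print(text)
```

When writing `text` to disk, open the file with `encoding="utf-8"` so the non-ASCII characters are stored correctly on every platform.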
+## ⚙️ Configuration
+Modify these lines at the top of the script as needed:
+```python
+SRC_LANG = "en"
+TGT_LANG = "kk"
+MODEL_PATH = "/path/to/your/model"
+TEST_FILE = "/path/to/test_file.jsonl"
+MAX_NEW_TOKS = 64
+DEVICE = "cuda"  # or "cpu"
+```
+To specify GPU devices:
+```bash
+export CUDA_VISIBLE_DEVICES=2,3,4,5
+```
+## 📦 Requirements
+Install required packages:
+```bash
+pip install transformers torch nltk tqdm
+```
+Also download the NLTK tokenizer data if it is not already installed (newer NLTK releases may ask for `punkt_tab` instead of `punkt`):
+```python
+import nltk
+nltk.download('punkt')
+```
+## ▶️ Run the Script
+```bash
+python eval_sync_KKEN.py
+```
+This will:
+- Load the model
+- Run translation inference
+- Compute BLEU score
+- Save evaluation results to a `.json` file
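
The BLEU step can be illustrated with NLTK's `corpus_bleu`. This is only a sketch: the README does not say whether the script uses corpus-level or sentence-level BLEU, which tokenizer it applies, or whether it smooths the score, so the whitespace tokenization, the `method1` smoothing, and the 0 to 100 scaling below are all assumptions, and `bleu_percent` is a hypothetical name.

```python
from nltk.translate.bleu_score import SmoothingFunction, corpus_bleu


def bleu_percent(references, hypotheses):
    """Corpus-level BLEU on a 0-100 scale with one reference per hypothesis."""
    # corpus_bleu expects a list of reference lists (one list per hypothesis).
    refs = [[ref.split()] for ref in references]
    hyps = [hyp.split() for hyp in hypotheses]
    # Smoothing avoids a zero score when some n-gram order has no matches.
    smooth = SmoothingFunction().method1
    return 100.0 * corpus_bleu(refs, hyps, smoothing_function=smooth)


perfect = bleu_percent(
    ["the cat sat on the mat today"],
    ["the cat sat on the mat today"],
)
partial = bleu_percent(
    ["the cat sat on the mat today"],
    ["a cat sat on a mat"],
)
print(perfect, partial)
```

Whitespace splitting needs no NLTK data downloads. Scores computed this way are not directly comparable to scores from tools that use a different tokenizer.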
+## 📝 Notes
+- Make sure the model and tokenizer directory follows the Hugging Face format (for example, as written by `save_pretrained`).
+- The script uses `<start_of_turn>` and `<end_of_turn>` tokens to structure prompts for inference.
+- Input strings are automatically cleaned of tags like `<src=..><tgt=..>` before generating output.
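
The last two notes can be sketched as follows. The exact prompt layout and cleaning rule used by `eval_sync_KKEN.py` are not shown in this README, so the regular expression, the way the system prompt is folded into the user turn, and the helper names `clean_user` and `build_prompt` are assumptions for illustration.

```python
import re

# Matches language tags such as <src=en> or <tgt=kk> (assumed tag shape).
TAG_RE = re.compile(r"<(?:src|tgt)=[^>]*>")


def clean_user(text):
    """Remove <src=..>/<tgt=..> tags and surrounding whitespace."""
    return TAG_RE.sub("", text).strip()


def build_prompt(system, user):
    """Wrap the input in Gemma-style turn markers, ending with an open model turn."""
    # Assumption: the system prompt is prepended to the user turn.
    content = f"{system}\n\n{user}" if system else user
    return (
        f"<start_of_turn>user\n{content}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


cleaned = clean_user("<src=en><tgt=kk> Some English input")
prompt = build_prompt("Translate this.", cleaned)
print(prompt)
```

Ending the prompt with an open `<start_of_turn>model` turn means the generated continuation is the translation itself, which can then be cut at the first `<end_of_turn>`.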
+## 📧 Contact
+For questions or feedback, please contact [Your Name or GitHub Profile].