# Bilingual Translation Evaluation Script (EN → KK)

This repository provides an evaluation pipeline for translation models built on the `Gemma3ForCausalLM` architecture from Hugging Face Transformers. It covers English-to-Kazakh and Russian-to-Kazakh translation, as well as the reverse directions.

## 🚀 Overview

The script:
- Loads a fine-tuned model and its tokenizer
- Runs inference on a FLORES-style test set (`.jsonl`)
- Computes the BLEU score using NLTK
- Saves the predictions and evaluation results to a JSON file
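
The data flow around the model can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the record key `"kk"`, the whitespace tokenization, the output layout, and all helper names are assumptions made for the example. The only NLTK calls used are `corpus_bleu` and `SmoothingFunction` from `nltk.translate.bleu_score`.

```python
import json


def load_jsonl(path):
    """Read one JSON object per non-empty line of a .jsonl file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def bleu_inputs(records, predictions, tgt_key="kk"):
    """Shape data the way NLTK expects it.

    Assumption: each record stores its reference translation under
    `tgt_key`, and plain whitespace splitting is good enough as a tokenizer.
    """
    # corpus_bleu takes a list of reference lists (one list per hypothesis).
    references = [[rec[tgt_key].split()] for rec in records]
    hypotheses = [pred.split() for pred in predictions]
    return references, hypotheses


def corpus_bleu_score(references, hypotheses):
    """Corpus-level BLEU with smoothing, so short outputs do not score 0."""
    # Imported lazily so the helpers above work without NLTK installed.
    from nltk.translate.bleu_score import SmoothingFunction, corpus_bleu

    return corpus_bleu(
        references,
        hypotheses,
        smoothing_function=SmoothingFunction().method1,
    )


def save_results(path, predictions, bleu):
    """Write the score and predictions to one JSON file (hypothetical layout)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(
            {"bleu": bleu, "predictions": predictions},
            f,
            ensure_ascii=False,  # keep Kazakh text readable in the file
            indent=2,
        )
```

Given a list of model outputs `predictions`, the pieces would be chained as `refs, hyps = bleu_inputs(load_jsonl(TEST_FILE), predictions)` followed by `save_results(OUTPUT_JSON, predictions, corpus_bleu_score(refs, hyps))`.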

## ⚙️ Configuration

Modify these lines at the top of the script as needed:

```python
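SRC_LANG = "en"  # source language code
TGT_LANG = "kk"  # target language code
MODEL_PATH = "/path/to/your/model"  # fine-tuned model directory
TEST_FILE = "/path/to/test_file.jsonl"  # FLORES-style test set
OUTPUT_JSON = "/path/to/output_file.jsonl"  # predictions and BLEU results
MAX_NEW_TOKS = 64  # maximum number of tokens generated per translation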
DEVICE = "cuda"  # or "cpu"
```
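
As a rough illustration of how settings like these typically feed into generation, here is a hedged sketch. It is not taken from the script: the prompt template, the `LANG_NAMES` table, and the `build_prompt` and `translate` helpers are hypothetical, and `AutoModelForCausalLM` is used as a stand-in loader where the script itself may instantiate `Gemma3ForCausalLM` directly.

```python
SRC_LANG = "en"
TGT_LANG = "kk"
MAX_NEW_TOKS = 64

# Hypothetical display names for the language codes used in this README.
LANG_NAMES = {"en": "English", "kk": "Kazakh", "ru": "Russian"}


def build_prompt(text, src_lang=SRC_LANG, tgt_lang=TGT_LANG):
    """Hypothetical template; use whatever format the model was fine-tuned on."""
    return (
        f"Translate from {LANG_NAMES[src_lang]} to {LANG_NAMES[tgt_lang]}:\n"
        f"{text}\nTranslation:"
    )


def translate(text, model_path, device="cuda"):
    """Greedy-decode one translation (reloads the model on every call)."""
    # Heavy dependencies are imported lazily so build_prompt works alone.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Fall back to CPU when no GPU is visible to this process.
    if device == "cuda" and not torch.cuda.is_available():
        device = "cpu"

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path).to(device).eval()

    inputs = tokenizer(build_prompt(text), return_tensors="pt").to(device)
    with torch.no_grad():
        output = model.generate(
            **inputs, max_new_tokens=MAX_NEW_TOKS, do_sample=False
        )

    # Drop the prompt tokens and keep only the newly generated continuation.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

In a real evaluation loop the tokenizer and model would be loaded once and reused for every test sentence; loading inside `translate` just keeps the sketch short.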

To restrict the script to specific GPUs, set `CUDA_VISIBLE_DEVICES` before running it:
```bash
export CUDA_VISIBLE_DEVICES=2,3,4,5
```
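
The same restriction can be applied from inside Python, provided it happens before any CUDA library is initialized. The sketch below assumes it sits at the very top of the script, ahead of `import torch`; `visible_gpu_ids` is a hypothetical helper added only for illustration.

```python
import os

# Must run before torch (or any other CUDA library) is imported;
# changing the variable after CUDA has been initialized has no effect.
# setdefault keeps a value already exported in the shell.
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "2,3,4,5")


def visible_gpu_ids():
    """Return the physical GPU ids this process is allowed to use."""
    raw = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    return [part.strip() for part in raw.split(",") if part.strip()]
```

Inside the process the visible GPUs are renumbered from zero, so with the setting above `cuda:0` refers to physical GPU 2.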

## ▶️ Run the Script

```bash
python eval_blue.py
```