DarioVajda committed (verified)
Commit 4054469 · 1 Parent(s): abf51f8

Update README.md

Files changed (1):
  1. README.md +109 -1
README.md CHANGED
@@ -3,6 +3,114 @@ license: gemma

language:
- sl
- en
- hr
- sr
- bs
base_model:
- cjvt/GaMS-9B
pipeline_tag: text-generation
---

# Model Card for GaMS-DPO-Translator

GaMS-DPO-Translator is a fine-tuned version of GaMS-9B-Instruct, obtained by applying Direct Preference Optimization (DPO) to the original model. The training dataset was synthetically generated using GaMS-9B-Instruct and EuroLLM-9B-Instruct.
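
As a rough illustration of this setup, the sketch below shows how DPO could be run on such a preference dataset with the `trl` library. The dataset file name, hyperparameters, and output directory are assumptions for illustration, not the exact configuration used to train this model.

```python
# Minimal DPO fine-tuning sketch (assumed setup, not the exact training script).
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "cjvt/GaMS-9B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Preference data with "prompt", "chosen" and "rejected" columns
# (hypothetical file name; see the Data section for how the pairs were built).
dataset = load_dataset("json", data_files="translation_preferences.jsonl", split="train")

args = DPOConfig(
    output_dir="GaMS-DPO-Translator",
    beta=0.1,                        # illustrative DPO temperature
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
)

# Recent trl versions take the tokenizer via processing_class.
trainer = DPOTrainer(model=model, args=args, train_dataset=dataset, processing_class=tokenizer)
trainer.train()
```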

![image/png](https://cdn-uploads.huggingface.co/production/uploads/652d40a78fa1fbb0aae165bb/94gX0PG8zRB_Zg31K2y_i.png)

## Basic information

- **Developed by:** a team of researchers at the University of Ljubljana, Faculty of Computer and Information Science. Team members: Dario Vajda, Domen Vreš and Marko Robnik-Šikonja.
- **Languages:** Slovene, English (primary), Croatian, Bosnian and Serbian (secondary). The model might also work for other languages supported by Gemma 2, even though it was not continually pretrained on them.
- **Base model:** [cjvt/GaMS-9B-Instruct](https://huggingface.co/cjvt/GaMS-9B-Instruct)
- **License:** [Gemma](https://ai.google.dev/gemma/terms)

## Usage

The model can be run through the `pipeline` API using the following code:

```python
from transformers import pipeline

model_id = "DarioVajda/GaMS-DPO-Translator"

pline = pipeline(
    "text-generation",
    model=model_id,
    device_map="cuda"  # replace with "mps" to run on a Mac device
)

# Example of response generation
message = [{"role": "user", "content": "Prevedi naslednje angleško besedilo v slovenščino.\nToday is a nice day."}]
response = pline(message, max_new_tokens=512)
print("Translation:", response[0]["generated_text"][-1]["content"])
```

For multi-GPU inference, set `device_map` to `auto`:

```python
from transformers import pipeline

model_id = "DarioVajda/GaMS-DPO-Translator"

pline = pipeline(
    "text-generation",
    model=model_id,
    device_map="auto"
)

# Example of response generation
message = [{"role": "user", "content": "Prevedi naslednje angleško besedilo v slovenščino.\nToday is a nice day."}]
response = pline(message, max_new_tokens=512)
print("Model's response:", response[0]["generated_text"][-1]["content"])

# Example of conversation chain
new_message = response[0]["generated_text"]
new_message.append({"role": "user", "content": "Lahko bolj podrobno opišeš ta dogodek?"})
response = pline(new_message, max_new_tokens=1024)
print("Model's response:", response[0]["generated_text"][-1]["content"])
```

## Data

The fine-tuning data was obtained by translating a large corpus of Wikipedia articles with two models (GaMS-9B-Instruct and EuroLLM-9B-Instruct); the resulting translations were then ranked with automatic metrics for translation quality and reliability.
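
A minimal sketch of how such preference pairs could be assembled from the two candidate translations is given below; the prompt template mirrors the Usage example, while `score` stands in for the unspecified automatic ranking metric.

```python
# Hypothetical preference-pair construction (the exact ranking metrics are not specified in the card).
def build_preference_pairs(sources, gams_translations, eurollm_translations, score):
    """Turn two candidate translations per source text into DPO-style triples."""
    pairs = []
    for src, cand_a, cand_b in zip(sources, gams_translations, eurollm_translations):
        prompt = f"Prevedi naslednje angleško besedilo v slovenščino.\n{src}"
        # The higher-scoring candidate becomes "chosen", the other "rejected".
        if score(src, cand_a) >= score(src, cand_b):
            chosen, rejected = cand_a, cand_b
        else:
            chosen, rejected = cand_b, cand_a
        pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs
```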

## Training

The model was trained on the [Vega HPC](https://izum.si/vega_slv/) system.

## Evaluation

The model was evaluated on the Slobench benchmark, and we extended the evaluation to measure additional qualities of the model that we care about.

### Slobench evaluation

| Model | BERT score | BLEU (avg) | METEOR (avg) | CHRF (avg) | BLEU (corpus) | CHRF (corpus) |
|--------------------------------|-----------:|-----------:|-------------:|-----------:|--------------:|--------------:|
| EuroLLM-9B-Instruct | 0.8741 | 0.2927 | 0.5792 | 0.6055 | 0.3273 | 0.6055 |
| GaMS-27B-Instruct | 0.8734 | 0.2866 | 0.5688 | 0.5986 | 0.3246 | 0.5986 |
| **GaMS-9B-DPO-Translator** | **0.8726** | **0.2810** | **0.5663** | **0.5967** | **0.3252** | **0.5967** |
| GaMS-9B-Instruct | 0.8713 | 0.2773 | 0.5616 | 0.5928 | 0.3209 | 0.5928 |
| GPT 4o-mini | 0.8690 | 0.2619 | 0.5456 | 0.5839 | 0.3021 | 0.5839 |
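
For reference, corpus-level BLEU and CHRF scores of the kind reported above can be computed with `sacrebleu`; this is a generic sketch with placeholder sentences, not the Slobench evaluation script.

```python
import sacrebleu

# Placeholder data: system outputs and one reference per segment.
hypotheses = ["Danes je lep dan.", "Mačka sedi na preprogi."]
references = [["Danes je lep dan.", "Mačka sedi na preprogi."]]  # one list per reference set

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)
print(f"BLEU (corpus): {bleu.score:.2f}, CHRF (corpus): {chrf.score:.2f}")
```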

### Wikipedia evaluation

This evaluation was performed on data that was not seen during training. We checked how often each model makes a fatal error (producing the wrong output language or truncating the output) and then compared COMET scores.
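
A rough sketch of how such fatal errors could be flagged automatically is given below; the language-identification tool and the truncation heuristic (a simple length ratio) are illustrative assumptions, not the exact checks used in this evaluation.

```python
# Hypothetical fatal-error checks for a single source/translation pair.
from langdetect import detect  # assumed language-identification tool

def fatal_errors(source: str, translation: str) -> dict:
    # Language error: the output is not in the expected target language (Slovene).
    language_error = detect(translation) != "sl"
    # Truncation error: the output is suspiciously short relative to the source
    # (the 0.5 ratio is an illustrative threshold).
    truncation_error = len(translation) < 0.5 * len(source)
    return {"language_error": language_error, "truncation_error": truncation_error}
```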

Error rates:

| Model | Language Error | Truncation Error | Combined |
|-----------------|---------------:|-----------------:|---------:|
| EuroLLM | 1% | 0.4% | 1.4% |
| GaMS | 9.5% | 3.5% | 13% |
| **GaMS-DPO** | **0.6%** | **0.2%** | **0.8%** |

COMET scoring results:

| Model | Average COMET score |
|-------------------------------|--------------------:|
| EuroLLM-9B-Instruct | 0.755 |
| GaMS-9B-Instruct | 0.736 |
| **GaMS-9B-DPO-Translator** | **0.771** |
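
COMET scores like those above can be computed along the following lines with the `unbabel-comet` package; the card does not state which COMET checkpoint was used, so the `Unbabel/wmt22-comet-da` model and the example segment are assumptions.

```python
# Hedged sketch of COMET scoring with the unbabel-comet package.
from comet import download_model, load_from_checkpoint

checkpoint = download_model("Unbabel/wmt22-comet-da")  # assumed checkpoint
comet_model = load_from_checkpoint(checkpoint)

data = [
    {
        "src": "Today is a nice day.",  # English source
        "mt": "Danes je lep dan.",      # system translation
        "ref": "Danes je lep dan.",     # reference translation
    }
]
output = comet_model.predict(data, batch_size=8, gpus=1)
print("Average COMET score:", output.system_score)
```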