---
license: gemma
datasets:
- kaitchup/opus100-translategemma-calib
base_model:
- google/translategemma-4b-it
---
This is an FP8-Dynamic quantized variant of **google/translategemma-4b-it**, created by **The Kaitchup** (newsletter: https://kaitchup.substack.com).
More details (training recipe, benchmarks, and recommended settings) will be added later. In the meantime, here are the current notes and a working inference example.
## Status / limitations
- **Quick smoke test only** (not fully evaluated).
- **RoPE parameters were removed** for compatibility with **vLLM**. As a result, **long-context behavior may be degraded**. I have not verified the impact yet.
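- **Chat template not supported (for now).** To use the model in vLLM, call the **completions** endpoint (`/v1/completions`) rather than the chat completions endpoint, and provide a **fully formatted prompt** that already contains the turn markers.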
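
Since the chat template is not applied for you, the prompt has to be assembled by hand. The following is a minimal sketch of a helper that reproduces the prompt string used in the curl example in this README. The function name `build_prompt` and its parameters are illustrative and not part of any library; the wording and turn markers are copied from that example, not derived from the official chat template.

```python
def build_prompt(src_name: str, src_code: str, tgt_name: str, tgt_code: str, text: str) -> str:
    """Return a fully formatted single-turn translation prompt.

    Hypothetical helper: it mirrors the prompt string of the curl example
    in this README, with the languages and the source text as parameters.
    """
    instruction = (
        f"You are a professional {src_name} ({src_code}) to {tgt_name} ({tgt_code}) translator. "
        f"Your goal is to accurately convey the meaning and nuances of the original {src_name} text "
        f"while adhering to {tgt_name} grammar, vocabulary, and cultural sensitivities.\n"
        f"Produce only the {tgt_name} translation, without any additional explanations or commentary. "
        f"Please translate the following {src_name} text into {tgt_name}:\n\n\n"
    )
    # Wrap the instruction and source text in a user turn, then open the model turn.
    return f"<bos><start_of_turn>user\n{instruction}{text}<end_of_turn>\n<start_of_turn>model\n"
```

For example, `build_prompt("French", "fr", "English", "en", "Bonjour")` yields a string that can be passed as the `prompt` field of a completions request.
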
## Serving with vLLM
```
vllm serve kaitchup/translategemma-4b-it-FP8-Dynamic --max-model-len 2048 --chat-template-content-format openai --served-model-name gemma
```
```
curl -s http://localhost:8000/v1/completions -H "Content-Type: application/json" -d '{
"model": "gemma",
"prompt": "<bos><start_of_turn>user\nYou are a professional French (fr) to English (en) translator. Your goal is to accurately convey the meaning and nuances of the original French text while adhering to English grammar, vocabulary, and cultural sensitivities.\nProduce only the English translation, without any additional explanations or commentary. Please translate the following French text into English:\n\n\nJ’aime les pâtes !<end_of_turn>\n<start_of_turn>model\n",
"temperature": 0,
"max_tokens": 200,
"stop": ["<end_of_turn>"]
}'
```
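
The same request can be sent from Python. This is a minimal sketch using only the standard library; it assumes the server started with the `vllm serve` command above is listening on `localhost:8000` and exposes the model under the served name `gemma`. The names `build_payload` and `complete` are illustrative.

```python
import json
import urllib.request

# Assumption: the vLLM server from the command above is running locally.
API_URL = "http://localhost:8000/v1/completions"


def build_payload(prompt: str, max_tokens: int = 200) -> dict:
    """Build the same JSON body as the curl example."""
    return {
        "model": "gemma",
        "prompt": prompt,
        "temperature": 0,
        "max_tokens": max_tokens,
        "stop": ["<end_of_turn>"],
    }


def complete(prompt: str) -> str:
    """POST an already formatted prompt and return the generated text."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # The completions response carries the generated text in choices[0].text.
    return body["choices"][0]["text"].strip()
```

Calling `print(complete(prompt))` with a fully formatted prompt string, such as the one in the curl example, prints the translation returned by the server.
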