---
library_name: transformers
language:
- en
- de
base_model:
- Unbabel/TowerInstruct-7B-v0.2
tags:
- Machine Translation
model-index:
- name: iwslt_mt_ende
  results: []
paper:
  title: >-
    KIT's Offline Speech Translation and Instruction Following Submission
    for IWSLT 2025
  authors: >-
    Koneru, Sai and Züfle, Maike and Nguyen, Thai-Binh and Akti, Seymanur
    and Niehues, Jan and Waibel, Alexander
  url: https://arxiv.org/abs/2505.13036
  published: 2025-05-25T00:00:00.000Z
---
# KIT IWSLT25 Machine Translation Model
This model adapts TowerInstruct-7B-v0.2 for English→German translation. We filter the IWSLT data using quality estimation models and fine-tune on the resulting high-quality data, optimizing for this specific language pair. The adapted model outperforms the base model, especially in the speech domain.
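The filtering step can be sketched as follows. This is an illustrative sketch only: the scoring function, the example scores, and the 0.8 threshold are placeholders, not the QE model or cutoff actually used for this card.

```python
# Illustrative sketch of QE-based filtering: keep only sentence pairs whose
# quality-estimation score clears a threshold. The scores and the 0.8 cutoff
# below are placeholders, not the values used to train this model.
def filter_by_qe(pairs, scores, threshold=0.8):
    """Return the (source, target) pairs whose QE score is >= threshold."""
    return [pair for pair, score in zip(pairs, scores) if score >= threshold]

pairs = [
    ("Welcome to the first lecture", "Willkommen zur ersten Vorlesung"),
    ("Welcome to the first lecture", "Willkommen Vorlesung erste"),  # low quality
]
scores = [0.92, 0.41]  # e.g. produced by a sentence-level QE model

filtered = filter_by_qe(pairs, scores)
print(filtered)  # only the high-quality pair survives
```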
## Model Usage
Usage is the same as for the base model. However, we only evaluated English→German translation; performance on other languages and translation tasks is unknown.
### Model Loading
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Unbabel/TowerInstruct-7B-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.padding_side = "left"  # left-pad so generation continues directly after the prompt
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
model.load_adapter("skoneru/iwslt_mt_ende")
```
### Prompt Format
```
<|im_start|>user
Translate the sentence from English into German.
English:
{src_sentence}
German:<|im_end|>
<|im_start|>assistant
{llm to generate}
```
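The template above can also be assembled programmatically. The helper name `build_prompt` is introduced here purely for illustration; it is not part of this repository:

```python
def build_prompt(src_sentence: str) -> str:
    """Fill the chat template above with an English source sentence."""
    return (
        "<|im_start|>user\n"
        "Translate the sentence from English into German.\n"
        "English:\n"
        f"{src_sentence}\n"
        "German:<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

print(build_prompt("Welcome to the first lecture"))
```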
### Model Inference
After loading the model and the tokenizer, you can use the model with the prompt format as shown below:
```python
src_sent = "Welcome to the first lecture"

# Wrap the source sentence in the chat prompt format expected by the model
prefix = "<|im_start|>user\nTranslate the sentence from English into German.\nEnglish: "
suffix = "\nGerman:<|im_end|>\n<|im_start|>assistant\n"
prompt = [prefix + src_sent + suffix]

inputs = tokenizer(prompt, return_tensors="pt", padding=True, add_special_tokens=False).to(model.device)

# Deterministic beam-search decoding
output = model.generate(
    **inputs,
    num_beams=5,
    max_new_tokens=256,
    return_dict_in_generate=True,
    early_stopping=True,
    do_sample=False,
)

# Strip the prompt tokens and decode only the generated continuation
hyps = tokenizer.batch_decode(output.sequences[:, inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(hyps)
```
## Citation
If you use this model in your research, please cite:
```bibtex
@article{koneru2025kit,
  title={KIT's Offline Speech Translation and Instruction Following Submission for IWSLT 2025},
  author={Koneru, Sai and Z{\"u}fle, Maike and Nguyen, Thai-Binh and Akti, Seymanur and Niehues, Jan and Waibel, Alexander},
  journal={arXiv preprint arXiv:2505.13036},
  year={2025},
  url={https://arxiv.org/abs/2505.13036}
}
```