A Chinese-to-English translation model trained with GRPO using MTQE (machine translation quality estimation) rewards. It performs well on idiom translation and non-idiomatic translation, and generalizes to other languages as well.

```python
import torch
from vllm import LLM, SamplingParams

sampling_params = SamplingParams(temperature=0.3, max_tokens=512)
llm = LLM(
    "ishikaa/Chinese_llama8b-da",
    tensor_parallel_size=torch.cuda.device_count(),
    gpu_memory_utilization=0.8,
    trust_remote_code=True,
)

idiom = ""  # your Chinese idiom
prompt = f"Concisely translate the idiom {idiom} semantically into English: "

# llm.generate returns a list of RequestOutput objects, one per prompt
outputs = llm.generate(prompt, sampling_params=sampling_params)
print(outputs[0].outputs[0].text)
```
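Since the model also handles non-idiomatic translation, general sentences can be batched through the same `llm.generate` call. The general-translation prompt template below is an assumption mirroring the idiom prompt above, not a template confirmed by the paper:

```python
def build_prompt(text: str, idiom: bool = False) -> str:
    """Build a translation prompt. The general-text template is an assumption,
    patterned after the idiom prompt from the model card."""
    if idiom:
        return f"Concisely translate the idiom {text} semantically into English: "
    return f"Concisely translate the following Chinese text into English: {text}"

sentences = ["今天天气很好。", "他正在学习机器学习。"]
prompts = [build_prompt(s) for s in sentences]

# With the LLM and SamplingParams objects from the snippet above:
# outputs = llm.generate(prompts, sampling_params=sampling_params)
# for out in outputs:
#     print(out.outputs[0].text)
```

vLLM processes the whole prompt list in one batch, so this is preferable to calling `generate` per sentence.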

For more information, read the paper: https://arxiv.org/abs/2601.06307

Citation

```bibtex
@misc{agarwal2026risingtideliftsboats,
      title={A Rising Tide Lifts All Boats: MTQE Rewards for Idioms Improve General Translation Quality},
      author={Ishika Agarwal and Zhenlin He and Dhruva Patil and Dilek Hakkani-Tür},
      year={2026},
      eprint={2601.06307},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2601.06307},
}
```