Paper: TranslateGemma Technical Report (2601.09012)
Quantization options used for the two ONNX variants:

- q4: `--precision int4 --execution_provider webgpu --extra_options int4_accuracy_level=1 int4_block_size=32 int4_algo_config=rtn enable_webgpu_graph=true`
- q4f16: `--precision int4 --execution_provider webgpu --extra_options int4_accuracy_level=2 int4_block_size=64 int4_algo_config=rtn enable_webgpu_graph=true`
Using `int4_accuracy_level=4` or `int4_block_size=128` causes a significant drop in accuracy, sometimes even producing nonsensical output.
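When loading with Transformers.js, the two builds correspond to the `dtype` values `'q4'` and `'q4f16'`. A minimal loading sketch; the model id is a placeholder, not a published repo:

```ts
import { pipeline } from '@huggingface/transformers';

// Choose 'q4' or 'q4f16' to match the quantized builds above.
const generator = await pipeline('text-generation', '<this-repo-id>', {
  device: 'webgpu',
  dtype: 'q4f16', // or 'q4'
});
```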
This repo contains a converted `Gemma3ForCausalLM` checkpoint extracted from the language component of the original multimodal model:

- Source model: google/translategemma-4b-it
- Architecture: `Gemma3ForCausalLM` (text-only)
- Weights: float16
- For usage and license details, see the google/translategemma-4b-it page; usage is similar to google/gemma-3-1b-it
If you run into chat-formatting problems, try skipping `apply_chat_template` and passing the `messages` array to the pipeline directly, e.g. `const result = await generator(messages, {...})`.
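A sketch of that direct call, assuming `generator` and `messages` as defined in the worker code below; the generation options shown are illustrative:

```ts
// Pass the chat messages directly; the pipeline applies the chat
// template internally instead of us calling apply_chat_template.
const result = await generator(messages, {
  max_new_tokens: 1024,
  return_full_text: false,
});
console.log(result);
```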
Example worker code for initializing the pipeline on WebGPU:

```ts
import { pipeline, TextGenerationPipeline, TextStreamer } from '@huggingface/transformers';

const MODEL_ID = '...'; // id of this repo's ONNX conversion

let generator: TextGenerationPipeline | null = null;

// Initialize the text-generation pipeline (runs once, on first request).
async function initialize() {
  try {
    if (!generator) {
      // WebGPU is required for device: 'webgpu' below.
      if (!('gpu' in navigator)) {
        throw new Error("Worker: WebGPU is not supported in this environment.");
      }
      generator = await pipeline('text-generation', MODEL_ID, {
        device: 'webgpu',
        dtype: 'q4f16', // or 'q4'
        // Forward download/load progress to the main thread.
        progress_callback: (info: any) => {
          self.postMessage({
            type: 'progress',
            data: info
          });
        }
      }) as unknown as TextGenerationPipeline;
    }
    self.postMessage({ type: 'ready' });
    console.log('Worker: Model initialized successfully');
  } catch (error: any) {
    console.error('Worker: Initialization failed', error);
    self.postMessage({
      type: 'error',
      error: error.message || 'Unknown error during initialization'
    });
  }
}
```
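The handlers assume a message loop in the worker along these lines; a sketch only, with the message shapes inferred from the `postMessage` calls, and `translate` defined in the next snippet:

```ts
// Dispatch messages from the main thread to the worker's handlers.
// The { type, text } shapes are assumptions, not a fixed protocol.
self.onmessage = async (e: MessageEvent) => {
  const { type, text } = e.data;
  if (type === 'init') await initialize();
  else if (type === 'translate') await translate(text);
};
```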
The translation handler, with streamed output:

```ts
// Translate `text` and stream partial output back to the main thread.
async function translate(text: string) {
  if (!generator) {
    self.postMessage({ type: 'error', error: 'Model is not initialized.' });
    return;
  }
  try {
    console.log('Worker: Start translation...');
    const sourceName = "ja";    // source language code
    const targetName = "zh-tw"; // target language code
    const messages = [
      {
        role: "system",
        content: `You are an expert multilingual translator. Your task is to accurately and fluently translate text from ${sourceName} into ${targetName}. Provide ONLY the translated text, without any additional explanations, commentary, or greetings.`
      },
      {
        role: "user",
        content: `${text}.`
        // For multiple paragraphs or long texts, use instead:
        // content: `Translate the following to ${targetName}:\n\n${text}.`
      }
    ];

    // Render the chat template to a plain prompt string.
    const prompt = generator.tokenizer.apply_chat_template(messages, {
      tokenize: false,
      add_generation_prompt: true,
    }) as string;
    console.log("Check prompt:", prompt);

    // Buffer streamed tokens and flush every few characters or at
    // sentence/line boundaries to reduce postMessage traffic.
    let buffer = "";
    const streamer = new TextStreamer(generator.tokenizer, {
      skip_prompt: true,
      skip_special_tokens: true,
      callback_function: (output: string) => {
        buffer += output;
        if (buffer.length >= 3 || output.includes('。') || output.includes('\n')) {
          self.postMessage({
            type: 'update',
            output: buffer
          });
          buffer = "";
        }
      }
    });

    const result = await generator(prompt, {
      max_new_tokens: 1024,
      temperature: 0.6, // no effect while do_sample is false (greedy decoding)
      repetition_penalty: 1.1,
      do_sample: false,
      return_full_text: false,
      streamer: streamer,
      // Gemma 3 stop tokens: 1 = <eos>, 106 = <end_of_turn>
      //eos_token_id: [1, 106],
      //stop_strings: ["<end_of_turn>", "<eos>"],
    });

    self.postMessage({ type: 'complete', result });
    console.log("Print result: ", result);
    console.log('Worker: Translation complete');
  } catch (error: any) {
    console.error('Worker: Translation failed', error);
    self.postMessage({
      type: 'error',
      error: error.message || 'Error during translation generation'
    });
  }
}
```
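On the main thread, the worker can then be driven like this; a sketch only, where the file name, element id, and sample text are assumptions rather than part of the original:

```ts
// Main thread: spawn the worker, wait for 'ready', then request a translation.
const worker = new Worker(new URL('./translate.worker.ts', import.meta.url), {
  type: 'module',
});
const outputEl = document.getElementById('output')!; // hypothetical target element

worker.onmessage = (e: MessageEvent) => {
  switch (e.data.type) {
    case 'progress':
      console.log('Loading model:', e.data.data);
      break;
    case 'ready':
      worker.postMessage({ type: 'translate', text: '今日はいい天気ですね。' });
      break;
    case 'update':
      outputEl.textContent += e.data.output; // append streamed chunks
      break;
    case 'complete':
      console.log('Translation finished:', e.data.result);
      break;
    case 'error':
      console.error('Worker error:', e.data.error);
      break;
  }
};

worker.postMessage({ type: 'init' });
```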
Base model: google/translategemma-4b-it