# Code Explanation: translation.js

This file demonstrates how to use **system prompts** to specialize an AI agent for a specific task: in this case, professional German translation.

## Step-by-Step Code Breakdown

### 1. Import Required Modules

```javascript
import {
    getLlama, LlamaChatSession,
} from "node-llama-cpp";
import {fileURLToPath} from "url";
import path from "path";
```

- The imports are the same as in the intro example
### 2. Initialize and Load Model

```javascript
const __dirname = path.dirname(fileURLToPath(import.meta.url));
const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(
        __dirname,
        "../",
        "models",
        "hf_giladgd_Apertus-8B-Instruct-2509.Q6_K.gguf"
    )
});
```
#### Why Apertus-8B?

Apertus-8B is a multilingual language model specifically trained to support well over 1,000 languages, with 40% of its training data in non-English languages. This makes it an excellent choice for translation tasks because:

1. **Massive Multilingual Coverage**: The model was trained on 15 trillion tokens across 1,811 natively supported languages, including underrepresented languages like Swiss German and Romansh
2. **Larger Size**: With 8 billion parameters, it is larger than the model used in the intro.js example, providing better understanding and output quality
3. **Translation-Focused Training**: The model was explicitly designed for applications including translation systems
4. **Q6_K Quantization**: 6-bit quantization provides a good balance between quality and file size

**Experiment suggestion**: Try swapping this model with others to compare translation quality! For example:

- Use a smaller 3B model to see how size affects translation accuracy
- Use a monolingual model to demonstrate why multilingual training matters
- Use a general-purpose model without translation-specific training

Read more about Apertus in the [arXiv paper](https://arxiv.org/abs/2509.14233).
### 3. Create Context and Chat Session with System Prompt

```javascript
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence(),
    systemPrompt: `Du bist ein erfahrener wissenschaftlicher Übersetzer...`
});
```
**Key difference from intro.js**: the **systemPrompt**!

#### What is a System Prompt?

The system prompt defines the agent's role, behavior, and rules. It's like giving the AI a job description:

```
┌───────────────────────────────────────┐
│             System Prompt             │
│ "You are a professional translator"   │
│ + Detailed instructions               │
│ + Rules to follow                     │
└───────────────────────────────────────┘
                  ↓
          Affects every response
```
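To make the "job description" idea concrete, the role-plus-rules structure can be sketched as a small helper. Note that `buildSystemPrompt` is a hypothetical name used for illustration here; it is not part of node-llama-cpp or translation.js:

```javascript
// Hypothetical helper: assemble a system prompt from a role and a list of rules.
function buildSystemPrompt(role, rules) {
    return [
        role,
        "Follow these rules:",
        ...rules.map((rule, i) => `${i + 1}. ${rule}`)
    ].join("\n");
}

const systemPrompt = buildSystemPrompt(
    "You are a professional translator for technical texts.",
    ["Preserve every technical statement exactly", "Use idiomatic, fluent German"]
);
// The resulting string could then be passed as the `systemPrompt` option
// of `new LlamaChatSession({...})`.
```

Building the prompt from data like this makes it easy to add, remove, or reorder rules when experimenting.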
### 4. The System Prompt Breakdown

The system prompt (in German) tells the model:

**Role:**

```
"Du bist ein erfahrener wissenschaftlicher Übersetzer für technische Texte
aus dem Englischen ins Deutsche."
```

Translation: "You are an experienced scientific translator for technical texts from English into German."

**Task:**

```
"Deine Aufgabe: Erstelle eine inhaltlich exakte Übersetzung..."
```

Translation: "Your task: Create a content-accurate translation that maintains full meaning and technical precision."
**Rules (Lines 33-41):**

1. Preserve every technical statement exactly
2. Use idiomatic, fluent German
3. Avoid literal sentence structures
4. Use correct terminology (e.g., "Multi-Agenten-System")
5. Use German typography for numbers (e.g., "54 %")
6. Adapt compound terms to German grammar
7. Shorten overly complex sentences while preserving meaning
8. Use a neutral, scientific style
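Rule 5 (German number typography) is easy to check mechanically. Here is a minimal sketch of the convention, assuming only that a space belongs between a number and the percent sign; `formatGermanPercent` is a hypothetical helper, not part of translation.js:

```javascript
// Hypothetical helper: insert the space German typography expects
// between a number and the percent sign ("54%" -> "54 %").
function formatGermanPercent(text) {
    return text.replace(/(\d)%/g, "$1 %");
}

console.log(formatGermanPercent("an accuracy of 54%"));
// -> "an accuracy of 54 %"
```

A check like this could also be used to verify that the model actually followed the rule.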
**Critical Instruction (Line 48):**

```
"DO NOT add any addition text or explanation. ONLY respond with the translated text"
```

- Forces the model to return ONLY the translation
- No "Here's the translation:" prefix
- No explanations or commentary
### 5. The Translation Query

```javascript
const q1 = `Translate this text into german:
We address the long-horizon gap in large language model (LLM) agents by en-
abling them to sustain coherent strategies in adversarial, stochastic environments.
...
`;
```

- Contains a scientific abstract about LLM agents (the HexMachina paper)
- Complex technical content with specialized terms
- Tests the model's ability to:
  - Understand technical AI/ML concepts
  - Translate accurately
  - Follow the detailed system prompt rules
### 6. Execute Translation

```javascript
const a1 = await session.prompt(q1);
console.log("AI: " + a1);
```

- Sends the translation request to the model
- The model will:
  1. Read the system prompt (its "role")
  2. Read the user's request
  3. Apply all the rules from the system prompt
  4. Generate a German translation
### 7. Cleanup

```javascript
session.dispose();
context.dispose();
model.dispose();
llama.dispose();
```

- Same cleanup as in intro.js
- Always dispose resources when done
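Note that disposal mirrors creation in reverse: the session depends on the context, the context on the model, and the model on the llama instance. A minimal sketch of that ordering, using plain stand-in objects rather than the real node-llama-cpp instances:

```javascript
// Stand-in objects to illustrate reverse-order disposal; in translation.js
// these are the real llama, model, context, and session instances.
const disposed = [];
const created = ["llama", "model", "context", "session"].map((name) => ({
    name,
    dispose: () => disposed.push(name)
}));

// Dispose in the reverse order of creation: session, context, model, llama.
for (const resource of [...created].reverse()) {
    resource.dispose();
}

console.log(disposed); // -> ["session", "context", "model", "llama"]
```

Releasing dependents before the resources they depend on avoids using freed native memory.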
## Key Concepts Demonstrated

### 1. System Prompts for Specialization

System prompts transform a general-purpose LLM into a specialized agent:

```
General LLM + System Prompt = Specialized Agent
                              (Translator, Coder, Analyst, etc.)
```
### 2. Detailed Instructions Matter

Compare these approaches:

**❌ Minimal approach:**

```javascript
systemPrompt: "Translate to German"
```

**✅ This example (detailed):**

```javascript
systemPrompt: `
You are a professional translator
Follow these rules:
- Rule 1
- Rule 2
- Rule 3
...
`
```

The detailed approach gives much better, more consistent results.
### 3. Constraining Output Format

The line "DO NOT add any addition text" demonstrates output control:

**Without the constraint:**

```
AI: Here's the translation of the text you provided:
[German text]
I hope this helps! Let me know if you need anything else.
```

**With the constraint:**

```
AI: [German text only]
```
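If a model occasionally ignores the constraint anyway, a defensive post-processing step can strip the conversational preamble. This is a hypothetical guard sketched for illustration, not something translation.js does:

```javascript
// Hypothetical guard: remove a leading "Here's the translation..." style
// preamble if the model adds one despite the system prompt.
function stripPreamble(reply) {
    return reply.replace(/^here'?s the translation[^\n:]*:\s*/i, "").trim();
}

console.log(stripPreamble("Here's the translation:\nHallo Welt"));
// -> "Hallo Welt"
```

Combining a prompt-level constraint with a cheap code-level check like this makes the output format far more reliable.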
## What Makes This an "Agent"?

This is a **specialized agent** because:

1. **Specific Role**: Has a defined purpose (translation)
2. **Constrained Behavior**: Follows specific rules and guidelines
3. **Consistent Output**: Produces predictable, formatted results
4. **Domain Expertise**: Optimized for scientific/technical content

## Expected Output

When run, you'll see a German translation of the English abstract that follows all the rules:

- Proper German scientific style
- Correct technical terminology
- German number formatting (e.g., "54 %")
- No extra commentary

The quality depends on the model's training and size.
## Experimentation Ideas

1. **Try different models**:
   - Swap Apertus-8B with a smaller model (3B) to see the impact of size
   - Try a monolingual English model to demonstrate the importance of multilingual training
   - Use models with different quantization levels (Q4, Q6, Q8) to compare quality vs. size
2. **Modify the system prompt**:
   - Remove specific rules one by one to see their impact
   - Change the translation target language
   - Adjust the style (formal vs. casual)
3. **Test with different content**:
   - Technical documentation
   - Creative writing
   - Business communications
   - Simple vs. complex sentences

Each experiment will help you understand how system prompts, model selection, and prompt engineering work together to create effective AI agents.
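The target-language experiment is easiest with a parameterized prompt template. A sketch using a template literal; `makeTranslatorPrompt` is a hypothetical name, and the English wording here paraphrases the German system prompt rather than reproducing it:

```javascript
// Hypothetical factory: build a translator system prompt for any target language.
function makeTranslatorPrompt(targetLanguage) {
    return `You are an experienced scientific translator for technical texts ` +
        `from English into ${targetLanguage}. ` +
        `DO NOT add any additional text or explanation. ` +
        `ONLY respond with the translated text.`;
}

console.log(makeTranslatorPrompt("French"));
```

The returned string could then be passed as the `systemPrompt` option when creating the `LlamaChatSession`.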