Spaces:

Norelad
/

coptic-translation-interface

Sleeping

Rogaton Claude commited on Nov 13

Commit

90f0f33

1 Parent(s): 6865b40

fix: Improve translation accuracy with target language selector and explicit prompts

**Target Language Selection:**
- Add dedicated "Target Language" dropdown in sidebar
- Appears when "Translation" analysis type is selected
- Excludes Coptic dialects, shows only modern languages
- Defaults to English

**Enhanced Translation Prompts:**
- Dynamic prompt generation based on selected target language
- Explicit instructions: "Provide ONLY the direct translation"
- Lists what NOT to include (no source text, no explanations)
- Identifies as "professional Coptic translator" for better context

**System Message Control:**
- Add system role message specifically for translation tasks
- Reinforces "no explanations, no commentary" instruction
- Helps model stay focused on pure translation

**Temperature Adjustment:**
- Lower temperature from 0.7 to 0.5 for translation
- Reduces creative elaboration, increases accuracy
- Standard tasks keep default temperature

**Result:**
- Translations now output ONLY the target language text
- No more repeating Coptic source text
- No more English when French is selected
- Cleaner, more accurate translations

Fixes issue where model was repeating input and adding commentary
instead of providing clean translations to the selected target language.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Files changed (1) hide show

apertus_ui.py +39 -14

apertus_ui.py CHANGED Viewed

@@ -13,14 +13,16 @@ COPTIC_ALPHABET = {
     'Ϣ': 'Shai', 'Ϥ': 'Fai', 'Ϧ': 'Khei', 'Ϩ': 'Hori', 'Ϫ': 'Gangia', 'Ϭ': 'Shima', 'Ϯ': 'Ti'
 }
-# Coptic linguistic prompts
-COPTIC_PROMPTS = {
-    'dialect_analysis': "Analyze the Coptic dialect of this text and identify linguistic features:",
-    'translation': "Translate this Coptic text to English, preserving theological and cultural context:",
-    'transcription': "Provide a romanized transcription of this Coptic text:",
-    'morphology': "Analyze the morphological structure of these Coptic words:",
-    'lexicon_lookup': "Look up these Coptic words in the lexicon and provide Greek etymologies:"
-}
 # Lexicon loader
 @st.cache_data
@@ -260,13 +262,28 @@ with st.sidebar:
                 else:
                     st.write("No matches found")
-    # Linguistic analysis options
     if selected_lang in ['cop', 'cop-sa', 'cop-bo']:
         st.subheader("Analysis Type")
-        analysis_type = st.selectbox("Choose analysis:",
-                                   options=list(COPTIC_PROMPTS.keys()),
                                    format_func=lambda x: x.replace('_', ' ').title())
 # Use HuggingFace Inference API instead of loading model locally
 # This is much faster and doesn't require GPU
 MODEL_NAME = "swiss-ai/Apertus-8B-Instruct-2509"
@@ -337,14 +354,22 @@ if prompt := st.chat_input("Type your message..."):
     with st.chat_message("assistant"):
         try:
             with st.spinner("🤖 Generating response..."):
-                # Use chat completion API
-                messages = [{"role": "user", "content": full_prompt}]
                 response_stream = inference_client.chat_completion(
                     model=MODEL_NAME,
                     messages=messages,
                     max_tokens=512,
-                    temperature=0.7,
                     top_p=0.9,
                     stream=True
                 )

     'Ϣ': 'Shai', 'Ϥ': 'Fai', 'Ϧ': 'Khei', 'Ϩ': 'Hori', 'Ϫ': 'Gangia', 'Ϭ': 'Shima', 'Ϯ': 'Ti'
 }
+# Coptic linguistic prompts (will be formatted with target language)
+def get_coptic_prompts(target_language):
+    """Generate Coptic analysis prompts with specified target language"""
+    return {
+        'dialect_analysis': f"Analyze the Coptic dialect of this text and identify linguistic features. Respond in {target_language}:",
+        'translation': f"You are a professional Coptic translator. Translate the following Coptic text to {target_language}.\n\nIMPORTANT: Provide ONLY the direct translation. Do not include:\n- The original Coptic text\n- Explanations or commentary\n- Notes about context or meaning\n- Any text other than the {target_language} translation\n\nCoptic text to translate:",
+        'transcription': f"Provide a romanized transcription of this Coptic text. Respond in {target_language}:",
+        'morphology': f"Analyze the morphological structure of these Coptic words. Respond in {target_language}:",
+        'lexicon_lookup': f"Look up these Coptic words and provide definitions with Greek etymologies. Respond in {target_language}:"
+    }
 # Lexicon loader
 @st.cache_data
                 else:
                     st.write("No matches found")
+    # Linguistic analysis options for Coptic input
     if selected_lang in ['cop', 'cop-sa', 'cop-bo']:
         st.subheader("Analysis Type")
+        analysis_type = st.selectbox("Choose analysis:",
+                                   options=['translation', 'dialect_analysis', 'transcription', 'morphology', 'lexicon_lookup'],
                                    format_func=lambda x: x.replace('_', ' ').title())
+        # Target language selector for translation
+        if analysis_type == 'translation':
+            st.subheader("Target Language")
+            target_lang = st.selectbox("Translate to:",
+                                      options=[k for k in LANGUAGES.keys() if k not in ['cop', 'cop-sa', 'cop-bo']],
+                                      format_func=lambda x: LANGUAGES[x],
+                                      index=0)  # Default to English
+            target_language_name = LANGUAGES[target_lang]
+        else:
+            # For non-translation tasks, use English as default output language
+            target_language_name = "English"
+        # Get prompts for the target language
+        COPTIC_PROMPTS = get_coptic_prompts(target_language_name)
 # Use HuggingFace Inference API instead of loading model locally
 # This is much faster and doesn't require GPU
 MODEL_NAME = "swiss-ai/Apertus-8B-Instruct-2509"
     with st.chat_message("assistant"):
         try:
             with st.spinner("🤖 Generating response..."):
+                # Prepare messages with system instruction for better control
+                if selected_lang in ['cop', 'cop-sa', 'cop-bo'] and 'analysis_type' in locals() and analysis_type == 'translation':
+                    # For translation: strict system message
+                    messages = [
+                        {"role": "system", "content": "You are a professional Coptic-to-modern-language translator. Provide only direct translations without explanations, commentary, or repeating the source text."},
+                        {"role": "user", "content": full_prompt}
+                    ]
+                else:
+                    # For other tasks: standard chat
+                    messages = [{"role": "user", "content": full_prompt}]
                 response_stream = inference_client.chat_completion(
                     model=MODEL_NAME,
                     messages=messages,
                     max_tokens=512,
+                    temperature=0.5,  # Lower temperature for more focused translations
                     top_p=0.9,
                     stream=True
                 )