haaaaus commited on
Commit
ef0641f
·
verified ·
1 Parent(s): fc778e6

Upload 83 files

Browse files
Files changed (42) hide show
  1. .gitattributes +13 -0
  2. README.md +61 -16
  3. __pycache__/app.cpython-311.pyc +0 -0
  4. __pycache__/font_analyzer.cpython-311.pyc +0 -0
  5. __pycache__/process_bubble.cpython-311.pyc +0 -0
  6. app.py +491 -36
  7. font_analyzer.py +151 -0
  8. fonts/Yuki-Arenzi.ttf +0 -0
  9. fonts/Yuki-Burobu.ttf +3 -0
  10. fonts/Yuki-CCMarianChurchlandJournal.ttf +3 -0
  11. fonts/Yuki-CDX Starstreak.ttf +3 -0
  12. fonts/Yuki-CHICKEN Pie.ttf +3 -0
  13. fonts/Yuki-CrashLanding BB.ttf +0 -0
  14. fonts/Yuki-Downhill Dive.ttf +3 -0
  15. fonts/Yuki-Gingerline DEMO Regular.ttf +0 -0
  16. fonts/Yuki-Gorrilaz_Story.ttf +3 -0
  17. fonts/Yuki-KG Only Angel.ttf +3 -0
  18. fonts/Yuki-LF SwandsHand.ttf +0 -0
  19. fonts/Yuki-La Belle Aurore.ttf +0 -0
  20. fonts/Yuki-Little Cupcakes.ttf +3 -0
  21. fonts/Yuki-Nagurigaki Crayon.ttf +3 -0
  22. fonts/Yuki-Ripsnort BB.ttf +3 -0
  23. fonts/Yuki-Roasthink.ttf +0 -0
  24. fonts/Yuki-Screwball.ttf +0 -0
  25. fonts/Yuki-Shark Crash.ttf +3 -0
  26. fonts/Yuki-Skulduggery.ttf +3 -0
  27. fonts/Yuki-Superscratchy.ttf +0 -0
  28. fonts/Yuki-Tea And Oranges Regular.ttf +3 -0
  29. ocr/__pycache__/chrome_lens_ocr.cpython-311.pyc +0 -0
  30. ocr/chrome_lens_ocr.py +58 -0
  31. process_bubble.py +12 -1
  32. static/css/style.css +50 -0
  33. static/js/app.js +55 -0
  34. templates/index.html +148 -0
  35. templates/translate.html +20 -7
  36. translator/__pycache__/__init__.cpython-311.pyc +0 -0
  37. translator/__pycache__/copilot_translator.cpython-311.pyc +0 -0
  38. translator/__pycache__/gemini_translator.cpython-311.pyc +0 -0
  39. translator/__pycache__/translator.cpython-311.pyc +0 -0
  40. translator/copilot_translator.py +351 -0
  41. translator/gemini_translator.py +136 -50
  42. translator/translator.py +3 -1
.gitattributes CHANGED
@@ -47,3 +47,16 @@ examples/ex3.png filter=lfs diff=lfs merge=lfs -text
47
  fonts/ariali.ttf filter=lfs diff=lfs merge=lfs -text
48
  static/img/loading.gif filter=lfs diff=lfs merge=lfs -text
49
  static/img/back.jpg filter=lfs diff=lfs merge=lfs -text
 
 
 
 
 
 
 
 
 
 
 
 
 
 
47
  fonts/ariali.ttf filter=lfs diff=lfs merge=lfs -text
48
  static/img/loading.gif filter=lfs diff=lfs merge=lfs -text
49
  static/img/back.jpg filter=lfs diff=lfs merge=lfs -text
50
+ fonts/Yuki-Burobu.ttf filter=lfs diff=lfs merge=lfs -text
51
+ fonts/Yuki-CCMarianChurchlandJournal.ttf filter=lfs diff=lfs merge=lfs -text
52
+ fonts/Yuki-CDX[[:space:]]Starstreak.ttf filter=lfs diff=lfs merge=lfs -text
53
+ fonts/Yuki-CHICKEN[[:space:]]Pie.ttf filter=lfs diff=lfs merge=lfs -text
54
+ fonts/Yuki-Downhill[[:space:]]Dive.ttf filter=lfs diff=lfs merge=lfs -text
55
+ fonts/Yuki-Gorrilaz_Story.ttf filter=lfs diff=lfs merge=lfs -text
56
+ fonts/Yuki-KG[[:space:]]Only[[:space:]]Angel.ttf filter=lfs diff=lfs merge=lfs -text
57
+ fonts/Yuki-Little[[:space:]]Cupcakes.ttf filter=lfs diff=lfs merge=lfs -text
58
+ fonts/Yuki-Nagurigaki[[:space:]]Crayon.ttf filter=lfs diff=lfs merge=lfs -text
59
+ fonts/Yuki-Ripsnort[[:space:]]BB.ttf filter=lfs diff=lfs merge=lfs -text
60
+ fonts/Yuki-Shark[[:space:]]Crash.ttf filter=lfs diff=lfs merge=lfs -text
61
+ fonts/Yuki-Skulduggery.ttf filter=lfs diff=lfs merge=lfs -text
62
+ fonts/Yuki-Tea[[:space:]]And[[:space:]]Oranges[[:space:]]Regular.ttf filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -7,24 +7,69 @@ sdk: docker
7
  pinned: false
8
  license: mit
9
  ---
 
10
 
11
- # Manga Translator
12
 
13
- Translate manga/webtoon speech bubbles automatically!
14
 
15
- ## Features
16
- - 🔍 YOLO-based bubble detection
17
- - 📝 Multiple OCR engines (Manga-OCR, Chrome Lens)
18
- - 🌐 Multiple translators (Google, Gemini, Bing, Baidu, NLLB)
19
- - 📏 Smart handling for long webtoon images
20
- - 🎨 Custom fonts support
21
 
22
- ## Usage
23
- 1. Upload manga/webtoon images
24
- 2. Select source and target languages
25
- 3. Choose translator and OCR engine
26
- 4. Click Translate!
27
 
28
- ## Supported Languages
29
- - Source: Japanese, Chinese, Korean, English
30
- - Target: Vietnamese, English, Chinese, Korean, Thai, Indonesian, French, German, Spanish, Russian
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
7
  pinned: false
8
  license: mit
9
  ---
10
+ # Manga Translator 📚
11
 
12
+ Dịch tự động speech bubbles trong manga/manhwa/manhua!
13
 
14
+ ## Features
15
 
16
+ ### Core
17
+ - 🔍 **YOLO-based bubble detection** - Phát hiện speech bubble tự động
18
+ - 📝 **Multiple OCR engines** - Manga-OCR, Chrome Lens (batch support)
19
+ - 🌐 **Multiple translators** - Gemini, Copilot API, NLLB, Opus-MT
 
 
20
 
21
+ ### Translation
22
+ - 🧠 **Context Memory** - Sử dụng context từ tất cả ảnh để dịch chính xác hơn
23
+ - 🎯 **Multi-page batch translation** - Dịch 10 pages/API call tiết kiệm quota
24
+ - 🎨 **Translation styles** - Default, Casual, Formal, Keep Honorifics, Web Novel...
 
25
 
26
+ ### UI/UX
27
+ - 📊 **Real-time progress** - Progress bar hiển thị tiến độ theo từng phase
28
+ - 📦 **Download ZIP** - Tải tất cả ảnh đã dịch dưới dạng ZIP
29
+ - 🔤 **Auto font sizing** - Tự động điều chỉnh cỡ chữ theo bubble
30
+ - 📏 **24+ fonts** - Yuki fonts, AnimeAce, và nhiều font khác
31
+
32
+ ## 🚀 Usage
33
+
34
+ ```bash
35
+ # Install dependencies
36
+ pip install -r requirements.txt
37
+
38
+ # Run
39
+ python app.py
40
+ ```
41
+
42
+ Mở http://localhost:5000
43
+
44
+ ## 📋 Workflow
45
+
46
+ 1. Upload manga/manhwa images
47
+ 2. Chọn ngôn ngữ gốc (Japanese/Chinese/Korean/English)
48
+ 3. Chọn ngôn ngữ đích (Vietnamese, English, ...)
49
+ 4. Chọn translator (Gemini/Copilot) và OCR engine
50
+ 5. Check "Context Memory" để dịch chính xác hơn
51
+ 6. Click **Translate**!
52
+ 7. Xem progress bar real-time
53
+ 8. Download từng ảnh hoặc **Download ZIP**
54
+
55
+ ## 🌍 Supported Languages
56
+
57
+ | Source | Target |
58
+ |--------|--------|
59
+ | Japanese (Manga) | Vietnamese |
60
+ | Chinese (Manhua) | English |
61
+ | Korean (Manhwa) | Chinese |
62
+ | English (Comic) | Korean, Thai, Indonesian, French, German, Spanish, Russian |
63
+
64
+ ## 📡 API Keys
65
+
66
+ - **Gemini**: Nhập API key từ [ai.google.dev](https://ai.google.dev)
67
+ - **Copilot**: Chạy server [copilot-api](https://github.com/copilot-api) local
68
+
69
+ ## 🔧 Tech Stack
70
+
71
+ - Flask + Flask-SocketIO (real-time WebSocket)
72
+ - YOLOv8 (bubble detection)
73
+ - Manga-OCR / Chrome-Lens (OCR)
74
+ - Gemini / Copilot API (translation)
75
+ - PIL (text rendering)
__pycache__/app.cpython-311.pyc CHANGED
Binary files a/__pycache__/app.cpython-311.pyc and b/__pycache__/app.cpython-311.pyc differ
 
__pycache__/font_analyzer.cpython-311.pyc ADDED
Binary file (9.07 kB). View file
 
__pycache__/process_bubble.cpython-311.pyc CHANGED
Binary files a/__pycache__/process_bubble.cpython-311.pyc and b/__pycache__/process_bubble.cpython-311.pyc differ
 
app.py CHANGED
@@ -1,4 +1,8 @@
1
- from flask import Flask, render_template, request, redirect
 
 
 
 
2
  from detect_bubbles import detect_bubbles
3
  from process_bubble import process_bubble
4
  from translator.translator import MangaTranslator
@@ -14,7 +18,10 @@ import os
14
 
15
  app = Flask(__name__)
16
  app.config["SECRET_KEY"] = os.environ.get("SECRET_KEY", "secret_key")
17
- app.config["MAX_CONTENT_LENGTH"] = 50 * 1024 * 1024 # 50MB max upload
 
 
 
18
 
19
  MODEL_PATH = "model/model.pt"
20
 
@@ -24,10 +31,11 @@ def home():
24
  return render_template("index.html")
25
 
26
 
27
- def process_single_image(image, manga_translator, mocr, selected_translator, selected_font):
28
  """Process a single image and return the translated version.
29
 
30
  Optimized with batch translation for Gemini to reduce API calls.
 
31
  """
32
  results = detect_bubbles(MODEL_PATH, image)
33
 
@@ -37,11 +45,16 @@ def process_single_image(image, manga_translator, mocr, selected_translator, sel
37
  # Phase 1: Collect all bubble data and OCR texts
38
  bubble_data = []
39
  texts_to_translate = []
 
40
 
41
  for result in results:
42
  x1, y1, x2, y2, score, class_id = result
43
  detected_image = image[int(y1):int(y2), int(x1):int(x2)]
44
 
 
 
 
 
45
  # Fix: detected_image is already uint8, no need to multiply by 255
46
  im = Image.fromarray(detected_image)
47
  text = mocr(im)
@@ -55,13 +68,19 @@ def process_single_image(image, manga_translator, mocr, selected_translator, sel
55
  })
56
  texts_to_translate.append(text)
57
 
58
- # Phase 2: Batch translate (especially efficient for Gemini)
 
 
 
 
59
  if selected_translator == "gemini" and len(texts_to_translate) > 1:
60
  # Use batch translation for Gemini
61
  try:
62
  if manga_translator._gemini_translator is None:
63
  from translator.gemini_translator import GeminiTranslator
64
- api_key = manga_translator.gemini_api_key or "AIzaSyAplFKOKBEcQku5m6gPEBMlZMGc4sI5rgo"
 
 
65
  custom_prompt = getattr(manga_translator, '_gemini_custom_prompt', None)
66
  manga_translator._gemini_translator = GeminiTranslator(
67
  api_key=api_key,
@@ -76,35 +95,309 @@ def process_single_image(image, manga_translator, mocr, selected_translator, sel
76
  except Exception as e:
77
  print(f"Batch translation failed, falling back to single: {e}")
78
  translated_texts = [manga_translator.translate(t, method=selected_translator) for t in texts_to_translate]
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
79
  else:
80
  # Single translation for other translators
81
  translated_texts = [manga_translator.translate(t, method=selected_translator) for t in texts_to_translate]
82
 
83
  # Phase 3: Add translated text to bubbles
84
- font_path = f"fonts/{selected_font}i.ttf"
 
85
  for data, translated_text in zip(bubble_data, translated_texts):
86
  add_text(data['detected_image'], translated_text, font_path, data['contour'])
87
 
88
  return image
89
 
90
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
91
  @app.route("/translate", methods=["POST"])
92
  def upload_file():
93
  # Get translator selection
94
  translator_map = {
95
  "Opus-mt model": "hf",
96
  "NLLB": "nllb",
97
- "Gemini": "gemini"
 
98
  }
99
  selected_translator = translator_map.get(
100
  request.form["selected_translator"],
101
  request.form["selected_translator"].lower()
102
  )
 
 
 
 
 
 
 
 
 
 
103
 
104
  # Get font selection
105
- selected_font = request.form["selected_font"].lower()
106
- if selected_font == "animeace":
107
- selected_font += "_"
 
 
 
 
 
 
 
 
108
 
109
  # Get OCR engine
110
  selected_ocr = request.form.get("selected_ocr", "chrome-lens").lower()
@@ -167,46 +460,170 @@ def upload_file():
167
  if selected_translator == "gemini" and style:
168
  manga_translator._gemini_custom_prompt = style
169
 
 
 
 
 
 
 
 
 
 
 
 
170
  if selected_ocr == "chrome-lens":
171
  mocr = ChromeLensOCR()
172
  else:
173
  mocr = MangaOcr()
174
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
175
  # Process all images
176
  processed_images = []
 
177
 
178
- for file in files:
179
- if file and file.filename:
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
180
  try:
181
- # Read image
182
- file_stream = file.stream
183
- file_bytes = np.frombuffer(file_stream.read(), dtype=np.uint8)
184
- image = cv2.imdecode(file_bytes, cv2.IMREAD_COLOR)
185
-
186
- if image is None:
187
- continue
188
-
189
- # Get original filename
190
- name = os.path.splitext(file.filename)[0]
191
-
192
- # Process image
193
- processed_image = process_single_image(
194
- image, manga_translator, mocr,
195
- selected_translator, selected_font
 
 
 
 
196
  )
197
-
198
- # Encode to base64 (JPEG is 5-10x faster than PNG)
199
- _, buffer = cv2.imencode(".jpg", processed_image, [cv2.IMWRITE_JPEG_QUALITY, 95])
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
200
  encoded_image = base64.b64encode(buffer.tobytes()).decode("utf-8")
201
-
202
  processed_images.append({
203
- "name": name,
204
  "data": encoded_image
205
  })
206
-
207
  except Exception as e:
208
- print(f"Error processing {file.filename}: {e}")
209
- continue
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
210
 
211
  if not processed_images:
212
  return redirect("/")
@@ -214,5 +631,43 @@ def upload_file():
214
  return render_template("translate.html", images=processed_images)
215
 
216
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
217
  if __name__ == "__main__":
218
- app.run(debug=True)
 
1
+ from flask import Flask, render_template, request, redirect, send_file, jsonify
2
+ from flask_socketio import SocketIO, emit
3
+ import io
4
+ import zipfile
5
+ import json
6
  from detect_bubbles import detect_bubbles
7
  from process_bubble import process_bubble
8
  from translator.translator import MangaTranslator
 
18
 
19
  app = Flask(__name__)
20
  app.config["SECRET_KEY"] = os.environ.get("SECRET_KEY", "secret_key")
21
+ # No upload size limit (removed MAX_CONTENT_LENGTH restriction)
22
+
23
+ # Initialize SocketIO for real-time progress updates
24
+ socketio = SocketIO(app, cors_allowed_origins="*", async_mode='threading')
25
 
26
  MODEL_PATH = "model/model.pt"
27
 
 
31
  return render_template("index.html")
32
 
33
 
34
+ def process_single_image(image, manga_translator, mocr, selected_translator, selected_font, font_analyzer=None):
35
  """Process a single image and return the translated version.
36
 
37
  Optimized with batch translation for Gemini to reduce API calls.
38
+ Supports auto font matching when font_analyzer is provided and selected_font is 'auto'.
39
  """
40
  results = detect_bubbles(MODEL_PATH, image)
41
 
 
45
  # Phase 1: Collect all bubble data and OCR texts
46
  bubble_data = []
47
  texts_to_translate = []
48
+ first_bubble_image = None # For font analysis
49
 
50
  for result in results:
51
  x1, y1, x2, y2, score, class_id = result
52
  detected_image = image[int(y1):int(y2), int(x1):int(x2)]
53
 
54
+ # Save first bubble for font analysis (before processing)
55
+ if first_bubble_image is None:
56
+ first_bubble_image = detected_image.copy()
57
+
58
  # Fix: detected_image is already uint8, no need to multiply by 255
59
  im = Image.fromarray(detected_image)
60
  text = mocr(im)
 
68
  })
69
  texts_to_translate.append(text)
70
 
71
+ # Auto font matching: analyze first bubble and select best font
72
+ # Note: font is now determined BEFORE processing, passed as selected_font
73
+ # (Analysis moved to upload_file to only run once per batch)
74
+
75
+ # Phase 2: Batch translate
76
  if selected_translator == "gemini" and len(texts_to_translate) > 1:
77
  # Use batch translation for Gemini
78
  try:
79
  if manga_translator._gemini_translator is None:
80
  from translator.gemini_translator import GeminiTranslator
81
+ api_key = getattr(manga_translator, '_gemini_api_key', None)
82
+ if not api_key:
83
+ raise ValueError("Gemini API key not provided")
84
  custom_prompt = getattr(manga_translator, '_gemini_custom_prompt', None)
85
  manga_translator._gemini_translator = GeminiTranslator(
86
  api_key=api_key,
 
95
  except Exception as e:
96
  print(f"Batch translation failed, falling back to single: {e}")
97
  translated_texts = [manga_translator.translate(t, method=selected_translator) for t in texts_to_translate]
98
+
99
+ elif selected_translator == "copilot" and len(texts_to_translate) > 1:
100
+ # Use batch translation for Copilot
101
+ try:
102
+ if not hasattr(manga_translator, '_copilot_translator') or manga_translator._copilot_translator is None:
103
+ from translator.copilot_translator import CopilotTranslator
104
+ copilot_server = getattr(manga_translator, '_copilot_server', 'http://localhost:8080')
105
+ copilot_model = getattr(manga_translator, '_copilot_model', 'gpt-4o')
106
+ manga_translator._copilot_translator = CopilotTranslator(
107
+ server_url=copilot_server,
108
+ model=copilot_model
109
+ )
110
+ print(f"Copilot translator initialized: {copilot_server} / {copilot_model}")
111
+
112
+ translated_texts = manga_translator._copilot_translator.translate_batch(
113
+ texts_to_translate,
114
+ source=manga_translator.source,
115
+ target=manga_translator.target
116
+ )
117
+ except Exception as e:
118
+ print(f"Copilot batch translation failed: {e}")
119
+ translated_texts = texts_to_translate # Return original on error
120
+
121
  else:
122
  # Single translation for other translators
123
  translated_texts = [manga_translator.translate(t, method=selected_translator) for t in texts_to_translate]
124
 
125
  # Phase 3: Add translated text to bubbles
126
+ # Determine correct font path based on font name
127
+ font_path = get_font_path(selected_font)
128
  for data, translated_text in zip(bubble_data, translated_texts):
129
  add_text(data['detected_image'], translated_text, font_path, data['contour'])
130
 
131
  return image
132
 
133
 
134
+ def get_font_path(font_name: str) -> str:
135
+ """Get the correct font file path based on font name."""
136
+ # Handle legacy fonts with 'i' suffix
137
+ if font_name in ["animeace_", "arial", "mangat"]:
138
+ return f"fonts/{font_name}i.ttf"
139
+ # Yuki-* fonts use exact name
140
+ elif font_name.startswith("Yuki-") or font_name.startswith("yuki-"):
141
+ return f"fonts/{font_name}.ttf"
142
+ else:
143
+ return f"fonts/{font_name}.ttf"
144
+
145
+
146
+ def process_images_with_batch(images_data, manga_translator, mocr, selected_font, translator_type, batch_size=10, use_context_memory=True):
147
+ """
148
+ Process multiple images with multi-page batching for Copilot or Gemini.
149
+ Collects all texts first, batch translates, then applies translations.
150
+
151
+ Args:
152
+ images_data: List of dicts with 'image', 'name' keys
153
+ manga_translator: MangaTranslator instance with translator
154
+ mocr: OCR engine
155
+ selected_font: Font to use
156
+ translator_type: 'copilot' or 'gemini'
157
+ batch_size: Number of pages per API call
158
+ use_context_memory: Whether to include context from all pages for better translation
159
+
160
+ Returns:
161
+ List of processed images with translations applied
162
+ """
163
+ import time
164
+ from concurrent.futures import ThreadPoolExecutor, as_completed
165
+
166
+ def emit_progress(phase, current, total, message):
167
+ """Emit progress update via WebSocket."""
168
+ try:
169
+ socketio.emit('progress', {
170
+ 'phase': phase,
171
+ 'current': current,
172
+ 'total': total,
173
+ 'message': message,
174
+ 'percent': int((current / max(total, 1)) * 100)
175
+ })
176
+ except Exception as e:
177
+ pass # Silently fail if socket not connected
178
+
179
+ total_images = len(images_data)
180
+ print(f"\n{'='*50}")
181
+ print(f"Processing {total_images} images...")
182
+ print(f"Context Memory: {'ON' if use_context_memory else 'OFF'}")
183
+ print(f"{'='*50}")
184
+
185
+ start_time = time.time()
186
+
187
+ # Check if using Chrome Lens OCR (has batch support)
188
+ use_batch_ocr = hasattr(mocr, 'process_batch')
189
+
190
+ # Phase 1a: Detect bubbles and collect all bubble images
191
+ print("\n[Phase 1] Detecting bubbles...")
192
+ emit_progress('detection', 0, total_images, 'Bắt đầu phát hiện speech bubbles...')
193
+ all_pages_data = {} # {page_name: {'image': img, 'bubbles': [...], 'bubble_images': [...]}}
194
+ all_bubble_images = [] # Flat list for batch OCR
195
+ bubble_mapping = [] # [(page_name, bubble_idx), ...] to map back
196
+
197
+ for idx, img_data in enumerate(images_data):
198
+ image = img_data['image']
199
+ name = img_data['name']
200
+
201
+ emit_progress('detection', idx + 1, total_images, f'Phát hiện bubbles: {name}')
202
+ print(f" [{idx+1}/{total_images}] {name}", end="", flush=True)
203
+
204
+ results = detect_bubbles(MODEL_PATH, image)
205
+ if not results:
206
+ all_pages_data[name] = {'image': image, 'bubbles': [], 'texts': []}
207
+ print(f" - 0 bubbles")
208
+ continue
209
+
210
+ print(f" - {len(results)} bubbles")
211
+
212
+ bubble_data = []
213
+
214
+ for bubble_idx, result in enumerate(results):
215
+ x1, y1, x2, y2, score, class_id = result
216
+ detected_image = image[int(y1):int(y2), int(x1):int(x2)]
217
+
218
+ # IMPORTANT: Add to OCR queue BEFORE processing (which fills white)
219
+ all_bubble_images.append(Image.fromarray(detected_image.copy()))
220
+ bubble_mapping.append((name, bubble_idx))
221
+
222
+ # Process bubble (fill white) - this modifies the original image via view
223
+ processed_image, cont = process_bubble(detected_image)
224
+
225
+ bubble_data.append({
226
+ 'detected_image': processed_image,
227
+ 'contour': cont,
228
+ 'coords': (int(x1), int(y1), int(x2), int(y2))
229
+ })
230
+
231
+ all_pages_data[name] = {
232
+ 'image': image,
233
+ 'bubbles': bubble_data,
234
+ 'texts': [] # Will fill after OCR
235
+ }
236
+
237
+ detection_time = time.time() - start_time
238
+ print(f"✓ Bubble detection completed in {detection_time:.1f}s ({len(all_bubble_images)} total bubbles)")
239
+ emit_progress('detection', total_images, total_images, f'Phát hiện xong {len(all_bubble_images)} bubbles')
240
+
241
+ # Phase 1b: Batch OCR all bubbles at once
242
+ if all_bubble_images:
243
+ ocr_start = time.time()
244
+ emit_progress('ocr', 0, 1, f'Đang OCR {len(all_bubble_images)} bubbles...')
245
+ print(f"\n[Phase 2] OCR processing {len(all_bubble_images)} bubbles...", end=" ", flush=True)
246
+
247
+ if use_batch_ocr:
248
+ # Use concurrent batch OCR (Chrome Lens)
249
+ all_texts = mocr.process_batch(all_bubble_images)
250
+ else:
251
+ # Sequential OCR (MangaOcr or others)
252
+ all_texts = [mocr(img) for img in all_bubble_images]
253
+
254
+ # Map texts back to pages
255
+ for (page_name, bubble_idx), text in zip(bubble_mapping, all_texts):
256
+ all_pages_data[page_name]['texts'].append(text)
257
+
258
+ ocr_time = time.time() - ocr_start
259
+ print(f"({ocr_time:.1f}s)")
260
+ print(f"✓ OCR completed in {ocr_time:.1f}s ({len(all_bubble_images)/ocr_time:.1f} bubbles/sec)")
261
+ emit_progress('ocr', 1, 1, f'OCR hoàn tất ({len(all_bubble_images)} bubbles)')
262
+
263
+ # Phase 3: Batch translate all pages together
264
+ emit_progress('translation', 0, 1, 'Đang dịch...')
265
+ pages_texts = {name: data['texts'] for name, data in all_pages_data.items() if data['texts']}
266
+ all_translations = {}
267
+
268
+ if pages_texts:
269
+ # Get the translator based on type
270
+ if translator_type == "copilot" and hasattr(manga_translator, '_copilot_translator') and manga_translator._copilot_translator:
271
+ translator = manga_translator._copilot_translator
272
+ translator_name = "Copilot"
273
+ elif translator_type == "gemini" and hasattr(manga_translator, '_gemini_translator') and manga_translator._gemini_translator:
274
+ translator = manga_translator._gemini_translator
275
+ translator_name = "Gemini"
276
+ else:
277
+ translator = None
278
+ translator_name = "Unknown"
279
+
280
+ if translator:
281
+ print(f"{translator_name} batch translating {len(pages_texts)} pages in chunks of {batch_size}...")
282
+
283
+ # Build full context from ALL pages if context memory is enabled
284
+ all_context = None
285
+ if use_context_memory:
286
+ all_context = pages_texts # Pass all texts for context
287
+ print(f" Using context from all {len(pages_texts)} pages")
288
+
289
+ # Process in batches
290
+ page_names = list(pages_texts.keys())
291
+
292
+ for i in range(0, len(page_names), batch_size):
293
+ batch_names = page_names[i:i + batch_size]
294
+ batch_texts = {name: pages_texts[name] for name in batch_names}
295
+
296
+ print(f" Translating batch {i//batch_size + 1}: pages {i+1}-{min(i+batch_size, len(page_names))}")
297
+
298
+ try:
299
+ translated = translator.translate_pages_batch(
300
+ batch_texts,
301
+ source=manga_translator.source,
302
+ target=manga_translator.target,
303
+ context=all_context if use_context_memory else None
304
+ )
305
+ all_translations.update(translated)
306
+ except Exception as e:
307
+ print(f" Batch failed: {e}, falling back to individual translation")
308
+ for name, texts in batch_texts.items():
309
+ try:
310
+ all_translations[name] = translator.translate_batch(
311
+ texts, manga_translator.source, manga_translator.target
312
+ )
313
+ except:
314
+ all_translations[name] = texts # Return original on error
315
+
316
+ translation_time = time.time() - start_time - detection_time
317
+ print(f"✓ Translation completed in {translation_time:.1f}s")
318
+ emit_progress('translation', 1, 1, 'Dịch hoàn tất')
319
+
320
+ # Phase 4: Apply translations and render text
321
+ emit_progress('rendering', 0, total_images, 'Đang render text vào ảnh...')
322
+ render_start = time.time()
323
+ processed_results = []
324
+ font_path = get_font_path(selected_font)
325
+
326
+ print(f"\n[Phase 4] Rendering text...")
327
+
328
+ render_idx = 0
329
+ for name, data in all_pages_data.items():
330
+ render_idx += 1
331
+ emit_progress('rendering', render_idx, total_images, f'Render text: {name}')
332
+
333
+ image = data['image']
334
+ bubbles = data['bubbles']
335
+ translated_texts = all_translations.get(name, data['texts']) # Fallback to original
336
+
337
+ # Apply text to bubbles on the ORIGINAL image
338
+ for bubble, text in zip(bubbles, translated_texts):
339
+ x1, y1, x2, y2 = bubble['coords']
340
+ # Get the region in the original image (this is a view, modifications affect original)
341
+ bubble_region = image[y1:y2, x1:x2]
342
+ # Fill with white first (process_bubble already did this but let's be safe)
343
+ # bubble_region[:] = (255, 255, 255) # Already done
344
+ # Add translated text
345
+ add_text(bubble_region, text, font_path, bubble['contour'])
346
+
347
+ processed_results.append({
348
+ 'image': image,
349
+ 'name': name
350
+ })
351
+
352
+ render_time = time.time() - render_start
353
+ total_time = time.time() - start_time
354
+
355
+ print(f"✓ Text rendering completed in {render_time:.1f}s")
356
+ print(f"{'='*50}")
357
+ print(f"✓ TOTAL: {total_images} images processed in {total_time:.1f}s ({total_time/total_images:.1f}s/image)")
358
+ print(f"{'='*50}\n")
359
+
360
+ emit_progress('done', total_images, total_images, f'Hoàn tất! {total_images} ảnh trong {total_time:.1f}s')
361
+
362
+ return processed_results
363
+
364
+
365
  @app.route("/translate", methods=["POST"])
366
  def upload_file():
367
  # Get translator selection
368
  translator_map = {
369
  "Opus-mt model": "hf",
370
  "NLLB": "nllb",
371
+ "Gemini": "gemini",
372
+ "Copilot": "copilot"
373
  }
374
  selected_translator = translator_map.get(
375
  request.form["selected_translator"],
376
  request.form["selected_translator"].lower()
377
  )
378
+
379
+ # Get Copilot settings if Copilot is selected
380
+ copilot_server = request.form.get("copilot_server", "http://localhost:8080")
381
+ copilot_model = request.form.get("selected_copilot_model", "gpt-4o")
382
+
383
+ # Get Gemini API key from form
384
+ gemini_api_key = request.form.get("gemini_api_key", "").strip()
385
+
386
+ # Get context memory setting (checkbox - "on" if checked, None if not)
387
+ use_context_memory = request.form.get("context_memory") == "on"
388
 
389
  # Get font selection
390
+ selected_font_raw = request.form["selected_font"]
391
+ selected_font = selected_font_raw.lower()
392
+
393
+ # Handle special font name mappings
394
+ if selected_font == "auto (match original)":
395
+ selected_font = "auto"
396
+ elif selected_font == "animeace":
397
+ selected_font = "animeace_"
398
+ elif selected_font_raw.startswith("Yuki-"):
399
+ # Keep original case for Yuki fonts
400
+ selected_font = selected_font_raw
401
 
402
  # Get OCR engine
403
  selected_ocr = request.form.get("selected_ocr", "chrome-lens").lower()
 
460
  if selected_translator == "gemini" and style:
461
  manga_translator._gemini_custom_prompt = style
462
 
463
+ # Set Gemini API key
464
+ if selected_translator == "gemini" and gemini_api_key:
465
+ manga_translator._gemini_api_key = gemini_api_key
466
+ print(f"Using Gemini API with provided key")
467
+
468
+ # Set Copilot settings
469
+ if selected_translator == "copilot":
470
+ manga_translator._copilot_server = copilot_server
471
+ manga_translator._copilot_model = copilot_model
472
+ print(f"Using Copilot API: {copilot_server} / model: {copilot_model}")
473
+
474
  if selected_ocr == "chrome-lens":
475
  mocr = ChromeLensOCR()
476
  else:
477
  mocr = MangaOcr()
478
 
479
+ # Initialize font analyzer for auto font matching
480
+ font_analyzer = None
481
+ if selected_font == "auto":
482
+ try:
483
+ from font_analyzer import FontAnalyzer
484
+ # Use same API key as Gemini translator
485
+ api_key = gemini_api_key or os.environ.get("GEMINI_API_KEY")
486
+ if not api_key:
487
+ print("Warning: No Gemini API key provided for font analysis")
488
+ font_analyzer = FontAnalyzer(api_key=api_key)
489
+ print("Font analyzer initialized for auto font matching")
490
+ except Exception as e:
491
+ print(f"Failed to initialize font analyzer: {e}")
492
+ selected_font = "animeace_" # Fallback to default
493
+
494
  # Process all images
495
  processed_images = []
496
+ auto_font_determined = False # Flag to analyze font only once
497
 
498
+ # For Copilot and Gemini: Use multi-page batch processing
499
+ if selected_translator in ["copilot", "gemini"]:
500
+ # First, read all images into memory
501
+ all_images = []
502
+ for file in files:
503
+ if file and file.filename:
504
+ try:
505
+ file_stream = file.stream
506
+ file_bytes = np.frombuffer(file_stream.read(), dtype=np.uint8)
507
+ image = cv2.imdecode(file_bytes, cv2.IMREAD_COLOR)
508
+
509
+ if image is None:
510
+ continue
511
+
512
+ name = os.path.splitext(file.filename)[0]
513
+ all_images.append({'image': image, 'name': name})
514
+ except Exception as e:
515
+ print(f"Error reading {file.filename}: {e}")
516
+
517
+ if not all_images:
518
+ return redirect("/")
519
+
520
+ # Auto font: analyze first image
521
+ if selected_font == "auto" and font_analyzer is not None:
522
  try:
523
+ results = detect_bubbles(MODEL_PATH, all_images[0]['image'])
524
+ if results:
525
+ x1, y1, x2, y2, _, _ = results[0]
526
+ first_bubble = all_images[0]['image'][int(y1):int(y2), int(x1):int(x2)]
527
+ selected_font = font_analyzer.analyze_and_match(first_bubble)
528
+ print(f"Auto font matched: {selected_font}")
529
+ else:
530
+ selected_font = "animeace_"
531
+ except Exception as e:
532
+ print(f"Font analysis failed: {e}")
533
+ selected_font = "animeace_"
534
+
535
+ # Initialize translator based on type
536
+ if selected_translator == "copilot":
537
+ if not hasattr(manga_translator, '_copilot_translator') or manga_translator._copilot_translator is None:
538
+ from translator.copilot_translator import CopilotTranslator
539
+ manga_translator._copilot_translator = CopilotTranslator(
540
+ server_url=copilot_server,
541
+ model=copilot_model
542
  )
543
+ print(f"Copilot translator initialized: {copilot_server} / {copilot_model}")
544
+
545
+ elif selected_translator == "gemini":
546
+ if not hasattr(manga_translator, '_gemini_translator') or manga_translator._gemini_translator is None:
547
+ from translator.gemini_translator import GeminiTranslator
548
+ api_key = gemini_api_key
549
+ if not api_key:
550
+ raise ValueError("Gemini API key required. Please enter it in the web form.")
551
+ custom_prompt = getattr(manga_translator, '_gemini_custom_prompt', None)
552
+ manga_translator._gemini_translator = GeminiTranslator(
553
+ api_key=api_key,
554
+ custom_prompt=custom_prompt
555
+ )
556
+ print("Gemini translator initialized for multi-page batching")
557
+
558
+ # Process with multi-page batching (10 pages per API call)
559
+ processed_results = process_images_with_batch(
560
+ all_images, manga_translator, mocr, selected_font,
561
+ translator_type=selected_translator, batch_size=10,
562
+ use_context_memory=use_context_memory
563
+ )
564
+
565
+ # Encode results to base64
566
+ for result in processed_results:
567
+ try:
568
+ _, buffer = cv2.imencode(".jpg", result['image'], [cv2.IMWRITE_JPEG_QUALITY, 95])
569
  encoded_image = base64.b64encode(buffer.tobytes()).decode("utf-8")
 
570
  processed_images.append({
571
+ "name": result['name'],
572
  "data": encoded_image
573
  })
 
574
  except Exception as e:
575
+ print(f"Error encoding {result['name']}: {e}")
576
+
577
+ else:
578
+ # For other translators: Use per-image processing (original flow)
579
+ for file in files:
580
+ if file and file.filename:
581
+ try:
582
+ # Read image
583
+ file_stream = file.stream
584
+ file_bytes = np.frombuffer(file_stream.read(), dtype=np.uint8)
585
+ image = cv2.imdecode(file_bytes, cv2.IMREAD_COLOR)
586
+
587
+ if image is None:
588
+ continue
589
+
590
+ # Auto font: analyze FIRST image only
591
+ if selected_font == "auto" and font_analyzer is not None and not auto_font_determined:
592
+ try:
593
+ results = detect_bubbles(MODEL_PATH, image)
594
+ if results:
595
+ x1, y1, x2, y2, _, _ = results[0]
596
+ first_bubble = image[int(y1):int(y2), int(x1):int(x2)]
597
+ selected_font = font_analyzer.analyze_and_match(first_bubble)
598
+ print(f"Auto font matched (once for all images): {selected_font}")
599
+ else:
600
+ selected_font = "animeace_"
601
+ except Exception as e:
602
+ print(f"Font analysis failed: {e}")
603
+ selected_font = "animeace_"
604
+ auto_font_determined = True
605
+
606
+ # Get original filename
607
+ name = os.path.splitext(file.filename)[0]
608
+
609
+ # Process image
610
+ processed_image = process_single_image(
611
+ image, manga_translator, mocr,
612
+ selected_translator, selected_font, None
613
+ )
614
+
615
+ # Encode to base64
616
+ _, buffer = cv2.imencode(".jpg", processed_image, [cv2.IMWRITE_JPEG_QUALITY, 95])
617
+ encoded_image = base64.b64encode(buffer.tobytes()).decode("utf-8")
618
+
619
+ processed_images.append({
620
+ "name": name,
621
+ "data": encoded_image
622
+ })
623
+
624
+ except Exception as e:
625
+ print(f"Error processing {file.filename}: {e}")
626
+ continue
627
 
628
  if not processed_images:
629
  return redirect("/")
 
631
  return render_template("translate.html", images=processed_images)
632
 
633
 
634
+ @app.route("/download-zip", methods=["POST"])
635
+ def download_zip():
636
+ """Create and download a ZIP file containing all translated images."""
637
+ try:
638
+ images_data = request.form.get("images_data", "[]")
639
+ images = json.loads(images_data)
640
+
641
+ if not images:
642
+ return redirect("/")
643
+
644
+ # Create ZIP file in memory
645
+ zip_buffer = io.BytesIO()
646
+ with zipfile.ZipFile(zip_buffer, 'w', zipfile.ZIP_DEFLATED) as zip_file:
647
+ for i, img in enumerate(images):
648
+ name = img.get('name', f'image_{i+1}')
649
+ data = img.get('data', '')
650
+
651
+ # Decode base64 to bytes
652
+ image_bytes = base64.b64decode(data)
653
+
654
+ # Add to ZIP with proper filename
655
+ filename = f"{name}_translated.png"
656
+ zip_file.writestr(filename, image_bytes)
657
+
658
+ zip_buffer.seek(0)
659
+
660
+ return send_file(
661
+ zip_buffer,
662
+ mimetype='application/zip',
663
+ as_attachment=True,
664
+ download_name='manga_translated.zip'
665
+ )
666
+
667
+ except Exception as e:
668
+ print(f"Error creating ZIP: {e}")
669
+ return redirect("/")
670
+
671
+
672
  if __name__ == "__main__":
673
+ socketio.run(app, debug=True)
font_analyzer.py ADDED
@@ -0,0 +1,151 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Font Analyzer - Analyze manga font style and match with available fonts
3
+ Uses Gemini Vision to directly select the best matching font from available options
4
+ """
5
+ import google.generativeai as genai
6
+ import json
7
+ import os
8
+ from PIL import Image
9
+ import numpy as np
10
+ from typing import Optional, Dict, Any, List
11
+
12
+
13
class FontAnalyzer:
    """
    Analyzes font style from manga speech bubbles using Gemini Vision
    and directly selects the best matching font from available fonts.
    """

    # Available fonts with descriptions for Gemini to understand.
    FONT_OPTIONS = {
        "animeace_": "Classic manga font, clean and readable, standard comic style",
        "mangat": "Standard manga font, similar to animeace, good readability",
        "arial": "Clean sans-serif, formal and professional",
        "Yuki-Arenzi": "Simple casual handwritten style",
        "Yuki-Burobu": "Bold brush strokes, dynamic action style, Japanese brush feel",
        "Yuki-CCMarianChurchlandJournal": "Journal/diary handwritten, personal feel",
        "Yuki-CDX Starstreak": "Dynamic sci-fi style, bold and futuristic",
        "Yuki-CHICKEN Pie": "Playful, chunky, cute comedy style",
        "Yuki-CrashLanding BB": "Heavy impact font, bold action/shouting style",
        "Yuki-Downhill Dive": "Dynamic sports/action font, energetic",
        "Yuki-Gingerline DEMO Regular": "Elegant flowing handwritten, romantic style",
        "Yuki-Gorrilaz_Story": "Grunge alternative style, rough edges",
        "Yuki-KG Only Angel": "Delicate feminine handwritten, soft romantic",
        "Yuki-LF SwandsHand": "Natural handwritten, casual personal",
        "Yuki-La Belle Aurore": "Elegant cursive, fancy romantic style",
        "Yuki-Little Cupcakes": "Cute kawaii style, bubbly and fun",
        "Yuki-Nagurigaki Crayon": "Crayon/childish handwritten, playful comedy",
        "Yuki-Ripsnort BB": "Heavy bold impact, action/shouting",
        "Yuki-Roasthink": "Modern clean sans-serif, general purpose",
        "Yuki-Screwball": "Comic style, funny and expressive",
        "Yuki-Shark Crash": "Aggressive dynamic, action manga style",
        "Yuki-Skulduggery": "Gothic dark style, horror/mystery",
        "Yuki-Superscratchy": "Scratchy rough handwritten, grungy feel",
        "Yuki-Tea And Oranges Regular": "Soft warm handwritten, gentle drama",
    }

    # Fallback used whenever analysis fails or the reply is unusable.
    DEFAULT_FONT = "animeace_"

    def __init__(self, api_key: str = None):
        """Initialize with a Gemini API key (falls back to GEMINI_API_KEY env var).

        Raises:
            ValueError: If no API key is supplied or found in the environment.
        """
        self.api_key = api_key or os.environ.get("GEMINI_API_KEY")
        if not self.api_key:
            raise ValueError("Gemini API key required. Set GEMINI_API_KEY or pass api_key.")

        genai.configure(api_key=self.api_key)
        self.model = genai.GenerativeModel("gemini-2.5-flash-lite")

    def _image_to_pil(self, image) -> "Image.Image":
        """Convert a PIL Image or numpy array (assumed BGR for 3-channel) to PIL RGB.

        Raises:
            ValueError: For unsupported input types.
        """
        if isinstance(image, Image.Image):
            return image
        elif isinstance(image, np.ndarray):
            import cv2
            # 3-channel arrays are assumed to be OpenCV BGR — convert to RGB.
            if len(image.shape) == 3 and image.shape[2] == 3:
                image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            return Image.fromarray(image)
        else:
            raise ValueError(f"Unsupported image type: {type(image)}")

    def _build_font_list_prompt(self) -> str:
        """Build the "- name: description" font list used inside the prompt."""
        lines = []
        for font_name, description in self.FONT_OPTIONS.items():
            lines.append(f"- {font_name}: {description}")
        return "\n".join(lines)

    def _match_font_name(self, result: str) -> str:
        """Map a raw model reply to a known font name.

        Cleans quotes and common verbose prefixes, then tries exact,
        case-insensitive, and finally partial matching against
        FONT_OPTIONS. Falls back to DEFAULT_FONT when nothing matches.
        """
        # Clean up response
        result = result.replace('"', '').replace("'", "").strip()

        # Remove common prefixes that Gemini might add
        prefixes_to_remove = ["The best matching font is ", "Best match: ", "Font: ", "I recommend "]
        for prefix in prefixes_to_remove:
            if result.lower().startswith(prefix.lower()):
                result = result[len(prefix):].strip()

        print(f"[FontAnalyzer] Cleaned response: '{result}'")

        # BUGFIX: an empty reply would "partially match" every font
        # (`"" in name` is always True) and return the first dict entry.
        if not result:
            print("[FontAnalyzer] ✗ Empty response, using default")
            return self.DEFAULT_FONT

        # Exact match first.
        if result in self.FONT_OPTIONS:
            print(f"[FontAnalyzer] ✓ Matched: {result}")
            return result

        result_lower = result.lower()

        # Case-insensitive exact match in its own pass, so an earlier
        # partial match can never shadow a later exact one.
        for font_name in self.FONT_OPTIONS:
            if font_name.lower() == result_lower:
                print(f"[FontAnalyzer] ✓ Matched (case-insensitive): {font_name}")
                return font_name

        # Partial (substring) match as a last resort.
        for font_name in self.FONT_OPTIONS:
            if font_name.lower() in result_lower or result_lower in font_name.lower():
                print(f"[FontAnalyzer] ✓ Matched (partial): {font_name}")
                return font_name

        print(f"[FontAnalyzer] ✗ Font not in list: '{result}', using default")
        return self.DEFAULT_FONT

    def analyze_and_match(self, bubble_image) -> str:
        """
        Analyze the font in the image and directly select the best matching font.

        Args:
            bubble_image: Speech bubble image (PIL, numpy array)

        Returns:
            Font name to use (DEFAULT_FONT on any failure)
        """
        try:
            pil_image = self._image_to_pil(bubble_image)
            print(f"[FontAnalyzer] Analyzing image size: {pil_image.size}")

            font_list = self._build_font_list_prompt()

            prompt = f"""Look at this manga/comic speech bubble image and analyze the text font style.

Then choose the BEST matching font from this list based on visual similarity:

{font_list}

Consider these factors when matching:
1. Font weight (thin, normal, bold, heavy)
2. Style (clean, handwritten, decorative, brush)
3. Mood/genre (action, comedy, romance, horror, drama, casual)
4. Overall visual feel

Return ONLY the font name (exactly as written above), nothing else.
Example response: Yuki-Burobu"""

            print("[FontAnalyzer] Sending request to Gemini Vision...")
            response = self.model.generate_content([prompt, pil_image])
            result = response.text.strip()

            print(f"[FontAnalyzer] Gemini raw response: '{result}'")
            return self._match_font_name(result)

        except Exception as e:
            # Any failure (network, bad image, API quota) degrades to default.
            print(f"[FontAnalyzer] ✗ Error: {e}")
            return self.DEFAULT_FONT
146
+
147
+
148
def get_matching_font(bubble_image, api_key: str = None) -> str:
    """Convenience wrapper: analyze *bubble_image* and return the best-matching font name."""
    return FontAnalyzer(api_key).analyze_and_match(bubble_image)
fonts/Yuki-Arenzi.ttf ADDED
Binary file (47.8 kB). View file
 
fonts/Yuki-Burobu.ttf ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:09a3b22d7035b4726304fb383cf80e2421c47cf05615d2f75143b24147bcef7a
3
+ size 176976
fonts/Yuki-CCMarianChurchlandJournal.ttf ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9c5c4ac2b3daf8f7062d745300b2e8dd12b2ee206db7dd427143cc3f78a8e831
3
+ size 148928
fonts/Yuki-CDX Starstreak.ttf ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:931d143968eca5b237efdba4538ddd79a8113a438c9d0b479244a660cc099973
3
+ size 152740
fonts/Yuki-CHICKEN Pie.ttf ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:12b1128c44ecc4819fc67615966260cca27ac68a6b21f4c2a99d697656f3cfe2
3
+ size 100624
fonts/Yuki-CrashLanding BB.ttf ADDED
Binary file (49.4 kB). View file
 
fonts/Yuki-Downhill Dive.ttf ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a19185b9489d1ff3897d178e7859676d58c1a8ab81beee9e93b662a1a8a0383d
3
+ size 345480
fonts/Yuki-Gingerline DEMO Regular.ttf ADDED
Binary file (82.5 kB). View file
 
fonts/Yuki-Gorrilaz_Story.ttf ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:331cec198091d01a819b0dfb4be4576cdc27a0774397a4ec7a9b10a527a5d161
3
+ size 115792
fonts/Yuki-KG Only Angel.ttf ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f07ee1dfc8c198e19dfe1101b2bb1d84c80592ed54fc6caf2412e50d36b22903
3
+ size 440976
fonts/Yuki-LF SwandsHand.ttf ADDED
Binary file (70.9 kB). View file
 
fonts/Yuki-La Belle Aurore.ttf ADDED
Binary file (88.9 kB). View file
 
fonts/Yuki-Little Cupcakes.ttf ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:30a1f57bac5c5fcb5739008d33ede51e5ecc8a76a39f268ecbeb4b0c0e45fa68
3
+ size 114520
fonts/Yuki-Nagurigaki Crayon.ttf ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:bece77302985b034d3e7beff562853e17084b87d2b2fef6ba784fdc953660586
3
+ size 5462384
fonts/Yuki-Ripsnort BB.ttf ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cf74ab38ba5767007fbc4b0cf8cfa432620a22d431be3d4b944df1dd3ca1b2f3
3
+ size 115368
fonts/Yuki-Roasthink.ttf ADDED
Binary file (68.4 kB). View file
 
fonts/Yuki-Screwball.ttf ADDED
Binary file (99.1 kB). View file
 
fonts/Yuki-Shark Crash.ttf ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:12195563cabdc781dffa6517ad1f029fc69dcb968e9ff49ec68f3c7216cc4c3c
3
+ size 148464
fonts/Yuki-Skulduggery.ttf ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:06cbbcaa3cfcbf96482c18301130e48405c06ff205c5a225abb44d5b56f7d299
3
+ size 434812
fonts/Yuki-Superscratchy.ttf ADDED
Binary file (68.4 kB). View file
 
fonts/Yuki-Tea And Oranges Regular.ttf ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7a448eb8e6301e6059678171d36c95049723d6a3d26fcb4d33b9b62e40397df9
3
+ size 492108
ocr/__pycache__/chrome_lens_ocr.cpython-311.pyc CHANGED
Binary files a/ocr/__pycache__/chrome_lens_ocr.cpython-311.pyc and b/ocr/__pycache__/chrome_lens_ocr.cpython-311.pyc differ
 
ocr/chrome_lens_ocr.py CHANGED
@@ -17,6 +17,7 @@ class ChromeLensOCR:
17
  - Free Google Lens OCR API
18
  - Multi-language support with auto-detection
19
  - Text block segmentation for comics/manga
 
20
  """
21
 
22
  def __init__(self, ocr_language: str = "ja"):
@@ -77,6 +78,62 @@ class ChromeLensOCR:
77
  print(f"Chrome Lens OCR error: {e}")
78
  return ""
79
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
80
  async def process_with_blocks(self, image) -> dict:
81
  """
82
  Process image and return text segmented into blocks.
@@ -114,3 +171,4 @@ class ChromeLensOCR:
114
 
115
  result = asyncio.run(self.process_with_blocks(image))
116
  return result.get("text_blocks", [])
 
 
17
  - Free Google Lens OCR API
18
  - Multi-language support with auto-detection
19
  - Text block segmentation for comics/manga
20
+ - Batch processing for faster multi-image OCR
21
  """
22
 
23
  def __init__(self, ocr_language: str = "ja"):
 
78
  print(f"Chrome Lens OCR error: {e}")
79
  return ""
80
 
81
+ def process_batch(self, images: list) -> list:
82
+ """
83
+ Process multiple images concurrently for faster OCR.
84
+
85
+ Args:
86
+ images: List of PIL Images or numpy arrays
87
+
88
+ Returns:
89
+ list: List of extracted texts in same order
90
+ """
91
+ # Convert numpy arrays to PIL Images
92
+ pil_images = []
93
+ for img in images:
94
+ if isinstance(img, np.ndarray):
95
+ pil_images.append(Image.fromarray(img))
96
+ else:
97
+ pil_images.append(img)
98
+
99
+ # Run batch processing
100
+ try:
101
+ loop = asyncio.get_running_loop()
102
+ import concurrent.futures
103
+ future = asyncio.run_coroutine_threadsafe(
104
+ self._process_batch(pil_images), loop
105
+ )
106
+ return future.result(timeout=120)
107
+ except RuntimeError:
108
+ if not hasattr(self, '_loop') or self._loop.is_closed():
109
+ self._loop = asyncio.new_event_loop()
110
+ return self._loop.run_until_complete(self._process_batch(pil_images))
111
+
112
+ async def _process_batch(self, images: list) -> list:
113
+ """
114
+ Async batch processing using asyncio.gather for concurrent OCR.
115
+
116
+ Args:
117
+ images: List of PIL Images
118
+
119
+ Returns:
120
+ list: List of extracted texts
121
+ """
122
+ # Process all images concurrently
123
+ tasks = [self._process(img) for img in images]
124
+ results = await asyncio.gather(*tasks, return_exceptions=True)
125
+
126
+ # Handle any exceptions
127
+ processed = []
128
+ for r in results:
129
+ if isinstance(r, Exception):
130
+ print(f"Batch OCR error: {r}")
131
+ processed.append("")
132
+ else:
133
+ processed.append(r)
134
+
135
+ return processed
136
+
137
  async def process_with_blocks(self, image) -> dict:
138
  """
139
  Process image and return text segmented into blocks.
 
171
 
172
  result = asyncio.run(self.process_with_blocks(image))
173
  return result.get("text_blocks", [])
174
+
process_bubble.py CHANGED
@@ -11,12 +11,22 @@ def process_bubble(image):
11
 
12
  Returns:
13
  - image (numpy.ndarray): Image with the speech bubble content set to white.
14
- - largest_contour (numpy.ndarray): Contour of the detected speech bubble.
15
  """
16
  gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
17
  _, thresh = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY)
18
 
19
  contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
 
 
 
 
 
 
 
 
 
 
20
  largest_contour = max(contours, key=cv2.contourArea)
21
 
22
  mask = np.zeros_like(gray)
@@ -25,3 +35,4 @@ def process_bubble(image):
25
  image[mask == 255] = (255, 255, 255)
26
 
27
  return image, largest_contour
 
 
11
 
12
  Returns:
13
  - image (numpy.ndarray): Image with the speech bubble content set to white.
14
+ - largest_contour (numpy.ndarray): Contour of the detected speech bubble (or None if not found).
15
  """
16
  gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
17
  _, thresh = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY)
18
 
19
  contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
20
+
21
+ # Handle case when no contours found
22
+ if not contours:
23
+ # Return original image with a simple rectangular contour
24
+ h, w = image.shape[:2]
25
+ largest_contour = np.array([[0, 0], [w, 0], [w, h], [0, h]], dtype=np.int32)
26
+ # Fill with white anyway
27
+ image[:] = (255, 255, 255)
28
+ return image, largest_contour
29
+
30
  largest_contour = max(contours, key=cv2.contourArea)
31
 
32
  mask = np.zeros_like(gray)
 
35
  image[mask == 255] = (255, 255, 255)
36
 
37
  return image, largest_contour
38
+
static/css/style.css CHANGED
@@ -334,6 +334,55 @@ button:active {
334
  color: white;
335
  }
336
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
337
  /* Responsive */
338
  @media (max-width: 600px) {
339
  .form-grid {
@@ -344,3 +393,4 @@ button:active {
344
  padding: 20px;
345
  }
346
  }
 
 
334
  color: white;
335
  }
336
 
337
+ /* Toggle Switch */
338
+ .toggle-container {
339
+ display: flex;
340
+ align-items: center;
341
+ cursor: pointer;
342
+ gap: 12px;
343
+ user-select: none;
344
+ }
345
+
346
+ .toggle-container input {
347
+ display: none;
348
+ }
349
+
350
+ .toggle-slider {
351
+ position: relative;
352
+ width: 50px;
353
+ height: 26px;
354
+ background-color: #ccc;
355
+ border-radius: 26px;
356
+ transition: background-color 0.3s;
357
+ flex-shrink: 0;
358
+ }
359
+
360
+ .toggle-slider::before {
361
+ content: '';
362
+ position: absolute;
363
+ width: 22px;
364
+ height: 22px;
365
+ border-radius: 50%;
366
+ background-color: white;
367
+ top: 2px;
368
+ left: 2px;
369
+ transition: transform 0.3s;
370
+ box-shadow: 0 2px 4px rgba(0, 0, 0, 0.2);
371
+ }
372
+
373
+ .toggle-container input:checked + .toggle-slider {
374
+ background-color: #5E1675;
375
+ }
376
+
377
+ .toggle-container input:checked + .toggle-slider::before {
378
+ transform: translateX(24px);
379
+ }
380
+
381
+ .toggle-label {
382
+ font-size: 13px;
383
+ color: #333;
384
+ }
385
+
386
  /* Responsive */
387
  @media (max-width: 600px) {
388
  .form-grid {
 
393
  padding: 20px;
394
  }
395
  }
396
+
static/js/app.js CHANGED
@@ -38,6 +38,23 @@ document.addEventListener("DOMContentLoaded", () => {
38
  customWrapper.style.display = 'none';
39
  }
40
  }
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
41
  });
42
  });
43
 
@@ -49,6 +66,33 @@ document.addEventListener("DOMContentLoaded", () => {
49
  }
50
  });
51
  });
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
52
  });
53
 
54
  // Handles multiple file upload change event
@@ -108,6 +152,17 @@ function updateHiddenInputs() {
108
  document.getElementById("selected_style").value = getSelectedText("style");
109
  document.getElementById("selected_font").value = getSelectedText("font");
110
  document.getElementById("selected_ocr").value = getSelectedText("ocr");
 
 
 
 
 
 
 
 
 
 
 
111
 
112
  // Check if files are selected
113
  const files = document.getElementById('file-upload').files;
 
38
  customWrapper.style.display = 'none';
39
  }
40
  }
41
+
42
+ // Show/hide translator-specific settings
43
+ if (selectBox.id === 'translator') {
44
+ const copilotSettings = document.getElementById('copilot-settings');
45
+ const geminiSettings = document.getElementById('gemini-settings');
46
+
47
+ if (option.textContent === 'Copilot') {
48
+ copilotSettings.style.display = 'block';
49
+ geminiSettings.style.display = 'none';
50
+ } else if (option.textContent === 'Gemini') {
51
+ copilotSettings.style.display = 'none';
52
+ geminiSettings.style.display = 'block';
53
+ } else {
54
+ copilotSettings.style.display = 'none';
55
+ geminiSettings.style.display = 'none';
56
+ }
57
+ }
58
  });
59
  });
60
 
 
66
  }
67
  });
68
  });
69
+
70
+ // Load saved Gemini API key from localStorage
71
+ const geminiKeyInput = document.getElementById('gemini_api_key');
72
+ if (geminiKeyInput) {
73
+ const savedKey = localStorage.getItem('gemini_api_key');
74
+ if (savedKey) {
75
+ geminiKeyInput.value = savedKey;
76
+ }
77
+
78
+ // Save to localStorage on input change
79
+ geminiKeyInput.addEventListener('input', () => {
80
+ localStorage.setItem('gemini_api_key', geminiKeyInput.value);
81
+ });
82
+ }
83
+
84
+ // Load saved Copilot server URL from localStorage
85
+ const copilotServerInput = document.getElementById('copilot_server');
86
+ if (copilotServerInput) {
87
+ const savedServer = localStorage.getItem('copilot_server');
88
+ if (savedServer) {
89
+ copilotServerInput.value = savedServer;
90
+ }
91
+
92
+ copilotServerInput.addEventListener('input', () => {
93
+ localStorage.setItem('copilot_server', copilotServerInput.value);
94
+ });
95
+ }
96
  });
97
 
98
  // Handles multiple file upload change event
 
152
  document.getElementById("selected_style").value = getSelectedText("style");
153
  document.getElementById("selected_font").value = getSelectedText("font");
154
  document.getElementById("selected_ocr").value = getSelectedText("ocr");
155
+ document.getElementById("selected_copilot_model").value = getSelectedText("copilot_model");
156
+
157
+ // Validate Gemini API key if Gemini is selected
158
+ const translator = getSelectedText("translator");
159
+ if (translator === 'Gemini') {
160
+ const apiKey = document.getElementById('gemini_api_key').value;
161
+ if (!apiKey || apiKey.trim() === '') {
162
+ alert('Vui lòng nhập Gemini API Key!');
163
+ return false;
164
+ }
165
+ }
166
 
167
  // Check if files are selected
168
  const files = document.getElementById('file-upload').files;
templates/index.html CHANGED
@@ -70,6 +70,7 @@
70
  </div>
71
  <div class="options">
72
  <span class="option">Gemini</span>
 
73
  <span class="option">Google</span>
74
  <span class="option">NLLB</span>
75
  <span class="option">Baidu</span>
@@ -107,9 +108,31 @@
107
  <span class="icon">&#9660;</span>
108
  </div>
109
  <div class="options">
 
110
  <span class="option">Animeace</span>
111
  <span class="option">Mangat</span>
112
  <span class="option">Arial</span>
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
113
  </div>
114
  </div>
115
  </div>
@@ -129,6 +152,15 @@
129
  </div>
130
  </div>
131
 
 
 
 
 
 
 
 
 
 
132
  <!-- Custom Prompt (show when Custom selected) -->
133
  <div class="select-wrapper full-width" id="custom-prompt-wrapper" style="display: none;">
134
  <label class="translator-label">Custom Prompt</label>
@@ -136,6 +168,67 @@
136
  placeholder="Ví dụ: Dịch theo phong cách light novel, giữ nguyên tên nhân vật..." rows="2"></textarea>
137
  </div>
138
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
139
  <!-- File upload -->
140
  <input id="file-upload" type="file" name="files" accept=".jpg, .jpeg, .png" multiple required>
141
  <label for="file-upload" class="file" id="file-label">
@@ -149,13 +242,68 @@
149
  <input type="hidden" id="selected_style" name="selected_style">
150
  <input type="hidden" id="selected_font" name="selected_font">
151
  <input type="hidden" id="selected_ocr" name="selected_ocr">
 
152
  <button type="submit">Translate</button>
153
  </form>
 
 
 
 
 
 
 
 
 
 
 
 
154
  <img id="loading-img" src="{{ url_for('static', filename='img/loading.gif') }}" alt="">
155
  <p id="loading-p">Đang xử lý... Vui lòng đợi!</p>
156
  </div>
157
 
 
 
158
  <script src="{{ url_for('static', filename='js/app.js') }}"></script>
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
159
  </body>
160
 
161
  </html>
 
70
  </div>
71
  <div class="options">
72
  <span class="option">Gemini</span>
73
+ <span class="option">Copilot</span>
74
  <span class="option">Google</span>
75
  <span class="option">NLLB</span>
76
  <span class="option">Baidu</span>
 
108
  <span class="icon">&#9660;</span>
109
  </div>
110
  <div class="options">
111
+ <span class="option">Auto (Match Original)</span>
112
  <span class="option">Animeace</span>
113
  <span class="option">Mangat</span>
114
  <span class="option">Arial</span>
115
+ <span class="option">Yuki-Arenzi</span>
116
+ <span class="option">Yuki-Burobu</span>
117
+ <span class="option">Yuki-CCMarianChurchlandJournal</span>
118
+ <span class="option">Yuki-CDX Starstreak</span>
119
+ <span class="option">Yuki-CHICKEN Pie</span>
120
+ <span class="option">Yuki-CrashLanding BB</span>
121
+ <span class="option">Yuki-Downhill Dive</span>
122
+ <span class="option">Yuki-Gingerline DEMO Regular</span>
123
+ <span class="option">Yuki-Gorrilaz_Story</span>
124
+ <span class="option">Yuki-KG Only Angel</span>
125
+ <span class="option">Yuki-LF SwandsHand</span>
126
+ <span class="option">Yuki-La Belle Aurore</span>
127
+ <span class="option">Yuki-Little Cupcakes</span>
128
+ <span class="option">Yuki-Nagurigaki Crayon</span>
129
+ <span class="option">Yuki-Ripsnort BB</span>
130
+ <span class="option">Yuki-Roasthink</span>
131
+ <span class="option">Yuki-Screwball</span>
132
+ <span class="option">Yuki-Shark Crash</span>
133
+ <span class="option">Yuki-Skulduggery</span>
134
+ <span class="option">Yuki-Superscratchy</span>
135
+ <span class="option">Yuki-Tea And Oranges Regular</span>
136
  </div>
137
  </div>
138
  </div>
 
152
  </div>
153
  </div>
154
 
155
+ <!-- Context Memory Toggle -->
156
+ <div class="select-wrapper full-width" style="margin-top: 10px;">
157
+ <label class="toggle-container">
158
+ <input type="checkbox" id="context_memory" name="context_memory" checked>
159
+ <span class="toggle-slider"></span>
160
+ <span class="toggle-label">🧠 Context Memory (dùng context từ tất cả ảnh để dịch chính xác hơn)</span>
161
+ </label>
162
+ </div>
163
+
164
  <!-- Custom Prompt (show when Custom selected) -->
165
  <div class="select-wrapper full-width" id="custom-prompt-wrapper" style="display: none;">
166
  <label class="translator-label">Custom Prompt</label>
 
168
  placeholder="Ví dụ: Dịch theo phong cách light novel, giữ nguyên tên nhân vật..." rows="2"></textarea>
169
  </div>
170
 
171
+ <!-- Copilot Settings (show when Copilot selected) -->
172
+ <div id="copilot-settings" style="display: none; width: 100%;">
173
+ <div class="form-grid">
174
+ <div class="select-wrapper">
175
+ <label class="translator-label">Copilot Server URL</label>
176
+ <input type="text" id="copilot_server" name="copilot_server" value="http://localhost:8080"
177
+ placeholder="http://localhost:8080"
178
+ style="width: 100%; padding: 10px 14px; border: 1px solid #ddd; border-radius: 8px; font-size: 14px;">
179
+ </div>
180
+ <div class="select-wrapper">
181
+ <label class="translator-label">Model</label>
182
+ <div class="custom-select" id="copilot_model" tabindex="0">
183
+ <div class="select-box">
184
+ <span class="selected"></span>
185
+ <span class="icon">&#9660;</span>
186
+ </div>
187
+ <div class="options">
188
+ <!-- ⭐ FREE Unlimited Models -->
189
+ <span class="option">gpt-4.1</span>
190
+ <span class="option">gpt-4o</span>
191
+ <span class="option">gpt-5-mini</span>
192
+ <span class="option">grok-code-fast-1</span>
193
+ <span class="option">oswe-vscode-prime</span>
194
+ <!-- Other Models -->
195
+ <span class="option">gpt-5</span>
196
+ <span class="option">gpt-5.1</span>
197
+ <span class="option">gpt-5.1-codex</span>
198
+ <span class="option">gpt-5.1-codex-mini</span>
199
+ <span class="option">gpt-5.1-codex-max</span>
200
+ <span class="option">gpt-5-codex</span>
201
+ <span class="option">gpt-41-copilot</span>
202
+ <span class="option">gpt-4o-mini</span>
203
+ <span class="option">gpt-4o-2024-11-20</span>
204
+ <span class="option">gpt-4</span>
205
+ <span class="option">gpt-4-0125-preview</span>
206
+ <span class="option">gpt-3.5-turbo</span>
207
+ <span class="option">claude-sonnet-4.5</span>
208
+ <span class="option">claude-sonnet-4</span>
209
+ <span class="option">claude-opus-4.5</span>
210
+ <span class="option">claude-haiku-4.5</span>
211
+ <span class="option">gemini-3-pro-preview</span>
212
+ <span class="option">gemini-2.5-pro</span>
213
+ </div>
214
+ </div>
215
+ </div>
216
+ </div>
217
+ </div>
218
+
219
+ <!-- Gemini Settings (show when Gemini selected) -->
220
+ <div id="gemini-settings" style="display: block; width: 100%;">
221
+ <div class="select-wrapper">
222
+ <label class="translator-label">Gemini API Key</label>
223
+ <input type="password" id="gemini_api_key" name="gemini_api_key"
224
+ placeholder="Nhập API key của bạn (lấy từ ai.google.dev)"
225
+ style="width: 100%; padding: 10px 14px; border: 1px solid #ddd; border-radius: 8px; font-size: 14px;">
226
+ <small style="color: #666; font-size: 12px; margin-top: 4px; display: block;">
227
+ 🔒 Key được lưu trong trình duyệt của bạn (localStorage)
228
+ </small>
229
+ </div>
230
+ </div>
231
+
232
  <!-- File upload -->
233
  <input id="file-upload" type="file" name="files" accept=".jpg, .jpeg, .png" multiple required>
234
  <label for="file-upload" class="file" id="file-label">
 
242
  <input type="hidden" id="selected_style" name="selected_style">
243
  <input type="hidden" id="selected_font" name="selected_font">
244
  <input type="hidden" id="selected_ocr" name="selected_ocr">
245
+ <input type="hidden" id="selected_copilot_model" name="selected_copilot_model">
246
  <button type="submit">Translate</button>
247
  </form>
248
+
249
+ <!-- Progress Bar -->
250
+ <div id="progress-container" style="display: none; margin-top: 20px;">
251
+ <div id="progress-phase" style="font-size: 12px; color: #666; margin-bottom: 5px; text-align: center;"></div>
252
+ <div style="background: #e0e0e0; border-radius: 10px; overflow: hidden; height: 20px;">
253
+ <div id="progress-bar"
254
+ style="height: 100%; background: linear-gradient(90deg, #5E1675, #8e44ad); width: 0%; transition: width 0.3s ease;">
255
+ </div>
256
+ </div>
257
+ <div id="progress-text" style="font-size: 13px; color: #333; margin-top: 8px; text-align: center;"></div>
258
+ </div>
259
+
260
  <img id="loading-img" src="{{ url_for('static', filename='img/loading.gif') }}" alt="">
261
  <p id="loading-p">Đang xử lý... Vui lòng đợi!</p>
262
  </div>
263
 
264
+ <!-- Socket.IO for real-time progress -->
265
+ <script src="https://cdnjs.cloudflare.com/ajax/libs/socket.io/4.7.2/socket.io.min.js"></script>
266
  <script src="{{ url_for('static', filename='js/app.js') }}"></script>
267
+ <script>
268
+ // Real-time progress updates
269
+ document.addEventListener('DOMContentLoaded', function () {
270
+ const socket = io();
271
+ const progressContainer = document.getElementById('progress-container');
272
+ const progressBar = document.getElementById('progress-bar');
273
+ const progressText = document.getElementById('progress-text');
274
+ const progressPhase = document.getElementById('progress-phase');
275
+
276
+ const phaseNames = {
277
+ 'detection': '🔍 Phát hiện bubbles',
278
+ 'ocr': '📖 OCR nhận dạng text',
279
+ 'translation': '🌐 Dịch văn bản',
280
+ 'rendering': '✏️ Render text vào ảnh',
281
+ 'done': '✅ Hoàn tất'
282
+ };
283
+
284
+ socket.on('progress', function (data) {
285
+ progressContainer.style.display = 'block';
286
+
287
+ const phaseName = phaseNames[data.phase] || data.phase;
288
+ progressPhase.textContent = phaseName;
289
+ progressBar.style.width = data.percent + '%';
290
+ progressText.textContent = data.message;
291
+
292
+ if (data.phase === 'done') {
293
+ progressBar.style.background = 'linear-gradient(90deg, #50C878, #2ecc71)';
294
+ }
295
+ });
296
+
297
+ // Show progress when form submitted
298
+ document.querySelector('form').addEventListener('submit', function () {
299
+ progressContainer.style.display = 'block';
300
+ progressBar.style.width = '0%';
301
+ progressBar.style.background = 'linear-gradient(90deg, #5E1675, #8e44ad)';
302
+ progressText.textContent = 'Khởi tạo...';
303
+ progressPhase.textContent = '⏳ Chuẩn bị';
304
+ });
305
+ });
306
+ </script>
307
  </body>
308
 
309
  </html>
templates/translate.html CHANGED
@@ -36,10 +36,15 @@
36
  </div>
37
 
38
  <div class="buttons_image">
39
- <a href="#" class="green" id="download-all">📦 Download All</a>
40
  <a href="/" class="red">← Quay lại</a>
41
  </div>
42
 
 
 
 
 
 
43
  </body>
44
  <script>
45
  // Download single image
@@ -55,14 +60,22 @@
55
  });
56
  });
57
 
58
- // Download all images
59
- document.getElementById('download-all').addEventListener('click', (e) => {
60
  e.preventDefault();
61
- document.querySelectorAll('.download-btn').forEach((btn, index) => {
62
- setTimeout(() => {
63
- btn.click();
64
- }, index * 300); // Delay between downloads
 
 
 
 
65
  });
 
 
 
 
66
  });
67
  </script>
68
 
 
36
  </div>
37
 
38
  <div class="buttons_image">
39
+ <a href="#" class="green" id="download-zip">📦 Download ZIP</a>
40
  <a href="/" class="red">← Quay lại</a>
41
  </div>
42
 
43
+ <!-- Hidden form for ZIP download -->
44
+ <form id="zip-form" action="/download-zip" method="POST" style="display: none;">
45
+ <input type="hidden" name="images_data" id="images-data">
46
+ </form>
47
+
48
  </body>
49
  <script>
50
  // Download single image
 
60
  });
61
  });
62
 
63
+ // Download all images as ZIP
64
+ document.getElementById('download-zip').addEventListener('click', (e) => {
65
  e.preventDefault();
66
+
67
+ // Collect all images data
68
+ const images = [];
69
+ document.querySelectorAll('.download-btn').forEach(btn => {
70
+ images.push({
71
+ name: btn.getAttribute('data-name'),
72
+ data: btn.getAttribute('data-image')
73
+ });
74
  });
75
+
76
+ // Submit form with images data
77
+ document.getElementById('images-data').value = JSON.stringify(images);
78
+ document.getElementById('zip-form').submit();
79
  });
80
  </script>
81
 
translator/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (345 Bytes). View file
 
translator/__pycache__/copilot_translator.cpython-311.pyc ADDED
Binary file (16.2 kB). View file
 
translator/__pycache__/gemini_translator.cpython-311.pyc CHANGED
Binary files a/translator/__pycache__/gemini_translator.cpython-311.pyc and b/translator/__pycache__/gemini_translator.cpython-311.pyc differ
 
translator/__pycache__/translator.cpython-311.pyc CHANGED
Binary files a/translator/__pycache__/translator.cpython-311.pyc and b/translator/__pycache__/translator.cpython-311.pyc differ
 
translator/copilot_translator.py ADDED
@@ -0,0 +1,351 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Copilot API Translator
3
+ Uses copilot-api proxy server (OpenAI-compatible endpoint)
4
+ https://github.com/ericc-ch/copilot-api
5
+ """
6
+ import requests
7
+ import json
8
+ from typing import List
9
+
10
+
11
class CopilotTranslator:
    """
    Translator using Copilot API proxy server.

    Communicates via the OpenAI-compatible /v1/chat/completions endpoint
    (https://github.com/ericc-ch/copilot-api). All translate_* methods are
    best-effort: on API failure they return the original text(s) instead of
    raising, so a translation pipeline never crashes mid-run.
    """

    # Human-readable language names used inside prompts; unknown codes fall
    # back to Japanese/English in the individual methods.
    LANG_NAMES = {
        "ja": "Japanese",
        "zh": "Chinese",
        "ko": "Korean",
        "en": "English",
        "vi": "Vietnamese",
        "th": "Thai",
        "id": "Indonesian",
        "fr": "French",
        "de": "German",
        "es": "Spanish",
        "ru": "Russian"
    }

    # Available models (from Copilot API). Used as a static fallback when the
    # server's /v1/models endpoint cannot be reached.
    MODELS = [
        # GPT-5 Series
        "gpt-5",
        "gpt-5-mini",
        "gpt-5.1",
        "gpt-5.1-codex",
        "gpt-5.1-codex-mini",
        "gpt-5.1-codex-max",
        "gpt-5-codex",
        # GPT-4.1 Series
        "gpt-4.1",
        "gpt-41-copilot",
        # GPT-4o Series
        "gpt-4o",
        "gpt-4o-mini",
        "gpt-4o-2024-11-20",
        # GPT-4 Series
        "gpt-4",
        "gpt-4-0125-preview",
        # GPT-3.5
        "gpt-3.5-turbo",
        # Claude Series
        "claude-sonnet-4.5",
        "claude-sonnet-4",
        "claude-opus-4.5",
        "claude-haiku-4.5",
        # Gemini
        "gemini-3-pro-preview",
        "gemini-2.5-pro",
        # Other
        "grok-code-fast-1",
    ]

    def __init__(self, server_url: str = "http://localhost:8080", model: str = "gpt-4o"):
        """
        Initialize Copilot translator.

        Args:
            server_url: Copilot API proxy server URL (e.g., http://localhost:8080)
            model: Model to use (e.g., gpt-4o, claude-3.5-sonnet)
        """
        self.base_url = server_url.rstrip("/")
        self.model = model
        self.endpoint = f"{self.base_url}/v1/chat/completions"

    @staticmethod
    def _strip_code_fences(text: str) -> str:
        """
        Remove a surrounding markdown code fence from a model reply.

        LLMs frequently wrap JSON answers in ```json ... ``` fences even when
        told not to; normalize the reply before handing it to json.loads().
        """
        if text.startswith("```json"):
            text = text[7:]
        if text.startswith("```"):
            text = text[3:]
        if text.endswith("```"):
            text = text[:-3]
        return text.strip()

    def _chat(self, prompt: str, timeout: int) -> str:
        """
        Send a single-user-message chat completion and return the reply text.

        Raises requests/HTTP/parse errors on failure; callers implement their
        own fallback behavior.
        """
        response = requests.post(
            self.endpoint,
            json={
                "model": self.model,
                "messages": [{"role": "user", "content": prompt}],
                # Low temperature: translations should be stable, not creative.
                "temperature": 0.3,
            },
            timeout=timeout
        )
        response.raise_for_status()
        result = response.json()
        return result["choices"][0]["message"]["content"].strip()

    def translate_single(self, text: str, source: str = "ja", target: str = "en") -> str:
        """Translate a single text string; returns the input unchanged on error."""
        if not text or not text.strip():
            return text

        source_name = self.LANG_NAMES.get(source, "Japanese")
        target_name = self.LANG_NAMES.get(target, "English")

        prompt = f"""You are an expert manga/comic translator. Translate the following {source_name} text to {target_name}.

Rules:
- Translate for SPOKEN dialogue, natural when read aloud
- Preserve tone, emotion, and personality
- For Vietnamese: use appropriate pronouns based on context
- Return ONLY the translated text, nothing else

Text: {text}"""

        try:
            return self._chat(prompt, timeout=30)
        except Exception as e:
            # Best-effort: keep the original text so the pipeline continues.
            print(f"Copilot translation error: {e}")
            return text

    def translate_batch(self, texts: List[str], source: str = "ja", target: str = "en") -> List[str]:
        """
        Translate multiple texts in a single API call.

        Args:
            texts: List of texts to translate
            source: Source language code
            target: Target language code

        Returns:
            List of translated texts (same order). Empty/blank entries are
            preserved untouched; on failure falls back to per-text calls.
        """
        if not texts:
            return []

        # Filter empty texts but remember their positions so the output list
        # lines up with the input.
        indexed_texts = [(i, t) for i, t in enumerate(texts) if t and t.strip()]
        if not indexed_texts:
            return texts

        texts_to_translate = [t for _, t in indexed_texts]

        source_name = self.LANG_NAMES.get(source, "Japanese")
        target_name = self.LANG_NAMES.get(target, "English")

        prompt = f"""You are an expert manga/comic translator. Translate the following {source_name} texts to {target_name}.

Rules:
- These are speech bubble texts from the SAME comic page - maintain consistency
- Translate for SPOKEN dialogue, natural when read aloud
- Preserve tone, emotion, and personality
- For Vietnamese: use appropriate pronouns based on context
- Keep short lines impactful

Input (JSON array of texts):
{json.dumps(texts_to_translate, ensure_ascii=False)}

Return ONLY a JSON array with translated texts in the EXACT same order.
Example: ["translation 1", "translation 2"]"""

        try:
            result_text = self._strip_code_fences(self._chat(prompt, timeout=60))
            translations = json.loads(result_text)

            # Validate length: pad with originals / truncate so positions
            # still line up with the input list.
            if len(translations) != len(texts_to_translate):
                print(f"Warning: Expected {len(texts_to_translate)} translations, got {len(translations)}")
                while len(translations) < len(texts_to_translate):
                    translations.append(texts_to_translate[len(translations)])
                translations = translations[:len(texts_to_translate)]

            # Rebuild the full list, keeping blank entries in place.
            result_list = list(texts)
            for (orig_idx, _), trans in zip(indexed_texts, translations):
                result_list[orig_idx] = trans

            return result_list

        except Exception as e:
            print(f"Copilot batch translation error: {e}")
            # Fallback to single translations
            return [self.translate_single(t, source, target) for t in texts]

    def translate_pages_batch(
        self,
        pages_texts: dict,
        source: str = "ja",
        target: str = "en",
        context: dict = None
    ) -> dict:
        """
        Translate texts from multiple pages in a single API call.
        Ideal for batch processing 10+ manga pages at once.

        Args:
            pages_texts: Dict mapping page names to list of texts
                         e.g., {"page1": ["text1", "text2"], "page2": ["text3"]}
            source: Source language code
            target: Target language code
            context: Optional dict of ALL page texts for context (helps maintain consistency)

        Returns:
            Dict with same structure but translated texts
        """
        if not pages_texts:
            return {}

        source_name = self.LANG_NAMES.get(source, "Japanese")
        target_name = self.LANG_NAMES.get(target, "English")

        # Build context section if context is provided (pages outside this
        # batch give the model the surrounding story for consistency).
        context_section = ""
        if context and context != pages_texts:
            other_pages = {k: v for k, v in context.items() if k not in pages_texts}
            if other_pages:
                context_preview = []
                for page, texts in list(other_pages.items())[:5]:  # First 5 pages for context
                    context_preview.append(f"{page}: {' | '.join(texts[:3])}...")
                context_section = f"""
STORY CONTEXT (from other pages in this batch - use for character/tone consistency):
{chr(10).join(context_preview)}
---
"""

        prompt = f"""You are an expert manga/comic translator. Translate the following {source_name} texts to {target_name}.
{context_section}
Context: These are SEQUENTIAL comic pages telling a continuous story. Maintain narrative flow and character voice consistency across all pages.

Rules:
- Translate for SPOKEN dialogue - it must sound natural when read aloud
- Each character should have a consistent voice/speaking style across pages
- Preserve tone, emotion, and personality through careful word choice
- For Vietnamese: Choose appropriate pronouns based on character relationships
- Keep short lines impactful. Don't pad or over-explain.

Input (JSON - sequential pages with their speech bubbles):
{json.dumps(pages_texts, ensure_ascii=False, indent=2)}

IMPORTANT: Return ONLY a valid JSON object with the exact same structure but with translated texts.
Keep page names and bubble order exactly the same. No explanations or markdown."""

        try:
            # Longer timeout for multi-page batch.
            result_text = self._strip_code_fences(self._chat(prompt, timeout=120))
            translated = json.loads(result_text)
            print(f"✓ Translated {len(pages_texts)} pages in single batch")
            return translated

        except Exception as e:
            print(f"Copilot pages batch translation error: {e}")
            # Fallback: translate each page separately
            result = {}
            for page_name, texts in pages_texts.items():
                result[page_name] = self.translate_batch(texts, source, target)
            return result

    def test_connection(self) -> bool:
        """Test if the server is reachable."""
        try:
            response = requests.get(f"{self.base_url}/v1/models", timeout=5)
            return response.status_code == 200
        except requests.RequestException:
            # Narrow except: a bare `except:` would also swallow KeyboardInterrupt.
            return False

    def get_available_models(self) -> List[str]:
        """Get list of available models from server, or the static default list."""
        try:
            response = requests.get(f"{self.base_url}/v1/models", timeout=5)
            if response.status_code == 200:
                data = response.json()
                return [m["id"] for m in data.get("data", [])]
        except (requests.RequestException, ValueError, KeyError):
            # ValueError covers invalid JSON; KeyError covers a malformed entry.
            pass
        return self.MODELS  # Return default list
+
310
+
311
+ def translate_manga_pages_batch(
312
+ pages_texts: dict,
313
+ server_url: str = "http://localhost:8080",
314
+ model: str = "gpt-4o",
315
+ source_lang: str = "ja",
316
+ target_lang: str = "en",
317
+ batch_size: int = 10
318
+ ) -> dict:
319
+ """
320
+ Translate manga pages in batches.
321
+
322
+ Args:
323
+ pages_texts: All pages' texts {page_name: [texts]}
324
+ server_url: Copilot API server URL
325
+ model: Model to use
326
+ source_lang: Source language code
327
+ target_lang: Target language code
328
+ batch_size: Number of pages per API call (default: 10)
329
+
330
+ Returns:
331
+ All translated texts
332
+ """
333
+ translator = CopilotTranslator(server_url=server_url, model=model)
334
+
335
+ page_names = list(pages_texts.keys())
336
+ all_results = {}
337
+
338
+ # Process in batches
339
+ for i in range(0, len(page_names), batch_size):
340
+ batch_pages = page_names[i:i + batch_size]
341
+ batch_texts = {name: pages_texts[name] for name in batch_pages}
342
+
343
+ print(f"Translating pages {i+1} to {min(i+batch_size, len(page_names))}...")
344
+ batch_results = translator.translate_pages_batch(
345
+ batch_texts,
346
+ source=source_lang,
347
+ target=target_lang
348
+ )
349
+ all_results.update(batch_results)
350
+
351
+ return all_results
translator/gemini_translator.py CHANGED
@@ -6,8 +6,13 @@ Supports multiple source languages and custom prompts
6
  import google.generativeai as genai
7
  import json
8
  import os
 
9
  from typing import List, Dict, Optional
10
 
 
 
 
 
11
 
12
  class GeminiTranslator:
13
  """
@@ -32,13 +37,13 @@ class GeminiTranslator:
32
  # Preset style templates
33
  STYLE_PRESETS = {
34
  "default": "",
35
- "formal": "Use formal language and polite expressions.",
36
- "casual": "Use casual, friendly language like talking to friends.",
37
- "keep_honorifics": "Keep Japanese honorifics like -san, -kun, -chan, -sama, senpai, sensei.",
38
- "localize": "Fully localize the text, replace cultural references with equivalent ones in target language.",
39
- "literal": "Translate as literally as possible while maintaining readability.",
40
- "web_novel": "Use web novel translation style with dramatic expressions.",
41
- "action": "Use punchy, short sentences suitable for action scenes.",
42
  }
43
 
44
  def __init__(self, api_key: str = None, custom_prompt: str = None, style: str = "default"):
@@ -97,11 +102,19 @@ class GeminiTranslator:
97
  style = custom_prompt or self.custom_prompt
98
  style_text = f"\nStyle: {style}" if style else ""
99
 
100
- prompt = f"""Translate the following {source_name} comic/manga text to {target_name}.
101
- Keep the translation natural and suitable for comic dialogue.{style_text}
102
- Only return the translated text, nothing else.
 
 
 
 
 
 
 
 
103
 
104
- Text: {text}"""
105
 
106
  try:
107
  response = self.model.generate_content(prompt)
@@ -118,7 +131,7 @@ Text: {text}"""
118
  custom_prompt: str = None
119
  ) -> List[str]:
120
  """
121
- Translate multiple texts in a single API call.
122
 
123
  Args:
124
  texts: List of texts to translate
@@ -138,55 +151,101 @@ Text: {text}"""
138
  if not indexed_texts:
139
  return texts
140
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
141
  source_name = self.LANG_NAMES.get(source, "Japanese")
142
  target_name = self.LANG_NAMES.get(target, "English")
143
- texts_to_translate = [t for _, t in indexed_texts]
144
 
145
  style = custom_prompt or self.custom_prompt
146
  style_text = f"\nStyle instructions: {style}" if style else ""
147
 
148
- prompt = f"""You are a professional comic/manga translator. Translate the following {source_name} texts to {target_name}.
149
- Keep translations natural and suitable for comic speech bubbles.{style_text}
150
 
151
- Input texts (JSON array):
 
 
 
 
 
 
 
 
 
 
 
 
152
  {json.dumps(texts_to_translate, ensure_ascii=False)}
153
 
154
- IMPORTANT: Return ONLY a JSON array with translated texts in the same order. No explanations.
155
- Example output format: ["translated text 1", "translated text 2", ...]"""
156
 
157
- try:
158
- response = self.model.generate_content(prompt)
159
- result_text = response.text.strip()
160
-
161
- # Clean up response if needed
162
- if result_text.startswith("```json"):
163
- result_text = result_text[7:]
164
- if result_text.startswith("```"):
165
- result_text = result_text[3:]
166
- if result_text.endswith("```"):
167
- result_text = result_text[:-3]
168
- result_text = result_text.strip()
169
-
170
- translations = json.loads(result_text)
171
-
172
- # Rebuild full list with original empty strings preserved
173
- result = list(texts)
174
- for (orig_idx, _), trans in zip(indexed_texts, translations):
175
- result[orig_idx] = trans
176
 
177
- return result
178
-
179
- except Exception as e:
180
- print(f"Gemini batch translation error: {e}")
181
- # Fallback to single translations
182
- return [self.translate_single(t, source, target) for t in texts]
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
183
 
184
  def translate_pages_batch(
185
  self,
186
  pages_texts: Dict[str, List[str]],
187
  source: str = "ja",
188
  target: str = "en",
189
- custom_prompt: str = None
 
190
  ) -> Dict[str, List[str]]:
191
  """
192
  Translate texts from multiple pages in a single API call.
@@ -197,6 +256,7 @@ Example output format: ["translated text 1", "translated text 2", ...]"""
197
  source: Source language code
198
  target: Target language code
199
  custom_prompt: Override custom prompt for this call
 
200
 
201
  Returns:
202
  Dict with same structure but translated texts
@@ -210,15 +270,41 @@ Example output format: ["translated text 1", "translated text 2", ...]"""
210
  style = custom_prompt or self.custom_prompt
211
  style_text = f"\nStyle instructions: {style}" if style else ""
212
 
213
- prompt = f"""You are a professional comic/manga translator. Translate all {source_name} texts to {target_name}.
214
- Keep translations natural, conversational, and suitable for comic speech bubbles.
215
- Maintain the context and flow between pages as they are sequential comic pages.{style_text}
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
216
 
217
- Input (JSON - page names with their text bubbles):
218
  {json.dumps(pages_texts, ensure_ascii=False, indent=2)}
219
 
220
- IMPORTANT: Return ONLY a JSON object with the exact same structure but with translated texts.
221
- Keep the same page names and order. No explanations or markdown."""
222
 
223
  try:
224
  response = self.model.generate_content(prompt)
 
6
  import google.generativeai as genai
7
  import json
8
  import os
9
+ import time
10
  from typing import List, Dict, Optional
11
 
12
+ # Constants for retry logic
13
+ MAX_RETRIES = 3
14
+ RETRY_DELAY_BASE = 0.5 # Faster recovery: 0.5s → 1s → 2s
15
+
16
 
17
  class GeminiTranslator:
18
  """
 
37
  # Preset style templates
38
  STYLE_PRESETS = {
39
  "default": "",
40
+ "formal": "Use formal, polite language. Use respectful pronouns and expressions.",
41
+ "casual": "Use casual, natural everyday language. Like friends talking to each other.",
42
+ "keep_honorifics": "Keep Japanese honorifics like -san, -kun, -chan, -sama, senpai, sensei untranslated.",
43
+ "localize": "Fully localize cultural references. Adapt idioms and expressions to feel native.",
44
+ "literal": "Translate meaning accurately but ensure it still sounds natural when spoken.",
45
+ "web_novel": "Use dramatic web novel style with impactful expressions and emotional weight.",
46
+ "action": "Use short, punchy sentences. Quick pace. Impactful dialogue.",
47
  }
48
 
49
  def __init__(self, api_key: str = None, custom_prompt: str = None, style: str = "default"):
 
102
  style = custom_prompt or self.custom_prompt
103
  style_text = f"\nStyle: {style}" if style else ""
104
 
105
+ prompt = f"""You are an expert manga/comic translator specializing in {source_name} to {target_name} translation.
106
+
107
+ Translation Guidelines:
108
+ - Translate for SPOKEN dialogue, not written text. It should sound natural when read aloud.
109
+ - Preserve the character's tone, emotion, and personality through word choice.
110
+ - Use natural sentence structures in {target_name}. Avoid awkward literal translations.
111
+ - For Vietnamese: Use appropriate pronouns (tao/mày for close friends, tôi/anh/em for normal, etc.) based on context.
112
+ - Keep exclamations and emotional expressions feeling authentic.
113
+ - Maintain the impact and rhythm of short/punchy lines.{style_text}
114
+
115
+ IMPORTANT: Return ONLY the translated text. No explanations, no quotes, no formatting.
116
 
117
+ Original text: {text}"""
118
 
119
  try:
120
  response = self.model.generate_content(prompt)
 
131
  custom_prompt: str = None
132
  ) -> List[str]:
133
  """
134
+ Translate multiple texts in a single API call with retry logic.
135
 
136
  Args:
137
  texts: List of texts to translate
 
151
  if not indexed_texts:
152
  return texts
153
 
154
+ texts_to_translate = [t for _, t in indexed_texts]
155
+ translations = self._translate_batch_internal(texts_to_translate, source, target, custom_prompt)
156
+
157
+ # Rebuild full list with original empty strings preserved
158
+ result = list(texts)
159
+ for (orig_idx, _), trans in zip(indexed_texts, translations):
160
+ result[orig_idx] = trans
161
+
162
+ return result
163
+
164
+ def _translate_batch_internal(
165
+ self,
166
+ texts_to_translate: List[str],
167
+ source: str,
168
+ target: str,
169
+ custom_prompt: str = None
170
+ ) -> List[str]:
171
+ """Internal method to translate a single chunk with retry logic."""
172
  source_name = self.LANG_NAMES.get(source, "Japanese")
173
  target_name = self.LANG_NAMES.get(target, "English")
 
174
 
175
  style = custom_prompt or self.custom_prompt
176
  style_text = f"\nStyle instructions: {style}" if style else ""
177
 
178
+ prompt = f"""You are an expert manga/comic translator with years of experience in {source_name} to {target_name} translation.
 
179
 
180
+ Translation Guidelines:
181
+ - These are speech bubble texts from the SAME comic page - maintain consistency in character voices.
182
+ - Translate for SPOKEN dialogue. It must sound natural when read aloud, not stiff or robotic.
183
+ - Preserve each character's tone, emotion, and personality through appropriate word choice.
184
+ - Use natural {target_name} sentence structures. AVOID awkward literal word-for-word translations.
185
+ - For Vietnamese specifically:
186
+ + Use appropriate pronouns based on relationship (tao/mày, tôi/cậu, anh/em, etc.)
187
+ + Translate exclamations naturally (くそ → Chết tiệt, やばい → Chết rồi, etc.)
188
+ + Keep dialogue feeling authentic to how Vietnamese people actually speak
189
+ - Maintain the impact of short/punchy lines. Don't over-explain.
190
+ - Keep emotional expressions and interjections feeling authentic.{style_text}
191
+
192
+ Input texts (JSON array - each is a separate speech bubble):
193
  {json.dumps(texts_to_translate, ensure_ascii=False)}
194
 
195
+ IMPORTANT: Return ONLY a valid JSON array with translated texts in the EXACT same order.
196
+ Format: ["translation 1", "translation 2", ...]"""
197
 
198
+ # Retry with exponential backoff
199
+ for attempt in range(MAX_RETRIES):
200
+ try:
201
+ response = self.model.generate_content(prompt)
202
+ result_text = response.text.strip()
 
 
 
 
 
 
 
 
 
 
 
 
 
 
203
 
204
+ # Clean up response if needed
205
+ if result_text.startswith("```json"):
206
+ result_text = result_text[7:]
207
+ if result_text.startswith("```"):
208
+ result_text = result_text[3:]
209
+ if result_text.endswith("```"):
210
+ result_text = result_text[:-3]
211
+ result_text = result_text.strip()
212
+
213
+ translations = json.loads(result_text)
214
+
215
+ # Validate response length
216
+ if len(translations) != len(texts_to_translate):
217
+ raise ValueError(f"Expected {len(texts_to_translate)} translations, got {len(translations)}")
218
+
219
+ return translations
220
+
221
+ except Exception as e:
222
+ error_str = str(e)
223
+ print(f"Gemini batch attempt {attempt + 1}/{MAX_RETRIES} failed: {e}")
224
+
225
+ # Check if it's a quota error - don't retry or fallback
226
+ if "429" in error_str or "quota" in error_str.lower():
227
+ print("⚠️ Quota exceeded! Returning original texts to avoid more API calls.")
228
+ print(" Wait 1 minute or upgrade your Gemini API plan.")
229
+ return texts_to_translate # Return original texts
230
+
231
+ if attempt < MAX_RETRIES - 1:
232
+ delay = RETRY_DELAY_BASE * (2 ** attempt)
233
+ print(f"Retrying in {delay}s...")
234
+ time.sleep(delay)
235
+ else:
236
+ # Only fallback to single translations if NOT quota error
237
+ print("All retries failed, falling back to single translations")
238
+ return [self.translate_single(t, source, target) for t in texts_to_translate]
239
+
240
+ return texts_to_translate # Fallback: return original
241
 
242
  def translate_pages_batch(
243
  self,
244
  pages_texts: Dict[str, List[str]],
245
  source: str = "ja",
246
  target: str = "en",
247
+ custom_prompt: str = None,
248
+ context: Dict[str, List[str]] = None
249
  ) -> Dict[str, List[str]]:
250
  """
251
  Translate texts from multiple pages in a single API call.
 
256
  source: Source language code
257
  target: Target language code
258
  custom_prompt: Override custom prompt for this call
259
+ context: Optional dict of ALL page texts for context (helps maintain consistency)
260
 
261
  Returns:
262
  Dict with same structure but translated texts
 
270
  style = custom_prompt or self.custom_prompt
271
  style_text = f"\nStyle instructions: {style}" if style else ""
272
 
273
+ # Build context section if context is provided
274
+ context_section = ""
275
+ if context and context != pages_texts:
276
+ other_pages = {k: v for k, v in context.items() if k not in pages_texts}
277
+ if other_pages:
278
+ context_preview = []
279
+ for page, texts in list(other_pages.items())[:5]:
280
+ context_preview.append(f"{page}: {' | '.join(texts[:3])}...")
281
+ context_section = f"""
282
+ STORY CONTEXT (from other pages - use for character/tone consistency):
283
+ {chr(10).join(context_preview)}
284
+ ---
285
+ """
286
+
287
+ prompt = f"""You are an expert manga/comic translator with deep understanding of {source_name} to {target_name} translation.
288
+ {context_section}
289
+ Context: These are SEQUENTIAL comic pages telling a continuous story. Maintain narrative flow and character voice consistency across all pages.
290
+
291
+ Translation Guidelines:
292
+ - Translate for SPOKEN dialogue - it must sound natural when read aloud.
293
+ - Each character should have a consistent voice/speaking style across pages.
294
+ - Preserve tone, emotion, and personality through careful word choice.
295
+ - Use natural {target_name} sentence structures. NEVER translate word-for-word literally.
296
+ - For Vietnamese:
297
+ + Choose appropriate pronouns based on character relationships and social context
298
+ + Translate interjections and exclamations to feel authentic (not literal)
299
+ + Use natural Vietnamese speech patterns, not textbook Vietnamese
300
+ - Keep short lines impactful. Don't pad or over-explain.
301
+ - Sound effects and onomatopoeia: translate the meaning/feeling, not literally.{style_text}
302
 
303
+ Input (JSON - sequential pages with their speech bubbles):
304
  {json.dumps(pages_texts, ensure_ascii=False, indent=2)}
305
 
306
+ IMPORTANT: Return ONLY a valid JSON object with the exact same structure but with translated texts.
307
+ Keep page names and bubble order exactly the same. No explanations or markdown."""
308
 
309
  try:
310
  response = self.model.generate_content(prompt)
translator/translator.py CHANGED
@@ -150,7 +150,9 @@ class MangaTranslator:
150
  try:
151
  if self._gemini_translator is None:
152
  from .gemini_translator import GeminiTranslator
153
- api_key = self.gemini_api_key or "AIzaSyAplFKOKBEcQku5m6gPEBMlZMGc4sI5rgo"
 
 
154
  custom_prompt = getattr(self, '_gemini_custom_prompt', None)
155
  self._gemini_translator = GeminiTranslator(
156
  api_key=api_key,
 
150
  try:
151
  if self._gemini_translator is None:
152
  from .gemini_translator import GeminiTranslator
153
+ api_key = getattr(self, '_gemini_api_key', None) or self.gemini_api_key
154
+ if not api_key:
155
+ raise ValueError("Gemini API key required. Please enter it in the web form.")
156
  custom_prompt = getattr(self, '_gemini_custom_prompt', None)
157
  self._gemini_translator = GeminiTranslator(
158
  api_key=api_key,