Project Gemini: YouTube Dual-Language Subtitle Backend Service

This document outlines the development guidelines, architecture, and technology stack for the YouTube Dual-Language Subtitle translation service.

1. Core Mission

To create a free, open-source, and self-hosted backend service that provides English-to-Chinese translation for a companion browser extension. The service is designed for efficient deployment on Hugging Face Spaces.

2. Architecture Overview

The project is a standalone Python microservice built with FastAPI. It exposes a single API endpoint that receives English text, translates it with a pre-loaded Hugging Face model, and returns the result. Keeping model inference on a dedicated server means the browser extension stays lightweight and never has to load the model itself.

3. Technology Stack

Backend (Translation Service)

  • Framework: Python with FastAPI (for creating a high-performance API)
  • AI/ML Library: Hugging Face transformers and torch
  • Translation Model: Helsinki-NLP/opus-mt-en-zh (A lightweight, high-quality model for English-to-Chinese translation)
  • Server: Uvicorn
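
The stack above could be wired together roughly as follows. This is a minimal sketch, not the project's actual code: `get_translator` and `translate_text` are hypothetical names, and it assumes the `transformers` `pipeline` API with the `translation` task, whose output is a list of dicts keyed by `translation_text`. The Marian tokenizer behind this model also needs the `sentencepiece` package installed.

```python
from functools import lru_cache
from typing import Callable, Dict, List

# A translator maps one input string to pipeline-style output:
# [{"translation_text": "..."}]
Translator = Callable[[str], List[Dict[str, str]]]

MODEL_NAME = "Helsinki-NLP/opus-mt-en-zh"


@lru_cache(maxsize=1)
def get_translator() -> Translator:
    """Load the model once and reuse it for every request."""
    # Imported lazily so this module can be imported (and unit-tested)
    # without pulling in torch and transformers.
    from transformers import pipeline

    return pipeline("translation", model=MODEL_NAME)


def translate_text(text: str, translator: Translator) -> str:
    """Translate one English string and return the Chinese text."""
    result = translator(text)
    return result[0]["translation_text"]
```

A call such as `translate_text("Hello, world!", get_translator())` downloads the model weights on first use and serves later calls from the cached pipeline.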

4. Deployment & Development on Hugging Face Spaces

Primary Hosting & Version Control

  • Platform: Hugging Face Spaces is used for both hosting the service and the Git repository.
  • Deployment Trigger: Pushing code to the main branch of the Hugging Face repository automatically triggers a new build and deployment on the Space.

Hugging Face Spaces Best Practices

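  1. app.py: Keep the main application file at the root of the repository and name it app.py. It contains the FastAPI application logic. SDK-based Spaces look for app.py by default (this can be changed with app_file in the README.md metadata); in a Docker Space, the start command in the Dockerfile decides what runs.
  2. requirements.txt: List all Python dependencies in a requirements.txt file. SDK-based Spaces (such as Gradio) install these automatically at build time; a Docker Space installs them only if the Dockerfile does so, for example with pip install -r requirements.txt.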
  3. Secrets Management: Use Hugging Face Space Secrets for storing any sensitive information. Do not hardcode secrets in the source code. For local development, huggingface-cli login can be used to manage credentials.
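  4. Resource Configuration: The metadata block (YAML front matter) at the top of README.md configures the Space itself, for example the SDK and, for Docker Spaces, app_port. Hardware is selected in the Space's Settings; the metadata block can only recommend a tier through suggested_hardware and does not assign one. For this project, a CPU instance is sufficient.
  5. Health Checks: FastAPI serves interactive API documentation at /docs by default. Loading that page confirms the server process is running, but it does not exercise the model, so a small dedicated endpoint (for example /health) is a more reliable health check.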
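
For the secrets guidance above: a Space exposes its secrets to the running application as environment variables, so the code can read them with `os.environ` instead of hardcoding them. The sketch below is illustrative only. `TRANSLATE_API_KEY` is a hypothetical secret name (the project does not define any secret here), and `Settings` and `load_settings` are assumed helper names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: Optional[str]  # None means no key was configured


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from environment variables, never from source code."""
    # TRANSLATE_API_KEY is a hypothetical secret name used for illustration.
    key = env.get("TRANSLATE_API_KEY", "").strip()
    return Settings(api_key=key or None)
```

Passing the environment in as a parameter keeps the function easy to test locally with a plain dict, while the deployed service falls back to the real environment.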

API Workflow

  1. Request: The service accepts a POST request on its /translate endpoint. The request body carries the English text to be translated.
  2. Translate: The service uses the Helsinki-NLP/opus-mt-en-zh model to translate the received text into Chinese.
  3. Respond: The service returns a JSON object containing the translated text.

Error Handling

  • The backend handles invalid requests, translation failures, and other server-side errors explicitly. Each case returns an appropriate HTTP status code (4xx for client errors, 5xx for server errors) and a clear error message in the response body.

5. Code Quality & Conventions

  • Naming: Use descriptive and clear names for variables and functions (e.g., translate_text, translation_router).
  • Comments: Add comments to explain the "why" behind complex logic, not the "what".
  • Style: Follow standard Python (PEP 8) and FastAPI best practices.
  • Commit Messages: Keep commit titles concise, lowercase, and under 70 characters.