feat: Add initial Hugging Face Space files (app.py, Dockerfile, requirements.txt)
GEMINI.md
ADDED
@@ -0,0 +1,47 @@
# Project Gemini: YouTube Dual-Language Subtitle Backend Service

This document outlines the development guidelines, architecture, and technology stack for the YouTube Dual-Language Subtitle translation service.

## 1. Core Mission

To create a free, open-source, and self-hosted backend service that provides English-to-Chinese translation for a companion browser extension. The service is designed for efficient deployment on **Hugging Face Spaces**.

## 2. Architecture Overview

The project is a standalone Python-based microservice built with FastAPI. It exposes a single API endpoint to receive text, translates it using a pre-loaded Hugging Face model, and returns the result. This design allows the heavy lifting of AI translation to be handled by a dedicated, scalable server.
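
The "pre-loaded model" idea above can be sketched as a loader that builds the translation pipeline once and reuses it for every request. This is a minimal sketch, not the project's actual `app.py`: the helper names `get_translator` and `translate_text` are illustrative, and it assumes the `transformers` `pipeline` API together with the `Helsinki-NLP/opus-mt-en-zh` checkpoint named in this document.

```python
from functools import lru_cache
from typing import Callable, Optional

MODEL_NAME = "Helsinki-NLP/opus-mt-en-zh"  # model named in this document


@lru_cache(maxsize=1)
def get_translator() -> Callable:
    """Build the translation pipeline once; later calls return the cached object."""
    # Imported lazily so that importing this module does not require
    # transformers/torch to be installed (illustrative choice).
    from transformers import pipeline

    return pipeline("translation", model=MODEL_NAME)


def translate_text(text: str, translator: Optional[Callable] = None) -> str:
    """Translate English text with the shared (or an injected) translator."""
    if translator is None:
        translator = get_translator()
    # A transformers translation pipeline returns a list of dicts,
    # one per input, each with a "translation_text" key.
    return translator(text)[0]["translation_text"]
```

Accepting an injected `translator` keeps the function testable without downloading the model; in the service itself the default cached pipeline would be used.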

## 3. Technology Stack

### Backend (Translation Service)
- **Framework**: Python with FastAPI (for creating a high-performance API)
- **AI/ML Library**: Hugging Face `transformers` and `torch`
- **Translation Model**: `Helsinki-NLP/opus-mt-en-zh` (a lightweight, high-quality model for English-to-Chinese translation)
- **Server**: Uvicorn

## 4. Deployment & Development on Hugging Face Spaces

### Primary Hosting & Version Control
- **Platform**: **Hugging Face Spaces** is used for both hosting the service and the Git repository.
- **Deployment Trigger**: Pushing code to the `main` branch of the Hugging Face repository automatically triggers a new build and deployment on the Space.

### Hugging Face Spaces Best Practices
1. **`app.py`**: The main application file must be named `app.py` and located at the root of the repository. It will contain the FastAPI application logic.
2. **`requirements.txt`**: All Python dependencies must be listed in a `requirements.txt` file. The Space will automatically install these dependencies upon deployment.
3. **Secrets Management**: Use Hugging Face Space Secrets for storing any sensitive information. Do not hardcode secrets in the source code. For local development, `huggingface-cli login` can be used to manage credentials.
4. **Resource Configuration**: The `README.md` file's metadata block (YAML front matter) is used to configure the Space's hardware. For this project, a CPU instance is sufficient and should be specified to optimize resource allocation.
5. **Health Checks**: FastAPI provides a default `/docs` endpoint which serves as a basic health check to verify the service is running.
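
For the secrets guideline above: Space Secrets are exposed to the running app as environment variables, so the code can read them at startup instead of hardcoding them. The following is a minimal sketch, assuming a hypothetical secret named `HF_TOKEN`; the helper `read_secret` is illustrative, and this service may not need any secret at all.

```python
import os
from typing import Mapping, Optional


def read_secret(name: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a secret from the environment, or None when it is unset or blank."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    return value or None


# "HF_TOKEN" is a hypothetical secret name used only for illustration.
hf_token = read_secret("HF_TOKEN")
```

Returning `None` for a missing value lets the caller decide whether the secret is optional or whether startup should fail with a clear message.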

### API Workflow
1. **Request**: The service waits for a POST request to its `/translate` endpoint. The request body should contain the English text to be translated.
2. **Translate**: The service utilizes the `Helsinki-NLP/opus-mt-en-zh` model to translate the received text into Chinese.
3. **Respond**: The service returns a JSON object containing the translated text.
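
From the client's side, the three steps above amount to one JSON POST and one JSON response. The sketch below builds such a request with the Python standard library. The field names `text` and `translated_text` and the base URL are assumptions, since this document does not fix the payload schema.

```python
import json
import urllib.request


def build_translate_request(base_url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /translate endpoint."""
    body = json.dumps({"text": text}).encode("utf-8")  # "text" is an assumed field name
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/translate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_translate_response(raw: bytes) -> str:
    """Extract the translation from the JSON response body."""
    payload = json.loads(raw.decode("utf-8"))
    return payload["translated_text"]  # assumed response field name
```

Sending the request is then a call to `urllib.request.urlopen(request)`, with the response body passed to `parse_translate_response`.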

### Error Handling
- The backend will include robust error handling for invalid requests, translation failures, or other server-side issues. It will return appropriate HTTP status codes and clear error messages in the response body.
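
One way to picture this is a handler that maps each failure class to a status code. The following is a framework-independent sketch, not the project's implementation: `handle_translate`, the `text` and `translated_text` fields, and the specific status codes are illustrative assumptions. In a FastAPI app the same decisions would typically be expressed by raising `HTTPException`.

```python
from http import HTTPStatus
from typing import Any, Callable, Dict, Tuple


def handle_translate(
    payload: Any, translate: Callable[[str], str]
) -> Tuple[int, Dict[str, str]]:
    """Return (status_code, response_body) for a translation request."""
    # Invalid request: the body is not a JSON object.
    if not isinstance(payload, dict):
        return int(HTTPStatus.BAD_REQUEST), {"error": "Request body must be a JSON object."}
    text = payload.get("text")  # "text" is an assumed field name
    # Invalid request: the field is missing, not a string, or blank.
    if not isinstance(text, str) or not text.strip():
        return int(HTTPStatus.UNPROCESSABLE_ENTITY), {"error": "'text' must be a non-empty string."}
    try:
        translated = translate(text)
    except Exception:
        # Translation failure: report a server-side error without leaking internals.
        return int(HTTPStatus.INTERNAL_SERVER_ERROR), {"error": "Translation failed."}
    return int(HTTPStatus.OK), {"translated_text": translated}
```

Keeping the status decision in a plain function makes each error path easy to unit-test before it is wired into a route.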

## 5. Code Quality & Conventions

- **Naming**: Use descriptive and clear names for variables and functions (e.g., `translate_text`, `translation_router`).
- **Comments**: Add comments to explain the "why" behind complex logic, not the "what".
- **Style**: Follow standard Python (PEP 8) and FastAPI best practices.
- **Commit Messages**: Keep commit titles concise, lowercase, and under 70 characters.

app.py
CHANGED
@@ -68,6 +68,7 @@ async def health_check():
 async def read_root():
     return {"message": "Welcome to the translation API"}
 
-
-
-
+
+# if __name__ == '__main__':
+# import uvicorn
+# uvicorn.run("app:app", host="0.0.0.0", port=7860, reload=True)