---
title: ToneBridge
emoji: 🏮
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: 6.13.0
python_version: "3.10"
app_file: app.py
pinned: false
short_description: A gentle Mandarin sentence coach.
tags:
  - build-small-hackathon
  - chinese
  - mandarin
  - language-learning
  - grammar-correction
  - pinyin
  - text-to-speech
  - zerogpu
  - gradio-server
  - off-brand
models:
  - Alphaplasti/ToneBridge-MiniCPM4.1-8B
---

# ToneBridge - Mandarin sentence coach

> Build natural Mandarin sentences, one small correction at a time.

Built for the Hugging Face **Build Small Hackathon**.

## The Problem

Beginner Mandarin learners often know what they want to say, but not whether the sentence sounds natural, polite, or appropriate for the social context.

Classic translators tend to rewrite too much. Grammar tools often explain too much. A beginner needs something narrower: keep my meaning, fix only what is needed, show the pinyin, and tell me why in plain English.

**ToneBridge is built for that moment.** You choose one context-tone profile, write or speak one Chinese sentence, and get a small, practical correction designed for learning rather than translation.

## What It Does

ToneBridge returns:

- one corrected Mandarin sentence;
- pinyin with tone marks under the Chinese text;
- a short error type;
- a concise explanation in English;
- a practical tip for next time;
- a natural Mandarin reading voice with a follow-along reading view.

The correction prompt is intentionally conservative: if the sentence is already correct and natural, the corrected sentence should remain unchanged.

## How It Works

1. The learner selects one profile: **Friendly-informal**, **Work-informal**, **Work-formal**, **Wechat-informal**, or **Wechat-formal**.
2. ToneBridge applies a conservative tone-aware correction for that profile.
3. The learner types a Chinese sentence or dictates it with browser speech recognition.
4. MiniCPM corrects the sentence while preserving the learner's meaning and length.
5. The frontend shows pinyin with tone marks under the Chinese text.
6. Edge TTS generates a fast Mandarin neural-voice reading of the corrected sentence.
7. The reading panel highlights characters while the audio plays.
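
The character highlighting in step 7 can be pictured as a simple timing schedule. The sketch below is illustrative only and is not taken from `app.py`: the function name `highlight_schedule` is hypothetical, and it assumes that highlight times are spread evenly over the spoken characters and that a factor such as `EDGE_TTS_KARAOKE_DURATION_FACTOR` (listed under Space Variables) scales the audio duration to leave room for trailing silence.

```python
def highlight_schedule(text: str, audio_seconds: float, duration_factor: float = 0.86):
    """Return (character, start_time_in_seconds) pairs for a follow-along view.

    Hypothetical helper: even spacing and the meaning of duration_factor
    are assumptions, not the app's confirmed behaviour.
    """
    # Only letters, digits, and Chinese characters get a time slot;
    # punctuation and spaces are skipped.
    spoken = [ch for ch in text if ch.isalnum()]
    if not spoken or audio_seconds <= 0:
        return []
    # Scale the audio length, then divide it evenly between the characters.
    step = audio_seconds * duration_factor / len(spoken)
    return [(ch, round(i * step, 3)) for i, ch in enumerate(spoken)]


if __name__ == "__main__":
    for char, start in highlight_schedule("你好，世界。", audio_seconds=2.0):
        print(char, start)
```

A real reading view would probably prefer per-word timings from the TTS engine when they are available; even spacing is only a fallback idea.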

## What's Inside

| Component | Model / Library | Where it runs |
| --- | --- | --- |
| Sentence correction | **Alphaplasti/ToneBridge-MiniCPM4.1-8B** via `transformers` | ZeroGPU / GPU-backed Space |
| Mandarin reading voice | `edge-tts` with `zh-CN-YunjianNeural` by default | Server |
| Pinyin | `pypinyin` with tone marks | CPU |
| Voice input | Browser Web Speech API | Browser-dependent |
| Frontend | Custom HTML/CSS/JS served by `gr.Server` | Browser |
| Backend API | `gr.Server` + `@app.api()` endpoints | Hugging Face Space |

The active model pipeline stays under the 32B-parameter target: the main correction model is 8B, and the default reading voice uses a lightweight server-side Edge TTS call instead of loading a second GPU model.

## Hardware And Loading

ToneBridge is designed for Hugging Face ZeroGPU.

- The correction model is preloaded at Space startup so it is not reloaded on every correction.
- The reading voice uses Edge TTS by default, so replay avoids loading a heavy server-side TTS model.

## Space Variables

Useful environment variables:

```text
MODEL_ID=Alphaplasti/ToneBridge-MiniCPM4.1-8B
TTS_PROVIDER=edge
ENABLE_SERVER_TTS=true
EDGE_TTS_VOICE=zh-CN-YunjianNeural
EDGE_TTS_RATE=+0%
EDGE_TTS_PITCH=+0Hz
EDGE_TTS_VOLUME=+0%
EDGE_TTS_KARAOKE_DURATION_FACTOR=0.86
LOAD_IN_4BIT=true
PRELOAD_MODEL=true
MAX_INPUT_CHARS=1200
MAX_NEW_TOKENS=220
METRICS_FILE=tonebridge_usage_metrics.jsonl
METRICS_REPO_SYNC=false
METRICS_REPO_ID=build-small-hackathon/Tone-Bridge
METRICS_REPO_PATH=tonebridge_usage_metrics.jsonl
```

If the correction model is private, add `HF_TOKEN` as a Space secret with read access to `Alphaplasti/ToneBridge-MiniCPM4.1-8B`.
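
As a rough illustration of how such variables might be read, here is a minimal sketch. It is not the code in `app.py`: the function name `load_settings` and the dictionary keys are hypothetical, and it assumes the values listed above double as fallbacks when a variable is unset.

```python
import os
from typing import Mapping, Optional

TRUTHY = {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read a few ToneBridge variables, falling back to the documented values.

    Hypothetical helper; the fallback values are an assumption.
    """
    env = os.environ if env is None else env

    def as_bool(name: str, default: bool) -> bool:
        raw = env.get(name)
        return default if raw is None else raw.strip().lower() in TRUTHY

    def as_int(name: str, default: int) -> int:
        try:
            return int(env.get(name, default))
        except ValueError:
            # A malformed number falls back to the default instead of crashing.
            return default

    return {
        "model_id": env.get("MODEL_ID", "Alphaplasti/ToneBridge-MiniCPM4.1-8B"),
        "tts_provider": env.get("TTS_PROVIDER", "edge"),
        "edge_tts_voice": env.get("EDGE_TTS_VOICE", "zh-CN-YunjianNeural"),
        "load_in_4bit": as_bool("LOAD_IN_4BIT", True),
        "preload_model": as_bool("PRELOAD_MODEL", True),
        "max_input_chars": as_int("MAX_INPUT_CHARS", 1200),
        "max_new_tokens": as_int("MAX_NEW_TOKENS", 220),
    }


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dictionary instead of `os.environ` makes the parsing easy to check without touching the real environment.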

## Usage Metrics And Feedback

Every saved correction is written to `tonebridge_usage_metrics.jsonl` in the running Space app folder by default. Relative `METRICS_FILE` values are resolved from the folder that contains `app.py`.
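
The path rule can be sketched in a few lines. The helper name `resolve_metrics_path` is hypothetical, and treating absolute paths as already final is an assumption.

```python
from pathlib import Path


def resolve_metrics_path(metrics_file: str, app_dir: Path) -> Path:
    """Anchor a relative METRICS_FILE value at the app folder (hypothetical helper)."""
    path = Path(metrics_file)
    # Assumption: absolute paths are used as given; relative ones join the app folder.
    return path if path.is_absolute() else app_dir / path


if __name__ == "__main__":
    # Inside app.py, the app folder would typically be Path(__file__).resolve().parent.
    app_dir = Path(__file__).resolve().parent
    print(resolve_metrics_path("tonebridge_usage_metrics.jsonl", app_dir))
```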

The Hugging Face **Files** tab shows the Space git repository, not every runtime file created while the app is running. To make the metrics file appear in **Files**, enable repo sync:

```text
METRICS_REPO_SYNC=true
METRICS_REPO_ID=build-small-hackathon/Tone-Bridge
METRICS_REPO_PATH=tonebridge_usage_metrics.jsonl
HF_METRICS_TOKEN=<write token as a Space secret>
```
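
One way such a sync could work is to upload the local file with `huggingface_hub`. The sketch below is an assumption about the mechanism, not a copy of the app's code: `build_sync_request` and `sync_metrics` are hypothetical names, and `repo_type="space"` assumes that `METRICS_REPO_ID` points at the Space repository itself.

```python
import os
from typing import Mapping, Optional


def build_sync_request(env: Mapping[str, str], local_path: str) -> Optional[dict]:
    """Return keyword arguments for HfApi.upload_file, or None when sync is off."""
    if env.get("METRICS_REPO_SYNC", "false").strip().lower() != "true":
        return None
    return {
        "path_or_fileobj": local_path,
        "path_in_repo": env.get("METRICS_REPO_PATH", "tonebridge_usage_metrics.jsonl"),
        "repo_id": env["METRICS_REPO_ID"],
        "repo_type": "space",  # assumption: metrics are committed to the Space repo
        "commit_message": "Sync usage metrics",
    }


def sync_metrics(local_path: str, env: Mapping[str, str] = os.environ) -> bool:
    """Upload the metrics file if sync is enabled; return True when an upload ran."""
    request = build_sync_request(env, local_path)
    if request is None:
        return False
    # Imported lazily so the app can start without the dependency when sync is off.
    from huggingface_hub import HfApi

    HfApi(token=env.get("HF_METRICS_TOKEN")).upload_file(**request)
    return True
```

Keeping the request builder separate from the upload call lets the configuration logic be checked without network access or a token.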

If the Space is public, synced metrics are public too. Each record includes:

- `original_sentence`
- `corrected_sentence`
- `evaluation` (`thumbs_up`, `thumbs_down`, or `null`)
- `generation_time_seconds`
- status, context, tone, correction mode, error type, model id, timestamp, and request id

The app adds thumbs-up / thumbs-down buttons after each correction. Votes update the matching JSONL record automatically.
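
A minimal sketch of that append-then-update flow follows. It is illustrative rather than the app's actual code: `append_record` and `record_vote` are hypothetical names, the key `request_id` is an assumed spelling of the "request id" field, and no file locking is shown.

```python
import json
from pathlib import Path

VALID_VOTES = {"thumbs_up", "thumbs_down"}


def append_record(path: Path, record: dict) -> None:
    """Append one correction record as a single JSON line."""
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")


def record_vote(path: Path, request_id: str, evaluation: str) -> bool:
    """Set `evaluation` on the record with this request id; return True if found."""
    if evaluation not in VALID_VOTES:
        raise ValueError(f"unknown evaluation: {evaluation!r}")
    if not path.exists():
        return False
    records, updated = [], False
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        if record.get("request_id") == request_id:
            record["evaluation"] = evaluation
            updated = True
        records.append(record)
    if updated:
        # JSONL has no in-place update, so the whole file is rewritten.
        path.write_text(
            "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records),
            encoding="utf-8",
        )
    return updated
```

Rewriting the file on every vote is simple but not safe under concurrent requests; a lock or a small database would be worth considering if traffic grows.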

MiniCPM4.1 currently expects a Transformers 4.x runtime. `requirements.txt` pins `transformers>=4.56.0,<5.0.0` because Transformers 5 removed an internal helper still imported by the model's remote code.

## Repository

```text
.
├── app.py            # gr.Server app, API endpoints, frontend, correction and TTS logic
├── requirements.txt  # Python dependencies
└── README.md
```

## Hackathon Fit

- **Off-brand UI:** the app uses a custom `gr.Server` frontend instead of default Gradio Blocks.
- **Small, focused product:** one clear learning job, one correction at a time.
- **Codex track ready:** the code should be pushed to a public GitHub repository with Codex-attributed commits, then linked from this Space README.

## Credits

- **Alphaplasti** - ToneBridge-MiniCPM4.1-8B for sentence correction.
- **OpenBMB** - MiniCPM4.1-8B base model.
- **Hugging Face Spaces** - hosting, ZeroGPU, and the Build Small Hackathon.
- **Gradio** - `gr.Server`, queue/API infrastructure, and client compatibility.

---

*ToneBridge - gentle Mandarin correction for real learners.*