---
title: ToneBridge
emoji: 🏮
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: 6.13.0
python_version: "3.10"
app_file: app.py
pinned: false
short_description: A gentle Mandarin sentence coach.
tags:
  - build-small-hackathon
  - chinese
  - mandarin
  - language-learning
  - grammar-correction
  - pinyin
  - text-to-speech
  - zerogpu
  - gradio-server
  - off-brand
models:
  - Alphaplasti/ToneBridge-MiniCPM4.1-8B
---

# ToneBridge - Mandarin sentence coach

> Build natural Mandarin sentences, one small correction at a time.

Built for the Hugging Face **Build Small Hackathon**.

## The Problem

Beginner Mandarin learners often know what they want to say, but not whether the sentence sounds natural, polite, or appropriate for the social context.

Classic translators tend to rewrite too much. Grammar tools often explain too much. A beginner needs something narrower: keep my meaning, fix only what is needed, show the pinyin, and tell me why in plain English.

**ToneBridge is built for that moment.** You choose one context-tone profile, write or speak one Chinese sentence, and get a small, practical correction designed for learning rather than translation.

## What It Does

ToneBridge returns:

- one corrected Mandarin sentence;
- pinyin with tone marks under Chinese text;
- a short error type;
- a concise explanation in English;
- a practical tip for next time;
- a natural Mandarin reading voice with a follow-along reading view.

The correction prompt is intentionally conservative: if the sentence is already correct and natural, the corrected sentence should remain unchanged.

## How It Works

1. The learner selects one profile: **Friendly-informal**, **Work-informal**, **Work-formal**, **Wechat-informal**, or **Wechat-formal**.
2. ToneBridge applies a conservative tone-aware correction for that profile.
3. They type a Chinese sentence, or use browser speech recognition.
4. MiniCPM corrects the sentence while preserving the learner's meaning and length.
5. The frontend adds pinyin under Chinese text.
6. Edge TTS generates a fast Mandarin Neural reading of the corrected sentence.
7. The reading panel highlights characters while the audio plays.

## What's Inside

| Component | Model / Library | Where it runs |
| --- | --- | --- |
| Sentence correction | **Alphaplasti/ToneBridge-MiniCPM4.1-8B** via `transformers` | ZeroGPU / GPU-backed Space |
| Mandarin reading voice | `edge-tts` with `zh-CN-YunjianNeural` by default | Server |
| Pinyin | `pypinyin` with tone marks | CPU |
| Voice input | Browser Web Speech API | Browser-dependent |
| Frontend | Custom HTML/CSS/JS served by `gr.Server` | Browser |
| Backend API | `gr.Server` + `@app.api()` endpoints | Hugging Face Space |

The active model pipeline stays under the 32B-parameter target: the main correction model is 8B, and the default reading voice uses a lightweight server-side Edge TTS call instead of loading a second GPU model.

## Hardware And Loading

ToneBridge is designed for Hugging Face ZeroGPU.

- The correction model is preloaded at Space startup so it is not reloaded on every correction.
- The reading voice uses Edge TTS by default, so replay avoids loading a heavy server-side TTS model.

## Space Variables

Useful environment variables:

```text
MODEL_ID=Alphaplasti/ToneBridge-MiniCPM4.1-8B
TTS_PROVIDER=edge
ENABLE_SERVER_TTS=true
EDGE_TTS_VOICE=zh-CN-YunjianNeural
EDGE_TTS_RATE=+0%
EDGE_TTS_PITCH=+0Hz
EDGE_TTS_VOLUME=+0%
EDGE_TTS_KARAOKE_DURATION_FACTOR=0.86
LOAD_IN_4BIT=true
PRELOAD_MODEL=true
MAX_INPUT_CHARS=1200
MAX_NEW_TOKENS=220
METRICS_FILE=tonebridge_usage_metrics.jsonl
METRICS_REPO_SYNC=false
METRICS_REPO_ID=build-small-hackathon/Tone-Bridge
METRICS_REPO_PATH=tonebridge_usage_metrics.jsonl
```

If the correction model is private, add `HF_TOKEN` as a Space secret with read access to `Alphaplasti/ToneBridge-MiniCPM4.1-8B`.
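
As a rough illustration of how the Space variables above might be read at startup, here is a minimal sketch using only the standard library. The `Settings` class and the `env_bool` / `env_int` / `load_settings` helpers are hypothetical names, and treating the listed values as fallback defaults is an assumption; `app.py` may parse these variables differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def env_bool(name: str, default: bool, env: Mapping[str, str]) -> bool:
    """Read a true/false style variable; fall back to the default when unset or empty."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def env_int(name: str, default: int, env: Mapping[str, str]) -> int:
    """Read an integer variable; fall back to the default when unset or malformed."""
    raw = env.get(name)
    if raw is None:
        return default
    try:
        return int(raw.strip())
    except ValueError:
        return default


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for a subset of the Space variables."""
    model_id: str
    tts_provider: str
    edge_tts_voice: str
    load_in_4bit: bool
    preload_model: bool
    max_input_chars: int
    max_new_tokens: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    # Assumption: the values listed in the README double as defaults.
    return Settings(
        model_id=env.get("MODEL_ID", "Alphaplasti/ToneBridge-MiniCPM4.1-8B"),
        tts_provider=env.get("TTS_PROVIDER", "edge"),
        edge_tts_voice=env.get("EDGE_TTS_VOICE", "zh-CN-YunjianNeural"),
        load_in_4bit=env_bool("LOAD_IN_4BIT", True, env),
        preload_model=env_bool("PRELOAD_MODEL", True, env),
        max_input_chars=env_int("MAX_INPUT_CHARS", 1200, env),
        max_new_tokens=env_int("MAX_NEW_TOKENS", 220, env),
    )
```

Passing a plain dictionary instead of `os.environ` makes the parsing easy to check locally, for example `load_settings({"LOAD_IN_4BIT": "false"})`.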
## Usage Metrics And Feedback

Every saved correction is written to `tonebridge_usage_metrics.jsonl` in the running Space app folder by default. Relative `METRICS_FILE` values are resolved from the folder that contains `app.py`.

The Hugging Face **Files** tab shows the Space git repository, not every runtime file created while the app is running. To make the metrics file appear in **Files**, enable repo sync:

```text
METRICS_REPO_SYNC=true
METRICS_REPO_ID=build-small-hackathon/Tone-Bridge
METRICS_REPO_PATH=tonebridge_usage_metrics.jsonl
HF_METRICS_TOKEN=
```

If the Space is public, synced metrics are public too.

Each record includes:

- `original_sentence`
- `corrected_sentence`
- `evaluation` (`thumbs_up`, `thumbs_down`, or `null`)
- `generation_time_seconds`
- status, context, tone, correction mode, error type, model id, timestamp, and request id

The app adds thumbs-up / thumbs-down buttons after each correction. Votes update the matching JSONL record automatically.

MiniCPM4.1 currently expects a Transformers 4.x runtime. `requirements.txt` pins `transformers>=4.56.0,<5.0.0` because Transformers 5 removed an internal helper still imported by the model's remote code.

## Repository

```text
.
├── app.py            # gr.Server app, API endpoints, frontend, correction and TTS logic
├── requirements.txt  # Python dependencies
└── README.md
```

## Hackathon Fit

- **Off-brand UI:** the app uses a custom `gr.Server` frontend instead of default Gradio Blocks.
- **Small, focused product:** one clear learning job, one correction at a time.
- **Codex track ready:** the code should be pushed to a public GitHub repository with Codex-attributed commits, then linked from this Space README.

## Credits

- **Alphaplasti** - ToneBridge-MiniCPM4.1-8B for sentence correction.
- **OpenBMB** - MiniCPM4.1-8B base model.
- **Hugging Face Spaces** - hosting, ZeroGPU, and the Build Small Hackathon.
- **Gradio** - `gr.Server`, queue/API infrastructure, and client compatibility.
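
For readers curious about the metrics flow described under Usage Metrics And Feedback (one JSONL record per correction, with a later vote updating the matching record), the following is a minimal sketch, not the code in `app.py`. The function names `append_record` and `set_evaluation` and the key names `request_id` and `timestamp` are assumptions; the README only names `original_sentence`, `corrected_sentence`, `evaluation`, and `generation_time_seconds` explicitly, and the remaining fields are omitted here.

```python
import json
import time
import uuid
from pathlib import Path


def append_record(path, original: str, corrected: str, generation_time_seconds: float) -> str:
    """Append one correction record as a JSON line and return its request id."""
    request_id = uuid.uuid4().hex
    record = {
        "request_id": request_id,          # assumed key name
        "original_sentence": original,
        "corrected_sentence": corrected,
        "evaluation": None,                # serialized as null until the learner votes
        "generation_time_seconds": round(generation_time_seconds, 3),
        "timestamp": time.time(),          # assumed key name and format
    }
    with Path(path).open("a", encoding="utf-8") as fh:
        # ensure_ascii=False keeps Chinese text readable in the file
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return request_id


def set_evaluation(path, request_id: str, vote: str) -> bool:
    """Rewrite the file with the vote stored on the matching record.

    Returns True if a record with that request id was found.
    """
    if vote not in {"thumbs_up", "thumbs_down"}:
        raise ValueError(f"unexpected vote: {vote!r}")
    path = Path(path)
    if not path.exists():
        return False
    lines = path.read_text(encoding="utf-8").splitlines()
    records = [json.loads(line) for line in lines if line.strip()]
    updated = False
    for record in records:
        if record.get("request_id") == request_id:
            record["evaluation"] = vote
            updated = True
    if updated:
        path.write_text(
            "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records),
            encoding="utf-8",
        )
    return updated
```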

---

*ToneBridge - gentle Mandarin correction for real learners.*