metadata

title: ToneBridge
emoji: 🏮
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: 6.13.0
python_version: '3.10'
app_file: app.py
pinned: false
short_description: A gentle Mandarin sentence coach.
tags:
  - build-small-hackathon
  - chinese
  - mandarin
  - language-learning
  - grammar-correction
  - pinyin
  - text-to-speech
  - zerogpu
  - gradio-server
  - off-brand
models:
  - Alphaplasti/ToneBridge-MiniCPM4.1-8B

ToneBridge - Mandarin sentence coach

Build natural Mandarin sentences, one small correction at a time.

Built for the Hugging Face Build Small Hackathon.

The Problem

Beginner Mandarin learners often know what they want to say, but not whether the sentence sounds natural, polite, or appropriate for the social context.

Classic translators tend to rewrite too much. Grammar tools often explain too much. A beginner needs something narrower: keep my meaning, fix only what is needed, show the pinyin, and tell me why in plain English.

ToneBridge is built for that moment. You choose one context-tone profile, write or speak one Chinese sentence, and get a small, practical correction designed for learning rather than translation.

What It Does

ToneBridge returns:

one corrected Mandarin sentence;
pinyin with tone marks under Chinese text;
a short error type;
a concise explanation in English;
a practical tip for next time;
a natural Mandarin reading voice with a follow-along reading view.

The correction prompt is intentionally conservative: if the sentence is already correct and natural, the corrected sentence should remain unchanged.
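
One way to check the "remain unchanged" behaviour after generation is to compare the learner's input with the model output once incidental differences are ignored. The sketch below is an assumption about how such a check might look, not the code in app.py; `normalize` and `is_unchanged` are hypothetical helper names.

```python
import unicodedata


def normalize(sentence: str) -> str:
    """Apply Unicode NFC and drop all whitespace, which carries no meaning here."""
    text = unicodedata.normalize("NFC", sentence)
    return "".join(text.split())


def is_unchanged(original: str, corrected: str) -> bool:
    """True when the model returned the learner's sentence as-is."""
    return normalize(original) == normalize(corrected)
```

A result of `True` could be used to show an "already natural" message instead of a correction.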

How It Works

The learner selects one profile: Friendly-informal, Work-informal, Work-formal, Wechat-informal, or Wechat-formal.
The learner types a Chinese sentence or dictates one with browser speech recognition.
ToneBridge applies a conservative, tone-aware correction for the selected profile.
MiniCPM corrects the sentence while preserving the learner's meaning and length.
The frontend adds pinyin under Chinese text.
Edge TTS generates a fast Mandarin Neural reading of the corrected sentence.
The reading panel highlights characters while the audio plays.
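
The character highlighting in the last step can be approximated by spreading the audio duration evenly over the characters. The sketch below is written in Python for illustration only: the real highlighting may well run in browser JavaScript, and reading `EDGE_TTS_KARAOKE_DURATION_FACTOR` as a multiplier on the audio length is an assumption based on the variable name. `karaoke_schedule` is a hypothetical helper.

```python
def karaoke_schedule(text: str, audio_seconds: float, duration_factor: float = 0.86):
    """Return (char, start, end) tuples, one per non-space character.

    The scaled duration is divided evenly, so every character is
    highlighted for the same amount of time.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars or audio_seconds <= 0:
        return []
    step = (audio_seconds * duration_factor) / len(chars)
    return [(c, i * step, (i + 1) * step) for i, c in enumerate(chars)]
```

An even split ignores pauses at punctuation, which is one plausible reason for a tunable factor.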

What's Inside

| Component | Model / Library | Where it runs |
| --- | --- | --- |
| Sentence correction | Alphaplasti/ToneBridge-MiniCPM4.1-8B via `transformers` | ZeroGPU / GPU-backed Space |
| Mandarin reading voice | `edge-tts` with `zh-CN-YunjianNeural` by default | Server |
| Pinyin | `pypinyin` with tone marks | CPU |
| Voice input | Browser Web Speech API | Browser-dependent |
| Frontend | Custom HTML/CSS/JS served by `gr.Server` | Browser |
| Backend API | `gr.Server` + `@app.api()` endpoints | Hugging Face Space |

The active model pipeline stays under the 32B-parameter target: the main correction model is 8B, and the default reading voice uses a lightweight server-side Edge TTS call instead of loading a second GPU model.

Hardware And Loading

ToneBridge is designed for Hugging Face ZeroGPU.

The correction model is preloaded at Space startup so it is not reloaded on every correction.
The reading voice uses Edge TTS by default, so replay avoids loading a heavy server-side TTS model.
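
A minimal sketch of the Edge TTS call, assuming the `edge-tts` package and the `EDGE_TTS_*` variables listed under Space Variables. `tts_settings` and `synthesize` are hypothetical names, and the fallback values simply mirror that list. The import is deferred so the settings helper works without the package installed.

```python
import os


def tts_settings(env=os.environ) -> dict:
    """Collect voice parameters, falling back to the README's listed values."""
    return {
        "voice": env.get("EDGE_TTS_VOICE", "zh-CN-YunjianNeural"),
        "rate": env.get("EDGE_TTS_RATE", "+0%"),
        "pitch": env.get("EDGE_TTS_PITCH", "+0Hz"),
        "volume": env.get("EDGE_TTS_VOLUME", "+0%"),
    }


async def synthesize(text: str, out_path: str, env=os.environ) -> None:
    """Write an MP3 reading of `text` to `out_path` (needs network access)."""
    import edge_tts  # third-party: pip install edge-tts

    s = tts_settings(env)
    communicate = edge_tts.Communicate(
        text, s["voice"], rate=s["rate"], pitch=s["pitch"], volume=s["volume"]
    )
    await communicate.save(out_path)
```

From synchronous code this could be run with `asyncio.run(synthesize("你好", "reading.mp3"))`.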

Space Variables

Useful environment variables:

MODEL_ID=Alphaplasti/ToneBridge-MiniCPM4.1-8B
TTS_PROVIDER=edge
ENABLE_SERVER_TTS=true
EDGE_TTS_VOICE=zh-CN-YunjianNeural
EDGE_TTS_RATE=+0%
EDGE_TTS_PITCH=+0Hz
EDGE_TTS_VOLUME=+0%
EDGE_TTS_KARAOKE_DURATION_FACTOR=0.86
LOAD_IN_4BIT=true
PRELOAD_MODEL=true
MAX_INPUT_CHARS=1200
MAX_NEW_TOKENS=220
METRICS_FILE=tonebridge_usage_metrics.jsonl
METRICS_REPO_SYNC=false
METRICS_REPO_ID=build-small-hackathon/Tone-Bridge
METRICS_REPO_PATH=tonebridge_usage_metrics.jsonl

If the correction model is private, add HF_TOKEN as a Space secret with read access to Alphaplasti/ToneBridge-MiniCPM4.1-8B.
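
The variables above could be read into a typed settings object along these lines. This is a sketch, not the parsing used in app.py: treating the listed values as defaults and accepting `1/true/yes/on` as truthy are both assumptions, and `Settings` and `load_settings` are hypothetical names.

```python
import os
from dataclasses import dataclass


def _as_bool(value, default: bool) -> bool:
    """Interpret common truthy strings; return the default when unset."""
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    model_id: str
    load_in_4bit: bool
    preload_model: bool
    max_input_chars: int
    max_new_tokens: int


def load_settings(env=os.environ) -> Settings:
    """Build Settings from a mapping, using the README values as assumed defaults."""
    return Settings(
        model_id=env.get("MODEL_ID", "Alphaplasti/ToneBridge-MiniCPM4.1-8B"),
        load_in_4bit=_as_bool(env.get("LOAD_IN_4BIT"), True),
        preload_model=_as_bool(env.get("PRELOAD_MODEL"), True),
        max_input_chars=int(env.get("MAX_INPUT_CHARS", "1200")),
        max_new_tokens=int(env.get("MAX_NEW_TOKENS", "220")),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test.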

Usage Metrics And Feedback

Every saved correction is written to tonebridge_usage_metrics.jsonl in the running Space app folder by default. Relative METRICS_FILE values are resolved from the folder that contains app.py.
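
The path rule described above can be sketched with `pathlib`. `resolve_metrics_path` is a hypothetical helper; in app.py the `app_dir` argument would presumably be `Path(__file__).resolve().parent`.

```python
from pathlib import Path


def resolve_metrics_path(metrics_file: str, app_dir) -> Path:
    """Keep absolute paths as-is; anchor relative ones at the app folder."""
    path = Path(metrics_file)
    return path if path.is_absolute() else Path(app_dir) / path
```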

The Hugging Face Files tab shows the Space git repository, not every runtime file created while the app is running. To make the metrics file appear in Files, enable repo sync:

METRICS_REPO_SYNC=true
METRICS_REPO_ID=build-small-hackathon/Tone-Bridge
METRICS_REPO_PATH=tonebridge_usage_metrics.jsonl
HF_METRICS_TOKEN=<write token as a Space secret>
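
A possible shape for the sync step, assuming it uses `huggingface_hub.HfApi.upload_file` to commit the file to the Space repository. The README does not say how app.py performs the upload, so this is a guess at the mechanism; `sync_settings` and `sync_metrics` are hypothetical names.

```python
import os


def sync_settings(env=os.environ):
    """Return upload parameters, or None when sync is off or no token is set."""
    if env.get("METRICS_REPO_SYNC", "false").strip().lower() != "true":
        return None
    token = env.get("HF_METRICS_TOKEN")
    if not token:
        return None
    return {
        "repo_id": env.get("METRICS_REPO_ID", "build-small-hackathon/Tone-Bridge"),
        "path_in_repo": env.get("METRICS_REPO_PATH", "tonebridge_usage_metrics.jsonl"),
        "token": token,
    }


def sync_metrics(local_path, env=os.environ) -> bool:
    """Upload the local metrics file; return False when sync is disabled."""
    settings = sync_settings(env)
    if settings is None:
        return False
    from huggingface_hub import HfApi  # third-party, imported only when needed

    HfApi(token=settings["token"]).upload_file(
        path_or_fileobj=str(local_path),
        path_in_repo=settings["path_in_repo"],
        repo_id=settings["repo_id"],
        repo_type="space",
        commit_message="Sync usage metrics",
    )
    return True
```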

If the Space is public, synced metrics are public too. Each record includes:

original_sentence
corrected_sentence
evaluation (thumbs_up, thumbs_down, or null)
generation_time_seconds
status, context, tone, correction mode, error type, model id, timestamp, and request id

The app adds thumbs-up / thumbs-down buttons after each correction. Votes update the matching JSONL record automatically.
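
The record-then-vote flow can be sketched as an append followed by a rewrite of the matching line. This is an illustration under assumptions: the key name `request_id`, the helper names, and the rewrite-the-whole-file strategy are not confirmed by the source, and only a subset of the listed fields is written.

```python
import json
import time
import uuid
from pathlib import Path


def append_record(path, original, corrected, generation_time_seconds) -> str:
    """Append one correction as a JSON line and return its request id."""
    record = {
        "request_id": uuid.uuid4().hex,
        "timestamp": time.time(),
        "original_sentence": original,
        "corrected_sentence": corrected,
        "evaluation": None,  # no vote yet
        "generation_time_seconds": generation_time_seconds,
    }
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record["request_id"]


def record_vote(path, request_id, evaluation) -> bool:
    """Set `evaluation` on the matching record; return False if none matched."""
    if evaluation not in ("thumbs_up", "thumbs_down", None):
        raise ValueError(f"unexpected evaluation: {evaluation!r}")
    path = Path(path)
    records, updated = [], False
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if record.get("request_id") == request_id:
            record["evaluation"] = evaluation
            updated = True
        records.append(record)
    if updated:
        path.write_text(
            "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records),
            encoding="utf-8",
        )
    return updated
```

Rewriting the file is simple but not safe under concurrent writes; a lock or an append-only vote log would be alternatives.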

MiniCPM4.1 currently expects a Transformers 4.x runtime. requirements.txt pins transformers>=4.56.0,<5.0.0 because Transformers 5 removed an internal helper still imported by the model's remote code.
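
A startup check for this pin might look like the sketch below. It is an optional safeguard rather than something the README says app.py does, and the helper names are hypothetical. The parser is deliberately simple: it reads only the numeric release and ignores pre-release suffixes.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_string: str) -> tuple:
    """Turn '4.56.0' or '4.56.0.dev0' into (4, 56, 0); suffixes are ignored."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_supported_transformers(version_string: str) -> bool:
    """True for releases in the pinned range >=4.56.0,<5.0.0."""
    return (4, 56, 0) <= parse_release(version_string) < (5, 0, 0)


def installed_transformers_ok() -> bool:
    """Check the installed package; False when transformers is missing."""
    try:
        return is_supported_transformers(version("transformers"))
    except PackageNotFoundError:
        return False
```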

Repository

.
├── app.py            # gr.Server app, API endpoints, frontend, correction and TTS logic
├── requirements.txt  # Python dependencies
└── README.md

Hackathon Fit

Off-brand UI: the app uses a custom gr.Server frontend instead of default Gradio Blocks.
Small, focused product: one clear learning job, one correction at a time.
Codex track ready: the code should be pushed to a public GitHub repository with Codex-attributed commits, then linked from this Space README.

Credits

Alphaplasti - ToneBridge-MiniCPM4.1-8B for sentence correction.
OpenBMB - MiniCPM4.1-8B base model.
Hugging Face Spaces - hosting, ZeroGPU, and the Build Small Hackathon.
Gradio - gr.Server, queue/API infrastructure, and client compatibility.

ToneBridge - gentle Mandarin correction for real learners.