---
language:
- en
- zh
- ms
- ta
license: apache-2.0
tags:
- singapore
- sovereign-ai
- edge-ai
- meralion
- singlish
- agentic-innovation
- android
- flask
- whisper
- on-device
pipeline_tag: text-generation
library_name: custom
---

# MINA Bridge v4: Sovereign Edge AI Gateway

**MINA** (My Intelligent National Assistant) is Singapore's sovereign edge AI companion, built on [MERaLiON-2-3B](https://huggingface.co/MERaLiON/MERaLiON-2-3B) by IMDA.

`mina-bridge` is the intelligence gateway between the MINA Android APK and the on-device MERaLiON model. It is a lightweight Flask server that handles speech transcription, rule-based agent routing, response generation, and autonomous gap logging, all running locally in a Termux environment with no cloud dependency for inference.

---

## Architecture

```
Android APK
    │  base64 WAV / pre-transcribed text
    ▼
mina-bridge (Flask :8081)
    ├── whisper-cli          ← speech-to-text (offline)
    ├── route_agent()        ← rule-based ARIA routing (no LLM call)
    ├── build_prompt()       ← agent-specific focused prompt
    ├── llama-server :8080   ← MERaLiON-2-3B GGUF inference
    ├── append_resources()   ← hotlines from mina_knowledge.json
    └── log_gap() + ntfy     ← autonomous cloud sync
```

**Option 3 architecture**: routing is pure Python, so it is deterministic, adds no LLM latency, and carries no hallucination risk in the routing step. The LLM is called exactly once per turn, only to generate the response text.

---

## Features

### 🎙️ Whisper.cpp STT Integration

Offline speech-to-text via a `whisper-cli` subprocess. The bridge accepts base64-encoded WAV from the Android APK, decodes it to a temp file, runs `whisper-cli` with the `ggml-base.bin` model, strips noise tokens (`[BLANK_AUDIO]`, `debugfs`, `MEMPROF`), and returns clean transcript text. There is no cloud STT dependency.
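The decode, transcribe, and clean-up flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the bridge's actual code: the helper names `strip_noise_tokens` and `transcribe_wav_b64` are hypothetical, the paths come from the expected Termux layout, and the `whisper-cli` flags used here (`-m`, `-f`, `-nt`) should be checked against your own whisper.cpp build. It also assumes `[BLANK_AUDIO]` appears inline in the transcript while `debugfs` and `MEMPROF` show up as whole log lines.

```python
import base64
import os
import subprocess
import tempfile

# Paths follow the expected Termux layout; adjust for your device.
WHISPER_CLI = os.path.expanduser("~/whisper.cpp/build/bin/whisper-cli")
WHISPER_MODEL = os.path.expanduser("~/whisper.cpp/models/ggml-base.bin")

# Assumption: [BLANK_AUDIO] is an inline token, while debugfs / MEMPROF
# mark whole log lines that get mixed into the captured output.
INLINE_NOISE = ("[BLANK_AUDIO]",)
LINE_NOISE = ("debugfs", "MEMPROF")


def strip_noise_tokens(raw: str) -> str:
    """Drop noisy log lines and inline noise tokens, then collapse whitespace."""
    kept = []
    for line in raw.splitlines():
        if any(marker in line for marker in LINE_NOISE):
            continue  # skip the entire log line
        for token in INLINE_NOISE:
            line = line.replace(token, "")
        kept.append(line)
    return " ".join(" ".join(kept).split())


def transcribe_wav_b64(wav_b64: str, timeout_s: float = 60.0) -> str:
    """Decode base64 WAV to a temp file, run whisper-cli, return clean text."""
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
        tmp.write(base64.b64decode(wav_b64))
        wav_path = tmp.name
    try:
        result = subprocess.run(
            [WHISPER_CLI, "-m", WHISPER_MODEL, "-f", wav_path, "-nt"],
            capture_output=True,
            text=True,
            timeout=timeout_s,
            check=False,
        )
        return strip_noise_tokens(result.stdout)
    finally:
        os.remove(wav_path)  # always clean up the temp WAV
```

Keeping the clean-up step in its own pure function makes it easy to unit-test without a device or a model file; only `transcribe_wav_b64` touches the filesystem and the subprocess.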
### 🧭 ARIA Agent Routing

Four specialist agents are dispatched by keyword matching, with no LLM routing call:

| Agent | Trigger keywords | Purpose |
|---|---|---|
| **VITA** | `giving up`, `want to die`, `hopeless`, `hurt myself` … | Crisis support |
| **SENTINEL** | `scam`, `bank account`, `transfer money`, `spf` … | Scam detection |
| **KRONOS** | `meeting`, `calendar`, `schedule`, `tomorrow` … | Calendar assistance |
| **MINA** | *(default)* | Stress / general emotional support |

### 🧠 Knowledge Base Integration

Reads `mina_knowledge.json` at runtime for:

- Crisis hotline numbers (SOS Lifeline, IMH), with phone and WhatsApp links
- Capability flags (`make_phone_call`, `send_whatsapp`, `check_calendar`, …)

Resources appended to VITA and SENTINEL replies are driven by the knowledge file, not hardcoded strings. Update the JSON to change the response; no code change is needed.

### 📋 Gap Logging & Autonomous Learning

Every time a user requests a capability MINA doesn't yet have, `log_gap()`:

1. Appends a structured entry to `gaps/gap_log.jsonl` (local, persistent)
2. POSTs to `ntfy.sh/{NTFY_TOPIC}` for real-time cloud sync

```json
{
  "timestamp": "2026-05-02T14:23:01",
  "gap_type": "make_phone_call",
  "user_request": "can you call SOS for me",
  "context": "User requested phone call to SOS",
  "status": "pending"
}
```

The `NTFY_TOPIC` env var controls the notification channel (default: `roar-imda-demo`). Gap notifications appear in the ntfy app with the tag `brain` for triage. Network failures are caught silently; the gap is always written locally first.

### 🔒 Sovereign & Offline-First

All inference runs on-device. The only outbound network call is the optional ntfy gap sync (non-blocking, non-critical path). No user speech or transcript data leaves the device during inference.

---

## Endpoints

### `GET /health`

Liveness probe. The Android APK polls this every 3 s at startup.

```json
{"status": "ok", "llama": true, "bridge": "v2"}
```

### `POST /completion`

Main inference endpoint.
Accepts two input modes:

**Mode A: Pre-transcribed text** (fast path):

```json
{"transcript": "I have a meeting tomorrow morning"}
```

**Mode B: Raw WAV audio** (whisper path):

```json
{
  "prompt": [{
    "prompt_string": "...",
    "multimodal_data": [""]
  }]
}
```

**Response**:

```json
{
  "reply": "Sure lah, let me check your calendar!",
  "content": "Sure lah, let me check your calendar!",
  "transcript": "I have a meeting tomorrow morning",
  "emotion": "neutral",
  "valence": 0.50,
  "arousal": 0.38,
  "dominance": 0.50,
  "agent": "KRONOS",
  "risk": "none",
  "elapsed": 1.84
}
```

---

## Configuration

| Env var | Default | Description |
|---|---|---|
| `LLAMA_URL` | `http://localhost:8080` | llama-server endpoint |
| `BRIDGE_PORT` | `8081` | Flask listen port |
| `MAX_TOKENS` | `256` | Max tokens for transcription call |
| `NTFY_TOPIC` | `roar-imda-demo` | ntfy.sh topic for gap sync |

---

## Deployment (Termux)

```bash
# Prerequisites on device
pkg install python whisper-cpp llama-cpp

# Clone and deploy
git clone https://huggingface.co/munyew/mina-bridge
cd mina-bridge

# Start bridge (watchdog via start_mina.sh)
nohup python3 bridge.py >> bridge.log 2>&1 &

# Or restart after update
pkill -f bridge.py && sleep 3 && nohup python3 bridge.py >> bridge.log 2>&1 &
```

Expected paths on Termux:

```
~/whisper.cpp/build/bin/whisper-cli
~/whisper.cpp/models/ggml-base.bin
~/meralion/meralion-3b-decoder-q8_0.gguf
~/meralion/mina_knowledge.json
~/meralion/gaps/gap_log.jsonl   ← auto-created
```

---

## Roadmap

| Priority | Gap | Solution |
|---|---|---|
| 🔴 Critical | Emotion detection upgrade | Replace VAD lookup table with [MERaLiON-SER-v1](https://huggingface.co/MERaLiON/MERaLiON-SER-v1) |
| 🟠 High | Singlish Mental Health ASR | Fine-tune MERaLiON-2-3B on v5 dataset (3240 audio files) |
| 🟠 High | Singapore Legal Domain ASR | Generate + fine-tune on CPF/HDB/PDPA domain |
| 🟡 Medium | Edge-optimised SER | Quantize MERaLiON-SER-v1 to INT8/TFLite < 200 MB |
| 🟡 Medium | Code-switched Singlish-Mandarin | Pending MNSC dataset from NUS |

---

## Citation

```bibtex
@software{mina_bridge_2026,
  title  = {MINA Bridge: Sovereign Edge AI Gateway for Singapore},
  author = {Loh, Mun Yew (Darren)},
  year   = {2026},
  url    = {https://huggingface.co/munyew/mina-bridge},
  note   = {Singapore AI Research — ATxSG 2026}
}
```

---

## Acknowledgements

Built on [MERaLiON-2-3B](https://huggingface.co/MERaLiON/MERaLiON-2-3B) by IMDA National Multimodal LLM Programme.
Speech transcription via [whisper.cpp](https://github.com/ggerganov/whisper.cpp).
On-device inference via [llama.cpp](https://github.com/ggerganov/llama.cpp).