Upload README.md with huggingface_hub
README.md
ADDED
@@ -0,0 +1,207 @@
---
language:
- en
- zh
- ms
- ta
license: apache-2.0
tags:
- singapore
- sovereign-ai
- edge-ai
- meralion
- singlish
- agentic-innovation
- android
- flask
- whisper
- on-device
pipeline_tag: text-generation
library_name: custom
---

# MINA Bridge v4: Sovereign Edge AI Gateway

**MINA** (My Intelligent National Assistant) is Singapore's sovereign edge AI companion, built on [MERaLiON-2-3B](https://huggingface.co/MERaLiON/MERaLiON-2-3B) by IMDA.

`mina-bridge` is the intelligence gateway between the MINA Android APK and the on-device MERaLiON model: a lightweight Flask server that handles speech transcription, rule-based agent routing, response generation, and autonomous gap logging, all running locally in a Termux environment with no cloud dependency for inference.

---

## Architecture

```
Android APK
    │  base64 WAV / pre-transcribed text
    ▼
mina-bridge (Flask :8081)
    ├── whisper-cli          → speech-to-text (offline)
    ├── route_agent()        → rule-based ARIA routing (no LLM call)
    ├── build_prompt()       → agent-specific focused prompt
    ├── llama-server :8080   → MERaLiON-2-3B GGUF inference
    ├── append_resources()   → hotlines from mina_knowledge.json
    └── log_gap() + ntfy     → autonomous cloud sync
```

**Option 3 architecture**: routing is pure Python, so it is deterministic, adds negligible latency, and carries no hallucination risk. The LLM is called exactly once per turn, only to generate the response text.
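
The single per-turn LLM call could look roughly like the sketch below. It assumes the bridge talks to llama-server through llama.cpp's HTTP `/completion` endpoint with a `prompt` and `n_predict` body and reads the `content` field of the reply. The helper names `build_llama_request` and `generate_reply` are hypothetical and not taken from `bridge.py`.

```python
import json
import urllib.request

LLAMA_URL = "http://localhost:8080"  # default LLAMA_URL from the Configuration table


def build_llama_request(prompt: str, max_tokens: int = 256) -> urllib.request.Request:
    """Build (but do not send) the one request made to llama-server per turn."""
    body = json.dumps({"prompt": prompt, "n_predict": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        f"{LLAMA_URL}/completion",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_reply(prompt: str, timeout: float = 60.0) -> str:
    """Send the request and return the generated text (assumed `content` field)."""
    with urllib.request.urlopen(build_llama_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["content"].strip()
```

Keeping request construction separate from the network call makes the routing and prompt-building steps testable without a running model.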

---

## Features

### Whisper.cpp STT Integration
Offline speech-to-text via a `whisper-cli` subprocess. The bridge accepts base64-encoded WAV from the Android APK, decodes it to a temp file, runs `ggml-base.bin`, strips noise tokens (`[BLANK_AUDIO]`, `debugfs`, `MEMPROF`), and returns clean transcript text. No cloud STT dependency.
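
A minimal sketch of this transcription path is shown below. The function names `transcribe_wav_b64` and `clean_transcript` are hypothetical. How the noise tokens are stripped is also an assumption: here, lines containing `debugfs` or `MEMPROF` are dropped as log noise and `[BLANK_AUDIO]` is removed inline. The `-m`, `-f`, and `-nt` (no timestamps) flags are standard `whisper-cli` options.

```python
import base64
import subprocess
import tempfile
from pathlib import Path

# Paths follow the "Expected paths on Termux" listing in this README.
WHISPER_CLI = Path.home() / "whisper.cpp/build/bin/whisper-cli"
WHISPER_MODEL = Path.home() / "whisper.cpp/models/ggml-base.bin"

# Assumption: these markers only appear on log lines, so the whole line is dropped.
NOISE_LINE_MARKERS = ("debugfs", "MEMPROF")


def clean_transcript(raw: str) -> str:
    """Remove noise tokens and collapse the output into one line of text."""
    kept = []
    for line in raw.splitlines():
        if any(marker in line for marker in NOISE_LINE_MARKERS):
            continue
        kept.append(line.replace("[BLANK_AUDIO]", "").strip())
    return " ".join(part for part in kept if part)


def transcribe_wav_b64(wav_b64: str, timeout: float = 60.0) -> str:
    """Decode base64 WAV to a temp file, run whisper-cli, return cleaned text."""
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
        tmp.write(base64.b64decode(wav_b64))
        wav_path = Path(tmp.name)
    try:
        result = subprocess.run(
            [str(WHISPER_CLI), "-m", str(WHISPER_MODEL), "-f", str(wav_path), "-nt"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        return clean_transcript(result.stdout)
    finally:
        wav_path.unlink(missing_ok=True)  # always remove the temp file
```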

### ARIA Agent Routing
Four specialist agents dispatched by keyword matching, with no LLM routing call:

| Agent | Trigger keywords | Purpose |
|---|---|---|
| **VITA** | `giving up`, `want to die`, `hopeless`, `hurt myself` … | Crisis support |
| **SENTINEL** | `scam`, `bank account`, `transfer money`, `spf` … | Scam detection |
| **KRONOS** | `meeting`, `calendar`, `schedule`, `tomorrow` … | Calendar assistance |
| **MINA** | *(default)* | Stress / general emotional support |
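
The routing table above can be sketched as a plain keyword lookup. This version uses only the keywords listed in the table (the real lists are longer, as the ellipses indicate). It also assumes case-insensitive substring matching and a first-match priority of VITA, then SENTINEL, then KRONOS, so that crisis phrases win over everything else. Neither assumption is confirmed by the source.

```python
# Keywords copied from the table above; the full lists in bridge.py are longer.
AGENT_KEYWORDS = {
    "VITA": ("giving up", "want to die", "hopeless", "hurt myself"),
    "SENTINEL": ("scam", "bank account", "transfer money", "spf"),
    "KRONOS": ("meeting", "calendar", "schedule", "tomorrow"),
}


def route_agent(transcript: str) -> str:
    """Return the first agent whose keyword appears in the transcript."""
    text = transcript.lower()
    # Dicts preserve insertion order, so VITA is always checked first.
    for agent, keywords in AGENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return agent
    return "MINA"  # default: stress / general emotional support


print(route_agent("I have a meeting tomorrow morning"))
```

Plain substring matching is simple but can over-trigger on short keywords such as `spf`. Word-boundary matching would be a stricter alternative.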

### Knowledge Base Integration
Reads `mina_knowledge.json` at runtime for:
- Crisis hotline numbers (SOS Lifeline, IMH): phone + WhatsApp links
- Capability flags (`make_phone_call`, `send_whatsapp`, `check_calendar`, …)

Resources appended to VITA and SENTINEL replies are driven by the knowledge file, not hardcoded strings. Update the JSON to update the response; no code change is needed.
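
The idea of knowledge-driven resources can be illustrated as follows. The JSON layout used here (`hotlines` entries with `name`, `phone`, `whatsapp`, and `agents` fields, plus a `capabilities` map) is a hypothetical schema for illustration only, and the sample entry uses placeholder values rather than real hotline numbers. The actual structure of `mina_knowledge.json` may differ.

```python
import json

# Hypothetical schema with placeholder data; not the real mina_knowledge.json.
SAMPLE_KNOWLEDGE = json.loads("""
{
  "hotlines": [
    {"name": "Example Helpline", "phone": "0000 0000",
     "whatsapp": "https://wa.me/0000000000", "agents": ["VITA"]}
  ],
  "capabilities": {"make_phone_call": false, "send_whatsapp": false,
                   "check_calendar": true}
}
""")


def append_resources(reply: str, agent: str, knowledge: dict) -> str:
    """Append the hotlines registered for this agent to the reply text."""
    lines = []
    for hotline in knowledge.get("hotlines", []):
        if agent not in hotline.get("agents", []):
            continue
        entry = f"{hotline['name']}: {hotline['phone']}"
        if hotline.get("whatsapp"):
            entry += f" (WhatsApp: {hotline['whatsapp']})"
        lines.append(entry)
    if not lines:
        return reply  # agents without resources get the reply unchanged
    return reply + "\n\n" + "\n".join(lines)
```

Because the reply text and the resource list are joined only at the end, editing the JSON changes what users see without touching the prompt or the routing code.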

### Gap Logging & Autonomous Learning
Every time a user requests a capability MINA doesn't yet have, `log_gap()`:
1. Appends a structured entry to `gaps/gap_log.jsonl` (local, persistent)
2. POSTs to `ntfy.sh/{NTFY_TOPIC}` for real-time cloud sync

```json
{
  "timestamp": "2026-05-02T14:23:01",
  "gap_type": "make_phone_call",
  "user_request": "can you call SOS for me",
  "context": "User requested phone call to SOS",
  "status": "pending"
}
```

The `NTFY_TOPIC` env var controls the notification channel (default: `roar-imda-demo`). Gap notifications appear in the ntfy app with the tag `brain` for triage. Network failures are caught silently; the gap is always written locally first.
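
A hedged sketch of the local-first behaviour described above: the entry is appended to the JSONL file before any network call, and a failing notification is swallowed. The signature of `log_gap`, the `post_to_ntfy` helper, and the notification title are assumptions. The `Title` and `Tags` headers are standard ntfy publishing headers.

```python
import json
import os
import urllib.request
from datetime import datetime
from pathlib import Path

NTFY_TOPIC = os.environ.get("NTFY_TOPIC", "roar-imda-demo")


def post_to_ntfy(entry: dict) -> None:
    """Publish the gap entry to ntfy.sh (hypothetical helper)."""
    req = urllib.request.Request(
        f"https://ntfy.sh/{NTFY_TOPIC}",
        data=json.dumps(entry).encode("utf-8"),
        headers={"Title": f"MINA gap: {entry['gap_type']}", "Tags": "brain"},
        method="POST",
    )
    urllib.request.urlopen(req, timeout=5).close()


def log_gap(gap_type, user_request, context, log_path, notify=post_to_ntfy) -> dict:
    """Write the gap locally first, then attempt the cloud sync."""
    entry = {
        "timestamp": datetime.now().isoformat(timespec="seconds"),
        "gap_type": gap_type,
        "user_request": user_request,
        "context": context,
        "status": "pending",
    }
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)  # auto-create gaps/
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    try:
        notify(entry)
    except Exception:
        pass  # network failures never affect the local record
    return entry
```

This sketch calls `notify` synchronously with a short timeout. To keep the sync off the response path, the call could instead be handed to a background thread.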

### Sovereign & Offline-First
All inference runs on-device. The only outbound network call is the optional ntfy gap sync (non-blocking, non-critical path). No user speech or transcript data leaves the device during inference.

---

## Endpoints

### `GET /health`
Liveness probe. The Android APK polls this endpoint every 3 s at startup.
```json
{"status": "ok", "llama": true, "bridge": "v2"}
```
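
The startup polling can be sketched as below. The APK itself is an Android app, so this Python version only illustrates the loop and is not the APK's code. Treating `"llama": true` as "model ready" is an assumption, and `wait_for_bridge` and `http_probe` are hypothetical names.

```python
import json
import time
import urllib.request


def http_probe(url: str = "http://localhost:8081/health") -> dict:
    """Fetch and decode the /health JSON body."""
    with urllib.request.urlopen(url, timeout=2) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_bridge(probe=http_probe, interval_s=3.0, max_attempts=20,
                    sleep=time.sleep) -> bool:
    """Poll until the bridge reports ok and llama-server is up, or give up."""
    for attempt in range(max_attempts):
        try:
            status = probe()
            if status.get("status") == "ok" and status.get("llama"):
                return True
        except (OSError, ValueError):
            pass  # bridge not listening yet, or body was not valid JSON
        if attempt < max_attempts - 1:
            sleep(interval_s)
    return False
```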

### `POST /completion`
Main inference endpoint. Accepts two input modes:

**Mode A - Pre-transcribed text** (fast path):
```json
{"transcript": "I have a meeting tomorrow morning"}
```

**Mode B - Raw WAV audio** (whisper path):
```json
{
  "prompt": [{
    "prompt_string": "...",
    "multimodal_data": ["<base64-WAV>"]
  }]
}
```

**Response**:
```json
{
  "reply": "Sure lah, let me check your calendar!",
  "content": "Sure lah, let me check your calendar!",
  "transcript": "I have a meeting tomorrow morning",
  "emotion": "neutral",
  "valence": 0.50,
  "arousal": 0.38,
  "dominance": 0.50,
  "agent": "KRONOS",
  "risk": "none",
  "elapsed": 1.84
}
```
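
A small client sketch for the two input modes follows. The payload shapes mirror the JSON examples above. The helper names and the default `BRIDGE_URL` are assumptions, and `prompt_string` is passed through as a parameter because its expected content is not specified here.

```python
import base64
import json
import urllib.request

BRIDGE_URL = "http://localhost:8081"  # default BRIDGE_PORT from the Configuration table


def text_payload(transcript: str) -> dict:
    """Mode A: pre-transcribed text (fast path)."""
    return {"transcript": transcript}


def audio_payload(wav_bytes: bytes, prompt_string: str = "") -> dict:
    """Mode B: raw WAV audio, base64-encoded (whisper path)."""
    encoded = base64.b64encode(wav_bytes).decode("ascii")
    return {"prompt": [{"prompt_string": prompt_string, "multimodal_data": [encoded]}]}


def post_completion(payload: dict, timeout: float = 120.0) -> dict:
    """POST a payload to /completion and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{BRIDGE_URL}/completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the bridge running, `post_completion(text_payload("I have a meeting tomorrow morning"))` would be expected to return a dictionary with the fields shown in the response example.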

---

## Configuration

| Env var | Default | Description |
|---|---|---|
| `LLAMA_URL` | `http://localhost:8080` | llama-server endpoint |
| `BRIDGE_PORT` | `8081` | Flask listen port |
| `MAX_TOKENS` | `256` | Max tokens for transcription call |
| `NTFY_TOPIC` | `roar-imda-demo` | ntfy.sh topic for gap sync |
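
One way to read these variables with the documented defaults is sketched below. The `BridgeConfig` dataclass and `load_config` function are hypothetical names, and `bridge.py` may simply read `os.environ` inline.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgeConfig:
    llama_url: str
    bridge_port: int
    max_tokens: int
    ntfy_topic: str


def load_config(env=None) -> BridgeConfig:
    """Read settings from the environment, falling back to the table defaults."""
    env = os.environ if env is None else env
    return BridgeConfig(
        llama_url=env.get("LLAMA_URL", "http://localhost:8080"),
        bridge_port=int(env.get("BRIDGE_PORT", "8081")),
        max_tokens=int(env.get("MAX_TOKENS", "256")),
        ntfy_topic=env.get("NTFY_TOPIC", "roar-imda-demo"),
    )


print(load_config({"BRIDGE_PORT": "9000"}))
```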

---

## Deployment (Termux)

```bash
# Prerequisites on device
pkg install python whisper-cpp llama-cpp

# Clone and deploy
git clone https://huggingface.co/munyew/mina-bridge
cd mina-bridge

# Start bridge (watchdog via start_mina.sh)
nohup python3 bridge.py >> bridge.log 2>&1 &

# Or restart after update
# (";" rather than "&&" so the bridge still starts when no old process was running)
pkill -f bridge.py; sleep 3; nohup python3 bridge.py >> bridge.log 2>&1 &
```

Expected paths on Termux:
```
~/whisper.cpp/build/bin/whisper-cli
~/whisper.cpp/models/ggml-base.bin
~/meralion/meralion-3b-decoder-q8_0.gguf
~/meralion/mina_knowledge.json
~/meralion/gaps/gap_log.jsonl   ← auto-created
```

---

## Roadmap

| Priority | Gap | Solution |
|---|---|---|
| 🔴 Critical | Emotion detection upgrade | Replace VAD lookup table with [MERaLiON-SER-v1](https://huggingface.co/MERaLiON/MERaLiON-SER-v1) |
| 🟠 High | Singlish Mental Health ASR | Fine-tune MERaLiON-2-3B on v5 dataset (3240 audio files) |
| 🟠 High | Singapore Legal Domain ASR | Generate + fine-tune on CPF/HDB/PDPA domain |
| 🟡 Medium | Edge-optimised SER | Quantize MERaLiON-SER-v1 to INT8/TFLite < 200 MB |
| 🟡 Medium | Code-switched Singlish-Mandarin | Pending MNSC dataset from IMDA/NUS |

---

## Citation

```bibtex
@software{mina_bridge_2026,
  title = {MINA Bridge: Sovereign Edge AI Gateway for Singapore},
  author = {Loh, Mun Yew (Darren)},
  year = {2026},
  url = {https://huggingface.co/munyew/mina-bridge},
  note = {IMDA NMLP - ATxSG 2026}
}
```

---

## Acknowledgements

Built on [MERaLiON-2-3B](https://huggingface.co/MERaLiON/MERaLiON-2-3B) by the IMDA National Multimodal LLM Programme.
Speech transcription via [whisper.cpp](https://github.com/ggerganov/whisper.cpp).
On-device inference via [llama.cpp](https://github.com/ggerganov/llama.cpp).