---

language:
- en
- zh
- ms
- ta
license: apache-2.0
tags:
- singapore
- sovereign-ai
- edge-ai
- meralion
- singlish
- agentic-innovation
- android
- flask
- whisper
- on-device
pipeline_tag: text-generation
library_name: custom
---


# MINA Bridge v4: Sovereign Edge AI Gateway

**MINA** (My Intelligent National Assistant) is Singapore's sovereign edge AI companion, built on [MERaLiON-2-3B](https://huggingface.co/MERaLiON/MERaLiON-2-3B) by IMDA.

`mina-bridge` is the intelligence gateway between the MINA Android APK and the on-device MERaLiON model: a lightweight Flask server that handles speech transcription, rule-based agent routing, response generation, and autonomous gap logging, all running locally in a Termux environment with no cloud dependency for inference.

---

## Architecture

```
Android APK
    │  base64 WAV / pre-transcribed text
    ▼
mina-bridge  (Flask :8081)
    ├── whisper-cli  ← speech-to-text (offline)
    ├── route_agent()  ← rule-based ARIA routing (no LLM call)
    ├── build_prompt()  ← agent-specific focused prompt
    ├── llama-server :8080  ← MERaLiON-2-3B GGUF inference
    ├── append_resources()  ← hotlines from mina_knowledge.json
    └── log_gap() + ntfy  ← autonomous cloud sync
```

**Option 3 architecture**: routing is pure Python, so it is deterministic, adds negligible latency, and carries no hallucination risk in the routing step. The LLM is called exactly once per turn, only to generate the response text.

---

## Features

### 🎙️ Whisper.cpp STT Integration
Offline speech-to-text via `whisper-cli` subprocess. Accepts base64-encoded WAV from the Android APK, decodes to a temp file, runs `ggml-base.bin`, strips noise tokens (`[BLANK_AUDIO]`, `debugfs`, `MEMPROF`), and returns clean transcript text. No cloud STT dependency.
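
The steps described above (decode, transcribe, strip noise) could look roughly like the following. This is a minimal sketch, not the code in `bridge.py`: the function names, the 60-second timeout, and the choice to drop whole lines containing `debugfs` or `MEMPROF` are assumptions. The `-m`, `-f`, and `-nt` flags are whisper.cpp CLI options for model path, input file, and no timestamps.

```python
import base64
import os
import re
import subprocess
import tempfile

# Noise markers named in this README. Dropping the whole line for debugfs/MEMPROF
# is an assumption about how the real bridge filters its output.
NOISE_PATTERN = re.compile(
    r"^.*(?:debugfs|MEMPROF).*$|\[BLANK_AUDIO\]", re.MULTILINE
)


def clean_transcript(raw: str) -> str:
    """Remove noise tokens and log lines, then collapse whitespace."""
    return " ".join(NOISE_PATTERN.sub(" ", raw).split())


def transcribe_wav_b64(wav_b64: str, cli: str, model: str) -> str:
    """Decode a base64 WAV to a temp file, run whisper-cli, return clean text."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(base64.b64decode(wav_b64))
        result = subprocess.run(
            [cli, "-m", model, "-f", path, "-nt"],
            capture_output=True,
            text=True,
            timeout=60,  # hypothetical limit, not taken from the project
        )
        return clean_transcript(result.stdout)
    finally:
        os.remove(path)  # always delete the temp WAV, even on failure
```

Keeping `clean_transcript` separate from the subprocess call makes the filtering testable without a whisper build on the machine.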

### 🧭 ARIA Agent Routing
Four specialist agents dispatched by keyword matching, with no LLM routing call:

| Agent | Trigger keywords | Purpose |
|---|---|---|
| **VITA** | `giving up`, `want to die`, `hopeless`, `hurt myself` … | Crisis support |
| **SENTINEL** | `scam`, `bank account`, `transfer money`, `spf` … | Scam detection |
| **KRONOS** | `meeting`, `calendar`, `schedule`, `tomorrow` … | Calendar assistance |
| **MINA** | *(default)* | Stress / general emotional support |
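
To make the routing table concrete, here is a minimal sketch of keyword dispatch. It uses only the keywords listed in the table (the full lists are elided in this README), and the priority order, with VITA checked first so that crisis phrases win over other matches, is an assumption rather than confirmed behaviour of `route_agent()`.

```python
# Keywords copied from the table above; the real lists are longer.
# Order matters: earlier agents take priority (assumed, not confirmed).
AGENT_KEYWORDS = [
    ("VITA", ("giving up", "want to die", "hopeless", "hurt myself")),
    ("SENTINEL", ("scam", "bank account", "transfer money", "spf")),
    ("KRONOS", ("meeting", "calendar", "schedule", "tomorrow")),
]


def route_agent(transcript: str) -> str:
    """Return the first agent with a keyword in the transcript, else MINA."""
    text = transcript.lower()
    for agent, keywords in AGENT_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return agent
    return "MINA"  # default agent for general emotional support
```

Plain substring matching is cheap and predictable, but short keywords such as `spf` can also match inside longer words, so a production matcher may want word-boundary checks.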

### 🧠 Knowledge Base Integration
Reads `mina_knowledge.json` at runtime for:
- Crisis hotline numbers (SOS Lifeline, IMH), with phone and WhatsApp links
- Capability flags (`make_phone_call`, `send_whatsapp`, `check_calendar`, …)

Resources appended to VITA and SENTINEL replies are driven by the knowledge file, not hardcoded strings. Update the JSON to update the response; no code change is needed.
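
A data-driven `append_resources()` might be sketched as follows. The layout of `mina_knowledge.json` is not shown in this README, so the `hotlines` key, the per-agent grouping, the field names, and the placeholder contact details are all hypothetical.

```python
import json

# Hypothetical schema and placeholder values, for illustration only.
SAMPLE_KNOWLEDGE = json.loads("""
{
  "hotlines": {
    "VITA": [
      {"name": "Example Helpline", "phone": "0000 0000",
       "whatsapp": "https://wa.me/0000000"}
    ]
  }
}
""")


def append_resources(reply: str, agent: str, knowledge: dict) -> str:
    """Append the agent's hotline entries to the reply, if any exist."""
    entries = knowledge.get("hotlines", {}).get(agent, [])
    if not entries:
        return reply  # agents without resources get the reply unchanged
    lines = [
        f"- {entry['name']}: {entry['phone']} (WhatsApp: {entry['whatsapp']})"
        for entry in entries
    ]
    return reply + "\n\n" + "\n".join(lines)
```

Because the function only reads from the dictionary it is given, editing the JSON file changes the appended text without touching the code.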

### 📋 Gap Logging & Autonomous Learning
Every time a user requests a capability MINA doesn't yet have, `log_gap()`:
1. Appends a structured entry to `gaps/gap_log.jsonl` (local, persistent)
2. POSTs to `ntfy.sh/{NTFY_TOPIC}` for real-time cloud sync

```json
{
  "timestamp": "2026-05-02T14:23:01",
  "gap_type": "make_phone_call",
  "user_request": "can you call SOS for me",
  "context": "User requested phone call to SOS",
  "status": "pending"
}
```

The `NTFY_TOPIC` env var controls the notification channel (default: `roar-imda-demo`). Gap notifications appear in the ntfy app with tag `brain` for triage. Network failures are caught silently; the gap is always written locally first.
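
The local-first behaviour described here can be sketched like this. It is an illustration under assumptions, not the project's `log_gap()`: the function signature, the injectable `notify` parameter, the `Title` header value, and the 5-second timeout are all hypothetical. The `Tags` header is how ntfy attaches tags to a published message.

```python
import json
import os
import urllib.request
from datetime import datetime
from pathlib import Path

NTFY_TOPIC = os.environ.get("NTFY_TOPIC", "roar-imda-demo")


def notify_ntfy(entry: dict, topic: str = NTFY_TOPIC) -> None:
    """POST the gap entry to ntfy.sh with the `brain` tag."""
    request = urllib.request.Request(
        f"https://ntfy.sh/{topic}",
        data=json.dumps(entry).encode("utf-8"),
        headers={"Title": "MINA gap", "Tags": "brain"},  # Title is assumed
        method="POST",
    )
    urllib.request.urlopen(request, timeout=5).close()


def log_gap(gap_type, user_request, context, log_path, notify=notify_ntfy):
    """Write the gap to the JSONL file first, then attempt the cloud sync."""
    entry = {
        "timestamp": datetime.now().isoformat(timespec="seconds"),
        "gap_type": gap_type,
        "user_request": user_request,
        "context": context,
        "status": "pending",
    }
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    try:
        notify(entry)
    except OSError:
        pass  # urllib's URLError and socket timeouts are OSError subclasses
    return entry
```

Writing the file before the network call means a failed or slow sync can never lose a gap entry.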

### 🔒 Sovereign & Offline-First
All inference runs on-device. The only outbound network call is the optional ntfy gap sync (non-blocking, non-critical path). No user speech or transcript data leaves the device during inference.

---

## Endpoints

### `GET /health`
Liveness probe. Android APK polls this at startup every 3 s.
```json
{"status": "ok", "llama": true, "bridge": "v2"}
```
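
For checking the bridge from a development machine, a polling loop similar to the APK's startup probe might look like the sketch below. The function names, the attempt limit, and the reading of `"llama": true` as "llama-server is reachable" are assumptions.

```python
import json
import time
import urllib.request


def fetch_health(base_url: str) -> dict:
    """GET /health and parse the JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=2) as response:
        return json.load(response)


def wait_until_ready(base_url, fetch=fetch_health, interval=3.0,
                     attempts=10, sleep=time.sleep) -> bool:
    """Poll until the bridge reports ok and llama is true, or give up."""
    for _ in range(attempts):
        try:
            health = fetch(base_url)
            if health.get("status") == "ok" and health.get("llama"):
                return True
        except (OSError, ValueError):
            pass  # bridge not up yet, or body was not valid JSON
        sleep(interval)
    return False
```

Calling `wait_until_ready("http://localhost:8081")` uses the default port from the Configuration table and the 3 s interval mentioned above.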

### `POST /completion`
Main inference endpoint. Accepts two input modes:

**Mode A: Pre-transcribed text** (fast path):
```json
{"transcript": "I have a meeting tomorrow morning"}
```

**Mode B: Raw WAV audio** (whisper path):
```json
{
  "prompt": [{
    "prompt_string": "...",
    "multimodal_data": ["<base64-WAV>"]
  }]
}
```

**Response**:
```json
{
  "reply":      "Sure lah, let me check your calendar!",
  "content":    "Sure lah, let me check your calendar!",
  "transcript": "I have a meeting tomorrow morning",
  "emotion":    "neutral",
  "valence":    0.50,
  "arousal":    0.38,
  "dominance":  0.50,
  "agent":      "KRONOS",
  "risk":       "none",
  "elapsed":    1.84
}
```
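
A small test client for both input modes could be sketched as follows. The helper names, the empty default for `prompt_string` (its real value is elided above), and the 60-second timeout are assumptions made for illustration.

```python
import base64
import json
import urllib.request


def text_payload(transcript: str) -> dict:
    """Mode A: send text that was already transcribed."""
    return {"transcript": transcript}


def audio_payload(wav_bytes: bytes, prompt_string: str = "") -> dict:
    """Mode B: wrap raw WAV bytes as base64 in the multimodal envelope."""
    encoded = base64.b64encode(wav_bytes).decode("ascii")
    return {"prompt": [{"prompt_string": prompt_string,
                        "multimodal_data": [encoded]}]}


def post_completion(payload: dict,
                    base_url: str = "http://localhost:8081") -> dict:
    """POST a payload to /completion and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{base_url}/completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

With the bridge running, `post_completion(text_payload("I have a meeting tomorrow morning"))` should return a dictionary with the fields shown in the response example.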

---

## Configuration

| Env var | Default | Description |
|---|---|---|
| `LLAMA_URL` | `http://localhost:8080` | llama-server endpoint |
| `BRIDGE_PORT` | `8081` | Flask listen port |
| `MAX_TOKENS` | `256` | Max tokens for transcription call |
| `NTFY_TOPIC` | `roar-imda-demo` | ntfy.sh topic for gap sync |
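
Reading these variables with their documented defaults might look like the sketch below. The `load_config` helper and its dictionary keys are hypothetical; only the variable names and defaults come from the table.

```python
import os


def load_config(env=os.environ) -> dict:
    """Collect bridge settings, falling back to the documented defaults."""
    return {
        "llama_url": env.get("LLAMA_URL", "http://localhost:8080"),
        "bridge_port": int(env.get("BRIDGE_PORT", "8081")),
        "max_tokens": int(env.get("MAX_TOKENS", "256")),
        "ntfy_topic": env.get("NTFY_TOPIC", "roar-imda-demo"),
    }
```

Passing the environment as a parameter keeps the function easy to exercise with a plain dictionary, for example `load_config({"BRIDGE_PORT": "9000"})`.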

---

## Deployment (Termux)

```bash
# Prerequisites on device
pkg install python whisper-cpp llama-cpp

# Clone and deploy
git clone https://huggingface.co/munyew/mina-bridge
cd mina-bridge

# Start bridge (watchdog via start_mina.sh)
nohup python3 bridge.py >> bridge.log 2>&1 &

# Or restart after update
pkill -f bridge.py && sleep 3 && nohup python3 bridge.py >> bridge.log 2>&1 &
```

Expected paths on Termux:
```
~/whisper.cpp/build/bin/whisper-cli
~/whisper.cpp/models/ggml-base.bin
~/meralion/meralion-3b-decoder-q8_0.gguf
~/meralion/mina_knowledge.json
~/meralion/gaps/gap_log.jsonl          ← auto-created
```

---

## Roadmap

| Priority | Gap | Solution |
|---|---|---|
| 🔴 Critical | Emotion detection upgrade | Replace VAD lookup table with [MERaLiON-SER-v1](https://huggingface.co/MERaLiON/MERaLiON-SER-v1) |
| 🟠 High | Singlish Mental Health ASR | Fine-tune MERaLiON-2-3B on v5 dataset (3240 audio files) |
| 🟠 High | Singapore Legal Domain ASR | Generate + fine-tune on CPF/HDB/PDPA domain |
| 🟡 Medium | Edge-optimised SER | Quantize MERaLiON-SER-v1 to INT8/TFLite < 200 MB |
| 🟡 Medium | Code-switched Singlish-Mandarin | Pending MNSC dataset from NUS |

---

## Citation

```bibtex
@software{mina_bridge_2026,
  title   = {MINA Bridge: Sovereign Edge AI Gateway for Singapore},
  author  = {Loh, Mun Yew (Darren)},
  year    = {2026},
  url     = {https://huggingface.co/munyew/mina-bridge},
  note    = {Singapore AI Research - ATxSG 2026}
}
```

---

## Acknowledgements

Built on [MERaLiON-2-3B](https://huggingface.co/MERaLiON/MERaLiON-2-3B) by IMDA National Multimodal LLM Programme.  
Speech transcription via [whisper.cpp](https://github.com/ggerganov/whisper.cpp).  
On-device inference via [llama.cpp](https://github.com/ggerganov/llama.cpp).