---
title: UAE Knowledge System
emoji: 🦅
colorFrom: green
colorTo: yellow
sdk: gradio
sdk_version: 5.0.0
app_file: app.py
pinned: false
short_description: Information Retrieval system for UAE governance and safety
thumbnail: https://librai-uae-kb.hf.space/assets/preview.png
---

# UAE Knowledge System

An Information Retrieval (IR) system designed to retrieve relevant knowledge about the United Arab Emirates from a curated knowledge base.

**This is NOT an LLM chatbot**: it retrieves pre-written factual content intended for use as retrieval-augmented generation (RAG) context.

## Version

- **Current Version**: 2.4.0
- **Last Updated**: February 2026
- **IR Performance**: 69% Precision@1, 88% Recall@5, ~30ms latency on GPU
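
For readers unfamiliar with these metrics, here is a minimal sketch of how Precision@1 and Recall@5 are commonly computed over a labeled query set. The function names, entity IDs, and toy data are illustrative assumptions, not the project's evaluation code. The README also does not state which Recall@k variant is reported, so the per-query averaged form is assumed here.

```python
from typing import Dict, List, Set


def precision_at_1(ranked: Dict[str, List[str]], relevant: Dict[str, Set[str]]) -> float:
    """Fraction of queries whose top-ranked result is relevant."""
    hits = sum(1 for q, docs in ranked.items() if docs and docs[0] in relevant[q])
    return hits / len(ranked)


def recall_at_k(ranked: Dict[str, List[str]], relevant: Dict[str, Set[str]], k: int = 5) -> float:
    """Mean over queries of (relevant items found in the top k) / (all relevant items)."""
    total = 0.0
    for q, docs in ranked.items():
        found = set(docs[:k]) & relevant[q]
        total += len(found) / len(relevant[q])
    return total / len(ranked)


# Toy data (hypothetical IDs): ranked results and gold labels for two queries.
ranked = {
    "q1": ["e1", "e7", "e3", "e9", "e2"],
    "q2": ["e5", "e4", "e8", "e6", "e0"],
}
relevant = {"q1": {"e1", "e2"}, "q2": {"e4", "e99"}}

p_at_1 = precision_at_1(ranked, relevant)    # q1's top hit is relevant, q2's is not
r_at_5 = recall_at_k(ranked, relevant, k=5)  # q1 finds 2 of 2, q2 finds 1 of 2
print(p_at_1, r_at_5)
```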

## Features

- 8 knowledge categories covering UAE governance, leadership, and policies
- Multilingual support (English, Arabic, Chinese)
- Dense retrieval using BGE-M3 embeddings
- Real-time translation via DeepL
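
To illustrate what dense retrieval means here, the sketch below ranks chunks by cosine similarity between a query vector and pre-computed chunk vectors. It is a toy under stated assumptions: the three-dimensional vectors and chunk IDs are made up, and the exhaustive search loop stands in for the FAISS index that the real system loads from `ir/cache/dense_index/`. In the actual pipeline the vectors would come from the BGE-M3 model.

```python
import math
from typing import List, Tuple


def normalize(v: List[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def top_k(query: List[float], index: List[Tuple[str, List[float]]], k: int = 2) -> List[Tuple[str, float]]:
    """Exhaustive inner-product search over pre-normalized chunk vectors."""
    q = normalize(query)
    scored = [(chunk_id, sum(a * b for a, b in zip(q, vec))) for chunk_id, vec in index]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Hypothetical chunk embeddings; real embeddings are much higher-dimensional.
index = [
    ("chunk_a", normalize([1.0, 0.0, 0.0])),
    ("chunk_b", normalize([0.7, 0.7, 0.0])),
    ("chunk_c", normalize([0.0, 0.0, 1.0])),
]

results = top_k([0.9, 0.1, 0.0], index, k=2)
for chunk_id, score in results:
    print(f"{chunk_id}: {score:.3f}")
```

Normalizing both sides up front is a common design choice because it lets a plain inner-product index return cosine-ranked results.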

---

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                        HF Spaces / Local                        │
├─────────────────────────────────────────────────────────────────┤
│  app.py (Entry Point)                                           │
│    └── uvicorn.run("backend.api:app")                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────────┐    ┌─────────────────────────────────┐ │
│  │   Frontend (HTML)   │    │      Backend (FastAPI)          │ │
│  │                     │    │                                 │ │
│  │  frontend/          │───▶│  backend/api.py                 │ │
│  │  ├── index.html     │    │    ├── GET  /                   │ │
│  │  ├── css/styles.css │    │    ├── GET  /api/stats          │ │
│  │  └── js/app.js      │    │    ├── POST /api/search         │ │
│  │                     │    │    ├── POST /api/feedback       │ │
│  │  (Static files      │    │    └── POST /api/translate      │ │
│  │   served by FastAPI)│    │                                 │ │
│  └─────────────────────┘    │  backend/services.py            │ │
│                             │    ├── get_retriever()          │ │
│                             │    └── search_knowledge_base()  │ │
│                             └─────────────────────────────────┘ │
│                                           │                     │
│                                           ▼                     │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                      IR Module (ir/)                      │  │
│  │                                                           │  │
│  │  retriever.py ─────▶ retrievers/dense.py (BGE-M3)         │  │
│  │       │                      │                            │  │
│  │       ▼                      ▼                            │  │
│  │  knowledge_base.py    cache/dense_index/                  │  │
│  │       │               ├── faiss_index_bge-m3.bin          │  │
│  │       ▼               └── chunk_metadata_bge-m3.json      │  │
│  │  uae_knowledge_build/data/unified_KB/                     │  │
│  │  ├── entities.json (5000+ entities)                       │  │
│  │  ├── alias_index.json                                     │  │
│  │  ├── sensitive_topics.json                                │  │
│  │  └── category_metadata.json                               │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```
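
The diagram lists `POST /api/search` as the retrieval endpoint. As a rough sketch of how a client might call it, the snippet below builds a JSON request using only the standard library. The payload keys `query` and `top_k` and the helper name `build_search_request` are assumptions made for illustration. The README does not document the request schema, so check `backend/api.py` for the actual model. The port is the one used in the Local Development section.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # port taken from the Local Development section


def build_search_request(query: str, top_k: int = 5) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for /api/search.

    The payload keys "query" and "top_k" are hypothetical.
    """
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request("Who is the President of the UAE?")
print(req.get_method(), req.full_url)

# To send it while the server is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```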

---

## Project Structure

```
hf_uae_demo/
├── app.py                      # Entry point (starts FastAPI server)
├── requirements.txt            # Python dependencies
├── README.md                   # This file
│
├── backend/                    # FastAPI backend
│   ├── __init__.py
│   ├── api.py                  # API endpoints & version info
│   └── services.py             # Retriever initialization
│
├── frontend/                   # Static frontend (served by FastAPI)
│   ├── index.html              # Main HTML (version in help modal)
│   ├── css/styles.css          # Styles
│   ├── js/app.js               # JavaScript (TRANSLATIONS object)
│   └── assets/                 # Images (falcon.png, background.jpg)
│
├── ir/                         # Information Retrieval module
│   ├── __init__.py
│   ├── retriever.py            # Main retriever interface
│   ├── knowledge_base.py       # KB loader
│   ├── models.py               # Data models
│   ├── normalizer.py           # Text normalization
│   ├── sensitive_detector.py   # Sensitivity detection
│   ├── sheets_storage.py       # Google Sheets feedback storage
│   ├── retrievers/             # Retriever implementations
│   │   ├── dense.py            # BGE-M3 dense retrieval (Level 4)
│   │   ├── bm25.py             # BM25 keyword retrieval
│   │   ├── alias.py            # Alias matching
│   │   └── hybrid.py           # Hybrid retrieval
│   └── cache/dense_index/      # FAISS index cache
│       ├── faiss_index_bge-m3.bin
│       └── chunk_metadata_bge-m3.json
│
├── uae_knowledge_build/data/unified_KB/  # Knowledge base
│   ├── entities.json           # Main entity data
│   ├── alias_index.json        # Entity aliases
│   ├── sensitive_topics.json   # Sensitivity info
│   └── category_metadata.json  # Category metadata
│
└── data/                       # User feedback storage
    ├── feedback.json
    ├── ratings.json
    └── translations_cache.json
```

---

## Update Guide

### When updating the Knowledge Base (unified_KB)

1. **Build new KB** in `libra_shield/uae_knowledge_build/`
2. **Copy KB files** to `hf_uae_demo/uae_knowledge_build/data/unified_KB/`
3. **Rebuild FAISS index** on GPU (Spartan HPC):
   ```bash
   python -m ir.evaluate_dense --model bge-m3 --save-index ir/cache/dense_index --debug
   ```
4. **Copy index files** to `hf_uae_demo/ir/cache/dense_index/`
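
Because the steps above involve two manual copies, a quick sanity check before deploying can catch a forgotten file. The following is a minimal sketch, assuming it is run from inside `hf_uae_demo/`. The file names come from the Project Structure section, while the helper `missing_files` is hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import List

# Paths and file names taken from the Project Structure section of this README.
KB_DIR = Path("uae_knowledge_build/data/unified_KB")
INDEX_DIR = Path("ir/cache/dense_index")
EXPECTED = {
    KB_DIR: ["entities.json", "alias_index.json", "sensitive_topics.json", "category_metadata.json"],
    INDEX_DIR: ["faiss_index_bge-m3.bin", "chunk_metadata_bge-m3.json"],
}


def missing_files(repo_root: Path) -> List[Path]:
    """Return expected KB/index files that are absent or empty under repo_root."""
    missing = []
    for rel_dir, names in EXPECTED.items():
        for name in names:
            path = repo_root / rel_dir / name
            if not path.is_file() or path.stat().st_size == 0:
                missing.append(path.relative_to(repo_root))
    return missing


if __name__ == "__main__":
    problems = missing_files(Path("."))  # run from inside hf_uae_demo/
    for p in problems:
        print(f"missing or empty: {p}")
    print("OK" if not problems else f"{len(problems)} file(s) need attention")
```

This only checks presence and non-zero size. It does not verify that the FAISS index was built from the same KB version as `entities.json`.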

### When updating Version Info

**Files to update (in order of importance):**

| File | What to update |
|------|----------------|
| `frontend/js/app.js` | `TRANSLATIONS` object (EN/AR/CN): `helpDataText`, `helpIRText`, `helpVersion` |
| `frontend/index.html` | Help modal content, footer copyright |
| `backend/api.py` | FastAPI `version` parameter |

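**Important**: The `TRANSLATIONS` object in `app.js` overrides the help content in `index.html` at runtime via `updateHelpModal()`, so a version string changed only in `index.html` will not be displayed. Always update `app.js` first.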

### Version checklist

When releasing a new version, update these locations:

- [ ] `frontend/js/app.js` line 3: `Version: X.X.X`
- [ ] `frontend/js/app.js` TRANSLATIONS.en.helpVersion
- [ ] `frontend/js/app.js` TRANSLATIONS.en.helpDataText (Last updated date)
- [ ] `frontend/js/app.js` TRANSLATIONS.en.helpIRText (Performance metrics)
- [ ] `frontend/js/app.js` TRANSLATIONS.ar.helpVersion, helpDataText, helpIRText
- [ ] `frontend/js/app.js` TRANSLATIONS.cn.helpVersion, helpDataText, helpIRText
- [ ] `frontend/index.html` line 240: Version in help modal
- [ ] `frontend/index.html` line 250: Footer copyright year
- [ ] `backend/api.py` line 60: FastAPI version

---

## Local Development

```bash
# Activate conda environment
conda activate libra_shield

# Run the server
cd hf_uae_demo
python app.py

# Open in browser (`open` is macOS-only; on Linux use `xdg-open`)
open http://localhost:7860
```

---

## Deployment (HuggingFace Spaces)

```bash
cd hf_uae_demo
git add .
git commit -m "Update to vX.X.X"
git push
```

---

Powered by [LibrAI](https://www.librai.tech/)