---
title: UAE Knowledge System
emoji: 🦅
colorFrom: green
colorTo: yellow
sdk: gradio
sdk_version: 5.0.0
app_file: app.py
pinned: false
short_description: Information Retrieval system for UAE governance and safety
thumbnail: https://librai-uae-kb.hf.space/assets/preview.png
---
# UAE Knowledge System
An Information Retrieval (IR) system designed to retrieve relevant knowledge about the United Arab Emirates from a curated knowledge base.
**This is NOT an LLM chatbot.** It retrieves pre-written factual content intended for use as RAG context.
## Version
- **Current Version**: 2.4.0
- **Last Updated**: February 2026
- **IR Performance**: 69% Precision@1, 88% Recall@5, ~30ms latency on GPU
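The README does not describe the evaluation protocol behind these numbers. As a minimal sketch of how Precision@1 and Recall@5 are commonly computed, assuming each query has a known set of relevant entity IDs (the function names and toy data below are hypothetical, not taken from this repository):

```python
from typing import Dict, List, Set


def precision_at_1(ranked: Dict[str, List[str]], relevant: Dict[str, Set[str]]) -> float:
    """Fraction of queries whose top-ranked result is relevant."""
    hits = sum(
        1 for query, results in ranked.items()
        if results and results[0] in relevant[query]
    )
    return hits / len(ranked)


def recall_at_k(ranked: Dict[str, List[str]], relevant: Dict[str, Set[str]], k: int = 5) -> float:
    """Mean fraction of each query's relevant items found in the top k results."""
    per_query = [
        len(set(results[:k]) & relevant[query]) / len(relevant[query])
        for query, results in ranked.items()
    ]
    return sum(per_query) / len(per_query)


# Hypothetical toy data: query -> ranked entity IDs, query -> relevant entity IDs.
ranked = {
    "q1": ["e1", "e7", "e3"],
    "q2": ["e9", "e2", "e4"],
}
relevant = {"q1": {"e1"}, "q2": {"e2", "e5"}}

p1 = precision_at_1(ranked, relevant)    # q1 hits at rank 1, q2 does not
r5 = recall_at_k(ranked, relevant, k=5)  # q1 finds 1 of 1, q2 finds 1 of 2
print(f"Precision@1={p1:.2f}  Recall@5={r5:.2f}")
```

If the actual evaluation counts a query as recalled when any relevant item appears in the top 5, the second function would need a hit-or-miss score per query instead of a fraction.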
## Features
- 8 knowledge categories covering UAE governance, leadership, and policies
- Multilingual support (English, Arabic, Chinese)
- Dense retrieval using BGE-M3 embeddings
- Real-time translation via DeepL
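To illustrate what dense retrieval means here, the sketch below ranks chunks by cosine similarity between a query vector and chunk vectors. It is a toy under stated assumptions: the vectors are hand-written stand-ins for BGE-M3 embeddings, the search is brute force rather than a FAISS index, and the similarity metric used by this project is not stated in the README.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float], chunk_vecs: List[Sequence[float]], k: int = 5) -> List[Tuple[int, float]]:
    """Return (chunk index, score) pairs for the k most similar chunks."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(chunk_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical 3-dimensional "embeddings"; real embeddings have far more dimensions.
chunks = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
query = [1.0, 0.1, 0.0]

results = top_k(query, chunks, k=2)
for index, score in results:
    print(f"chunk {index}: {score:.3f}")
```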
---
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                        HF Spaces / Local                        │
├─────────────────────────────────────────────────────────────────┤
│  app.py (Entry Point)                                           │
│    └── uvicorn.run("backend.api:app")                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────────┐    ┌─────────────────────────────────┐ │
│  │  Frontend (HTML)    │    │  Backend (FastAPI)              │ │
│  │                     │    │                                 │ │
│  │  frontend/          │───▶│  backend/api.py                 │ │
│  │  ├── index.html     │    │  ├── GET  /                     │ │
│  │  ├── css/styles.css │    │  ├── GET  /api/stats            │ │
│  │  └── js/app.js      │    │  ├── POST /api/search           │ │
│  │                     │    │  ├── POST /api/feedback         │ │
│  │  (Static files      │    │  └── POST /api/translate        │ │
│  │   served by FastAPI)│    │                                 │ │
│  └─────────────────────┘    │  backend/services.py            │ │
│                             │  └── get_retriever()            │ │
│                             │  └── search_knowledge_base()    │ │
│                             └─────────────────────────────────┘ │
│                                              │                  │
│                                              ▼                  │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                      IR Module (ir/)                      │  │
│  │                                                           │  │
│  │  retriever.py ─────▶ retrievers/dense.py (BGE-M3)         │  │
│  │       │                   │                               │  │
│  │       ▼                   ▼                               │  │
│  │  knowledge_base.py   cache/dense_index/                   │  │
│  │       │              ├── faiss_index_bge-m3.bin           │  │
│  │       ▼              └── chunk_metadata_bge-m3.json       │  │
│  │  uae_knowledge_build/data/unified_KB/                     │  │
│  │  ├── entities.json (5000+ entities)                       │  │
│  │  ├── alias_index.json                                     │  │
│  │  ├── sensitive_topics.json                                │  │
│  │  └── category_metadata.json                               │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```
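The diagram lists a `POST /api/search` endpoint, but this README does not document its request schema. The sketch below shows how a client might build such a request with the standard library. The JSON field names `query` and `top_k` and the example question are assumptions for illustration; check `backend/api.py` for the real request model. The base URL uses port 7860, as in the Local Development section.

```python
import json
from urllib import request

BASE_URL = "http://localhost:7860"


def build_search_request(query: str, top_k: int = 5) -> request.Request:
    """Build (but do not send) a JSON POST request for the search endpoint."""
    # ASSUMPTION: "query" and "top_k" are hypothetical field names.
    payload = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return request.Request(
        f"{BASE_URL}/api/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request("Who is the President of the UAE?")
print(req.get_method(), req.full_url)

# To send it against a running server:
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```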
---
## Project Structure
```
hf_uae_demo/
├── app.py                      # Entry point (starts FastAPI server)
├── requirements.txt            # Python dependencies
├── README.md                   # This file
│
├── backend/                    # FastAPI backend
│   ├── __init__.py
│   ├── api.py                  # API endpoints & version info
│   └── services.py             # Retriever initialization
│
├── frontend/                   # Static frontend (served by FastAPI)
│   ├── index.html              # Main HTML (version in help modal)
│   ├── css/styles.css          # Styles
│   ├── js/app.js               # JavaScript (TRANSLATIONS object)
│   └── assets/                 # Images (falcon.png, background.jpg)
│
├── ir/                         # Information Retrieval module
│   ├── __init__.py
│   ├── retriever.py            # Main retriever interface
│   ├── knowledge_base.py       # KB loader
│   ├── models.py               # Data models
│   ├── normalizer.py           # Text normalization
│   ├── sensitive_detector.py   # Sensitivity detection
│   ├── sheets_storage.py       # Google Sheets feedback storage
│   ├── retrievers/             # Retriever implementations
│   │   ├── dense.py            # BGE-M3 dense retrieval (Level 4)
│   │   ├── bm25.py             # BM25 keyword retrieval
│   │   ├── alias.py            # Alias matching
│   │   └── hybrid.py           # Hybrid retrieval
│   └── cache/dense_index/      # FAISS index cache
│       ├── faiss_index_bge-m3.bin
│       └── chunk_metadata_bge-m3.json
│
├── uae_knowledge_build/data/unified_KB/  # Knowledge base
│   ├── entities.json           # Main entity data
│   ├── alias_index.json        # Entity aliases
│   ├── sensitive_topics.json   # Sensitivity info
│   └── category_metadata.json  # Category metadata
│
└── data/                       # User feedback storage
    ├── feedback.json
    ├── ratings.json
    └── translations_cache.json
```
---
## Update Guide
### When updating Knowledge Base (unified_KB)
1. **Build new KB** in `libra_shield/uae_knowledge_build/`
2. **Copy KB files** to `hf_uae_demo/uae_knowledge_build/data/unified_KB/`
3. **Rebuild FAISS index** on GPU (Spartan HPC):
```bash
python -m ir.evaluate_dense --model bge-m3 --save-index ir/cache/dense_index --debug
```
4. **Copy index files** to `hf_uae_demo/ir/cache/dense_index/`
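After copying, it can help to confirm that both index files actually landed in the cache directory before deploying. The following is a small sketch, not part of the repository: the helper name `missing_index_files` is hypothetical, and it only checks that each file exists and is non-empty, not that the index matches the current knowledge base.

```python
from pathlib import Path
from typing import List

# File names as listed under ir/cache/dense_index/ in the project structure.
EXPECTED_INDEX_FILES = ("faiss_index_bge-m3.bin", "chunk_metadata_bge-m3.json")


def missing_index_files(index_dir: Path) -> List[str]:
    """Return the expected index files that are absent or empty in index_dir."""
    return [
        name for name in EXPECTED_INDEX_FILES
        if not (index_dir / name).is_file() or (index_dir / name).stat().st_size == 0
    ]


if __name__ == "__main__":
    # Run from the hf_uae_demo/ directory.
    missing = missing_index_files(Path("ir/cache/dense_index"))
    print("OK" if not missing else f"Missing or empty: {missing}")
```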
### When updating Version Info
**Files to update (in order of importance):**
| File | What to update |
|------|----------------|
| `frontend/js/app.js` | `TRANSLATIONS` object (EN/AR/CN): `helpDataText`, `helpIRText`, `helpVersion` |
| `frontend/index.html` | Help modal content, footer copyright |
| `backend/api.py` | FastAPI `version` parameter |
**Important**: At runtime, `updateHelpModal()` overwrites the help modal content in `index.html` with values from the `TRANSLATIONS` object in `app.js`. Always update `app.js` first!
### Version checklist
When releasing a new version, update these locations:
- [ ] `frontend/js/app.js` line 3: `Version: X.X.X`
- [ ] `frontend/js/app.js` TRANSLATIONS.en.helpVersion
- [ ] `frontend/js/app.js` TRANSLATIONS.en.helpDataText (Last updated date)
- [ ] `frontend/js/app.js` TRANSLATIONS.en.helpIRText (Performance metrics)
- [ ] `frontend/js/app.js` TRANSLATIONS.ar.helpVersion, helpDataText, helpIRText
- [ ] `frontend/js/app.js` TRANSLATIONS.cn.helpVersion, helpDataText, helpIRText
- [ ] `frontend/index.html` line 240: Version in help modal
- [ ] `frontend/index.html` line 250: Footer copyright year
- [ ] `backend/api.py` line 60: FastAPI version
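Because the version string lives in several files, a quick scan can catch a location that was missed. The sketch below counts plain-text occurrences of a version string in the three files named in the checklist. It is a hypothetical helper, not an existing project script, and it cannot tell whether each occurrence sits in the right place.

```python
from pathlib import Path
from typing import Dict

# Files named in the version checklist, relative to hf_uae_demo/.
VERSIONED_FILES = ("frontend/js/app.js", "frontend/index.html", "backend/api.py")


def version_mentions(root: Path, version: str) -> Dict[str, int]:
    """Count occurrences of `version` in each file; -1 marks a missing file."""
    counts = {}
    for rel in VERSIONED_FILES:
        path = root / rel
        if path.is_file():
            counts[rel] = path.read_text(encoding="utf-8").count(version)
        else:
            counts[rel] = -1
    return counts


if __name__ == "__main__":
    # Run from the hf_uae_demo/ directory with the version being released.
    for rel, n in version_mentions(Path("."), "2.4.0").items():
        status = "missing file" if n < 0 else f"{n} mention(s)"
        print(f"{rel}: {status}")
```

A file reporting zero mentions after a release bump is a likely sign that one of the checklist items was skipped.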
---
## Local Development
```bash
# Activate conda environment
conda activate libra_shield
# Run the server
cd hf_uae_demo
python app.py
# Open in browser (`open` is macOS-specific; on Linux use `xdg-open`)
open http://localhost:7860
```
---
## Deployment (HuggingFace Spaces)
```bash
cd hf_uae_demo
git add .
git commit -m "Update to vX.X.X"
git push
```
---
Powered by [LibrAI](https://www.librai.tech/)