| title: Mshauri Fedha | |
| emoji: 🦁 | |
| colorFrom: green | |
| colorTo: gray | |
| sdk: docker | |
| app_port: 7860 | |
| pinned: false | |
| # Mshauri Fedha 🇰🇪 | |
| **An AI-Powered Financial Research Assistant for the Kenyan Economy.** | |
| Mshauri Fedha ("Financial Advisor") is a Proof of Concept (PoC) AI agent designed to provide accurate, data-driven insights into the Kenyan financial landscape. Unlike general-purpose chatbots, which can hallucinate numbers, Mshauri Fedha uses a **Dual-Brain Architecture** to separate exact statistical retrieval from qualitative analysis. | |
| ## The Architecture | |
| This project solves the "Broken Table Problem" and the "Hallucination Problem" by splitting the AI's cognition into two distinct systems managed by a Supervisor Agent: | |
| ### 1. The Left Brain (Structured / SQL) | |
| * **Role:** The Mathematician. | |
| * **Source:** A SQLite database (`mshauri_fedha_v6.db`) containing rigorous data from **KNBS (Kenya National Bureau of Statistics)** and **CBK (Central Bank of Kenya)**. | |
| * **Capability:** Executes precise SQL queries to answer questions like *"What was the exact inflation rate in 2023?"* or *"Calculate the average tea export value for the last 5 years."* | |
| * **Tech:** LangChain SQL Toolkit. | |
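The kind of lookup the Left Brain performs can be sketched with the standard-library `sqlite3` module. This is only an illustration: the `inflation` table, its `year` and `rate` columns, and the numbers inserted below are placeholders chosen for the example, not the actual schema or statistics in `mshauri_fedha_v6.db`.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical `inflation` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inflation (year INTEGER PRIMARY KEY, rate REAL)")
    conn.executemany(
        "INSERT INTO inflation (year, rate) VALUES (?, ?)",
        # Placeholder values for illustration only, not real statistics.
        [(2021, 1.0), (2022, 2.0), (2023, 3.0)],
    )
    return conn


def rate_for_year(conn: sqlite3.Connection, year: int):
    """Exact lookup: return the stored rate for one year, or None if absent."""
    row = conn.execute(
        "SELECT rate FROM inflation WHERE year = ?", (year,)
    ).fetchone()
    return row[0] if row else None


def average_rate(conn: sqlite3.Connection, start: int, end: int):
    """Aggregate: average rate over an inclusive range of years."""
    row = conn.execute(
        "SELECT AVG(rate) FROM inflation WHERE year BETWEEN ? AND ?",
        (start, end),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    demo = build_demo_db()
    print(rate_for_year(demo, 2023))
    print(average_rate(demo, 2021, 2023))
```

Because the number comes from a query result rather than from the model's free-text generation, the figure cannot be invented by the LLM; the model only has to write the query.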
| ### 2. The Right Brain (Unstructured / Vector RAG) | |
| * **Role:** The Analyst. | |
| * **Source:** A Vector Database (`ChromaDB`) containing: | |
| * **Official Reports:** Full Markdown conversions of KNBS Economic Surveys and CBK reports (tables preserved). | |
| * **Market News:** Cleaned and deduplicated business news articles (stripped of URLs and noise). | |
| * **Capability:** Performs Semantic Search to answer questions like *"Why did the shilling depreciate?"* or *"What is the sentiment regarding the new Finance Bill?"* | |
| * **Tech:** `nomic-embed-text` embeddings, ChromaDB. | |
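The retrieval step ranks stored text chunks by vector similarity to the query. The sketch below shows that ranking logic only; the `embed` function is a toy word-count stand-in (an assumption made so the example runs without Ollama or ChromaDB), so it matches on shared words rather than on meaning as `nomic-embed-text` would.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Placeholder snippets, not real report or news text.
    docs = [
        "shilling depreciation raised import costs",
        "tea exports grew",
        "finance bill sentiment",
    ]
    print(search("why did the shilling depreciation happen", docs, k=1))
```

In the real pipeline the vector store handles both the embedding call and the nearest-neighbour lookup; the ranking principle is the same.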
| ### 3. The Supervisor (ReAct Agent) | |
| * **Role:** The Manager. | |
| * **Logic:** A custom-built **Zero-Dependency ReAct Agent** (Python-based) that analyzes user intent and routes the query to the correct "Brain"—or both—to synthesize a comprehensive answer. | |
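A ReAct loop of this kind can be sketched in a few lines of plain Python. This is not the project's actual implementation: `run_react`, the `Final Answer:` marker, and the scripted stand-in for the LLM are assumptions made for illustration. The tool names mirror those shown in the sample output further down.

```python
import re
from typing import Callable, Dict

# Matches the "Action: <tool>" / "Action Input: <argument>" pair in a reply.
ACTION_RE = re.compile(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)", re.DOTALL)


def run_react(
    question: str,
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Alternate between model replies and tool calls until an answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        match = ACTION_RE.search(reply)
        if not match:
            # No tool requested and no final marker: treat the reply as the answer.
            return reply.strip()
        name, arg = match.group(1).strip(), match.group(2).strip()
        tool = tools.get(name)
        observation = tool(arg) if tool else f"Unknown tool: {name}"
        transcript += f"Observation: {observation}\n"
    return "Stopped: step limit reached."


def scripted_llm(transcript: str) -> str:
    """Hypothetical stand-in for the model: picks a reply by observations seen."""
    seen = transcript.count("Observation:")
    if seen == 0:
        return (
            "Thought: I need the exact figure first.\n"
            "Action: sql_db_query\n"
            "Action Input: SELECT rate FROM inflation WHERE year = 2023"
        )
    if seen == 1:
        return (
            "Thought: Now I need the reasons.\n"
            "Action: search_financial_reports_and_news\n"
            "Action Input: reasons for high inflation 2023 Kenya"
        )
    return "Final Answer: figure from SQL plus reasons from the reports"
```

Routing to "both brains" falls out of the loop naturally: the model simply requests one tool, reads the observation, and then requests the other before answering.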
| --- | |
| ## Technology Stack | |
| * **LLM Inference:** [Ollama](https://ollama.com/) (Local) | |
| * **Model:** `qwen2.5:14b` (Chosen for high reasoning capability) | |
| * **Embeddings:** `nomic-embed-text` (High performance for long documents) | |
| * **Orchestration:** Python (Custom ReAct Implementation) & LangChain Community | |
| * **Vector Store:** ChromaDB | |
| * **Database:** SQLite | |
| --- | |
| ## Setup & Installation | |
| ### 1. Prerequisites | |
| * Python 3.10+ | |
| * [Ollama](https://ollama.com/) installed and running. | |
| ### 2. Install Dependencies | |
| ```bash | |
| pip install pandas langchain-ollama langchain-community langchain-chroma chromadb duckduckgo-search tqdm | |
| ``` | |
| ### 3. Pull Required Models | |
| Ensure your local Ollama instance has both the reasoning model (the "Brain") and the embedding model (the "Eyes") installed: | |
| ```bash | |
| ollama pull qwen2.5:14b | |
| ollama pull nomic-embed-text | |
| ``` | |
| ### 4. Configuration | |
| Ensure your Ollama server is running (for example, via `ollama serve`) before you run ingestion or start the agent. | |
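One way to check this programmatically is to query Ollama's model listing endpoint. The sketch below assumes the default local address `http://localhost:11434`; the helper names `ollama_models` and `missing_models` are made up for this example.

```python
import json
import urllib.error
import urllib.request
from typing import List, Optional, Sequence


def ollama_models(
    base_url: str = "http://localhost:11434", timeout: float = 2.0
) -> Optional[List[str]]:
    """Return the names of locally available models, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None
    return [m.get("name", "") for m in payload.get("models", [])]


def missing_models(
    available: Sequence[str],
    required: Sequence[str] = ("qwen2.5:14b", "nomic-embed-text"),
) -> List[str]:
    """List required models that are not present (a tag such as ':latest' is allowed)."""
    return [
        req
        for req in required
        if not any(name == req or name.startswith(req + ":") for name in available)
    ]


if __name__ == "__main__":
    models = ollama_models()
    if models is None:
        print("Ollama server is not reachable.")
    else:
        print("Missing models:", missing_models(models) or "none")
```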
| ## Data Ingestion (Building the Brains) | |
| Before the agent can work, you must populate its knowledge base. | |
| ### Step 1: Ingest News (Right Brain) | |
| Parses CSVs and chunks text. | |
| ```bash | |
| python src/load/ingest_news.py | |
| ``` | |
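The general shape of this step (read rows, strip URLs, drop duplicates, split into overlapping chunks) can be sketched as follows. This is not the contents of `ingest_news.py`: the `content` column name, the chunk sizes, and the hash-based deduplication are assumptions chosen for illustration.

```python
import csv
import hashlib
import io
import re
from typing import List

URL_RE = re.compile(r"https?://\S+")


def clean(text: str) -> str:
    """Remove URLs and collapse runs of whitespace."""
    return " ".join(URL_RE.sub("", text).split())


def load_articles(csv_text: str, text_column: str = "content") -> List[str]:
    """Read article bodies from CSV text, cleaned and deduplicated."""
    seen, articles = set(), []
    for row in csv.DictReader(io.StringIO(csv_text)):
        body = clean(row.get(text_column) or "")
        digest = hashlib.sha256(body.lower().encode("utf-8")).hexdigest()
        if body and digest not in seen:
            seen.add(digest)
            articles.append(body)
    return articles


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into word windows of `size` that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk.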
| ### Step 2: Ingest Reports (Right Brain) | |
| Loads Markdown files (converted from PDFs via Marker), preserving table structures for the AI to read. | |
| ```bash | |
| python src/load/ingest_md.py | |
| ``` | |
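One simple way to keep tables intact while chunking Markdown is to split only on blank lines, since a Markdown table is a run of consecutive lines with no blank line inside it. The sketch below illustrates that idea; it is an assumption about how such a splitter could work, not the logic of `ingest_md.py`, and `split_markdown` and `max_chars` are hypothetical names.

```python
import re
from typing import List


def split_markdown(md: str, max_chars: int = 1000) -> List[str]:
    """Pack blank-line-separated blocks into chunks without splitting a block.

    A block longer than max_chars (for example a large table) becomes its own
    oversized chunk instead of being cut in the middle.
    """
    blocks = [b.strip() for b in re.split(r"\n\s*\n", md) if b.strip()]
    chunks: List[str] = []
    current = ""
    for block in blocks:
        candidate = f"{current}\n\n{block}" if current else block
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = block
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = (
        "# Title\n\nIntro text.\n\n"
        "| Year | Rate |\n|---|---|\n| 2022 | x |\n| 2023 | y |\n\n"
        "Closing text."
    )
    for chunk in split_markdown(sample, max_chars=20):
        print(repr(chunk))
```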
| ### Step 3: Verify SQL (Left Brain) | |
| Ensure `mshauri_fedha_v6.db` exists. | |
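A quick check can go one step further than testing for the file and also confirm that the database contains tables. A minimal sketch, assuming the database file sits in the current working directory (adjust the path to wherever your copy lives); `list_tables` is a hypothetical helper name.

```python
import sqlite3
from pathlib import Path
from typing import List


def list_tables(db_path: str) -> List[str]:
    """Return the table names in an existing SQLite file, sorted by name."""
    path = Path(db_path)
    if not path.is_file():
        raise FileNotFoundError(f"Database not found: {path}")
    # Open read-only so the check can never create or modify the database.
    conn = sqlite3.connect(path.resolve().as_uri() + "?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    try:
        print(list_tables("mshauri_fedha_v6.db"))
    except FileNotFoundError as exc:
        print(exc)
```

An empty list means the file exists but holds no tables, which usually indicates the structured data has not been loaded yet.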
| ## Usage | |
| You can interact with Mshauri Fedha directly via the modular agent script or within a Jupyter Notebook. | |
| **Running via Python Script** | |
| ```python | |
| from mshauri_demo import create_mshauri_agent, ask_mshauri | |
| # Initialize the Supervisor | |
| agent = create_mshauri_agent() | |
| # Ask a Hybrid Question (Uses both SQL and Vector) | |
| ask_mshauri(agent, "What is the inflation rate in 2023 and why is it rising?") | |
| ``` | |
| **Sample Output** | |
| ```plaintext | |
| User: What is the inflation rate in 2023 and why is it rising? | |
| ---------------------------------------- | |
| Starting Agent Loop... | |
| Step 1: Thought: The user is asking for a specific number (inflation rate) and a reason (why). | |
| I should first check the SQL database for the exact rate, then check the reports for the reasons. | |
| Action: sql_db_query | |
| Action Input: SELECT rate FROM inflation WHERE year = 2023 | |
| Step 2: Observation: [(7.7,)] | |
| Thought: I have the rate (7.7%). Now I need the reasons. | |
| Action: search_financial_reports_and_news | |
| Action Input: reasons for high inflation 2023 Kenya | |
| Step 3: Observation: ...Reports mention high fuel prices, depreciation of the shilling... | |
| ... | |
| Mshauri: The inflation rate in 2023 was 7.7%. This rise was primarily driven by increased fuel costs and the depreciation of the Kenyan Shilling against major currencies, which raised the cost of imports [Source: KNBS/News]. | |
| ``` | |
| ## Future Work | |
| * **Chainlit UI:** Deploy a chat interface for easier interaction. | |
| * **GraphRAG:** Implement knowledge graphs to better link entities (e.g., specific politicians to policies). | |
| * **Live Scraping:** Automate the news ingestion to run daily. | |
| ----- | |
| **Status**: Prototype (Proof of concept) | |
| **Author**: Teofilo Ligawa | |