mshauri-fedha / README.md
teofizzy's picture
Clean up user prompt formatting in README
a599ec9 unverified
metadata
title: Mshauri Fedha
emoji: 🦁
colorFrom: green
colorTo: gray
sdk: docker
app_port: 7860
pinned: false

Mshauri Fedha 🇰🇪

An AI-Powered Financial Research Assistant for the Kenyan Economy.

Mshauri Fedha ("Financial Advisor") is a Proof of Concept (PoC) AI agent designed to provide accurate, data-driven insights into the Kenyan financial landscape. Unlike standard chatbots that hallucinate numbers, Mshauri Fedha uses a Dual-Brain Architecture to separate exact statistical retrieval from qualitative analysis.

The Architecture

This project solves the "Broken Table Problem" and the "Hallucination Problem" by splitting the AI's cognition into two distinct systems managed by a Supervisor Agent:

1. The Left Brain (Structured / SQL)

  • Role: The Mathematician.
  • Source: A SQLite database (mshauri_fedha_v6.db) containing rigorous data from KNBS (Kenya National Bureau of Statistics) and CBK (Central Bank of Kenya).
  • Capability: Executes precise SQL queries to answer questions like "What was the exact inflation rate in 2023?" or "Calculate the average tea export value for the last 5 years."
  • Tech: LangChain SQL Toolkit.

2. The Right Brain (Unstructured / Vector RAG)

  • Role: The Analyst.
  • Source: A Vector Database (ChromaDB) containing:
    • Official Reports: Full Markdown conversions of KNBS Economic Surveys and CBK reports (tables preserved).
    • Market News: Cleaned and deduplicated business news articles (stripped of URLs and noise).
  • Capability: Performs Semantic Search to answer questions like "Why did the shilling depreciate?" or "What is the sentiment regarding the new Finance Bill?"
  • Tech: nomic-embed-text embeddings, ChromaDB.

3. The Supervisor (ReAct Agent)

  • Role: The Manager.
  • Logic: A custom-built Zero-Dependency ReAct Agent (Python-based) that analyzes user intent and routes the query to the correct "Brain"—or both—to synthesize a comprehensive answer.

Technology Stack

  • LLM Inference: Ollama (Local)
  • Model: qwen2.5:14b (Chosen for high reasoning capability)
  • Embeddings: nomic-embed-text (High performance for long documents)
  • Orchestration: Python (Custom ReAct Implementation) & LangChain Community
  • Vector Store: ChromaDB
  • Database: SQLite

Setup & Installation

1. Prerequisites

  • Python 3.10+
  • Ollama installed and running.

2. Install Dependencies

pip install pandas langchain-ollama langchain-community langchain-chroma chromadb duckduckgo-search tqdm

3. Pull Required Models

Ensure your local Ollama instance has the "Brain" and "Eyes" installed:

ollama pull qwen2.5:14b
ollama pull nomic-embed-text

4. Configuration

Ensure your Ollama server is running.

Data Ingestion (Building the Brains)

Before the agent can work, you must populate its knowledge base

Step 1: Ingest News (Right Brain)

Parses CSVs and chunks text.

python src/load/ingest_news.py

Step 2: Ingest Reports (Right Brain)

Loads Markdown files (converted from PDFs via Marker), preserving table structures for the AI to read.

python src/load/ingest_md.py

Step 3: Verify SQL (Left Brain)

Ensure mshauri_fedha_v6.db exists

Usage

You can interact with Mshauri Fedha directly via the modular agent script or within a Jupyter Notebook.

Running via Python Script

from mshauri_demo import create_mshauri_agent, ask_mshauri

# Initialize the Supervisor
agent = create_mshauri_agent()

# Ask a Hybrid Question (Uses both SQL and Vector)
ask_mshauri(agent, "What is the inflation rate in 2023 and why is it rising?")

Sample Output ```plaintext User: What is the inflation rate in 2023 and why is it rising?

Starting Agent Loop...

Step 1: Thought: The user is asking for a specific number (inflation rate) and a reason (why). I should first check the SQL database for the exact rate, then check the reports for the reasons. Action: sql_db_query Action Input: SELECT rate FROM inflation WHERE year = 2023

Step 2: Observation: [(7.7,)] Thought: I have the rate (7.7%). Now I need the reasons. Action: search_financial_reports_and_news Action Input: reasons for high inflation 2023 Kenya

Step 3: Observation: ...Reports mention high fuel prices, depreciation of the shilling... ... Mshauri: The inflation rate in 2023 was 7.7%. This rise was primarily driven by increased fuel costs and the depreciation of the Kenyan Shilling against major currencies, which raised the cost of imports [Source: KNBS/News].

## Future Work
Chainlit UI: Deploy a chat interface for easier interaction.

GraphRAG: Implement knowledge graphs to better link entities (e.g., specific politicians to policies).

Live Scraping: Automate the news ingestion to run daily.

-----
**Status**: Prototype (Proof of concept)
**Author** Teofilo Ligawa