---
title: FinSight AI
emoji: 📊
colorFrom: blue
colorTo: green
sdk: gradio
app_file: app.py
python_version: "3.11"
pinned: false
tags:
- track:backyard
- sponsor:openbmb
- sponsor:modal
- achievement:offgrid
---
# FinSight AI
Finance-domain **Retrieval-Augmented Generation (RAG)** assistant built with **OpenBMB MiniCPM** models. Upload earnings reports, bank statements, and filings, then chat with cited answers, summarize, run OCR, and extract entities.
Inference runs on **Modal** serverless GPUs; the Gradio UI, FAISS vector index, and document store stay local (or on Hugging Face Spaces). The largest model is 8B parameters, so the whole stack fits comfortably under the 32B Build Small / SLM hackathon limit.
---
## What it does
| Tab | Description |
|-----|-------------|
| **Finance QA Chatbot** | Streaming RAG chat with source citations and confidence |
| **Financial Summary** | Executive, financial, or risk-focused summaries |
| **Document OCR** | Structured OCR for scanned PDFs and images |
| **Entity Extraction** | Companies, tickers, dates, and key figures |
| **Upload Documents** | Ingest, list, delete, and scope search to one file |
Search modes: **Hybrid RAG** (semantic + BM25 across all docs) or **Single Document** (chat scoped to one upload).
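
A rough picture of how a semantic + BM25 blend can work: normalize each score set, then take a weighted sum. This is a minimal sketch, not FinSight's actual retrieval code. The names `blend_scores` and `_min_max`, the min-max normalization, and the convention that `alpha` weights the semantic side (mirroring `HYBRID_ALPHA`) are all assumptions made for illustration. It also assumes higher scores mean better matches on both sides.

```python
from typing import Dict, List, Tuple


def _min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores to [0, 1]; if all scores are equal, map them to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {chunk_id: 1.0 for chunk_id in scores}
    return {chunk_id: (s - lo) / (hi - lo) for chunk_id, s in scores.items()}


def blend_scores(
    semantic: Dict[str, float],
    bm25: Dict[str, float],
    alpha: float = 0.6,  # assumed meaning: weight of the semantic score
    top_k: int = 6,
) -> List[Tuple[str, float]]:
    """Return the top_k (chunk_id, blended_score) pairs, best first."""
    sem = _min_max(semantic)
    lex = _min_max(bm25)
    blended = {
        chunk_id: alpha * sem.get(chunk_id, 0.0) + (1 - alpha) * lex.get(chunk_id, 0.0)
        for chunk_id in set(sem) | set(lex)
    }
    # Sort by descending score, then by id so ties are deterministic.
    return sorted(blended.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    ranked = blend_scores(
        {"a": 0.9, "b": 0.5, "c": 0.1},   # hypothetical cosine similarities
        {"b": 12.0, "c": 3.0, "d": 7.5},  # hypothetical BM25 scores
    )
    for chunk_id, score in ranked:
        print(chunk_id, round(score, 3))
```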
---
## Architecture
| Component | Model | Where it runs | VRAM |
|-----------|-------|---------------|------|
| **Embeddings** | MiniCPM-Embedding (4-bit NF4) | Modal T4 | ~1.6 GB |
| **LLM** | MiniCPM4.1-8B (Q4_K_M GGUF) | Modal T4 | ~5 GB |
| **OCR / Vision** | MiniCPM-V 4.6 | Modal A10G | ~2 GB |
| **Vector search** | FAISS + BM25 hybrid | Local / HF Space | CPU |
| **UI** | Gradio 6 | `:7860` | CPU |
| **REST API** *(optional)* | FastAPI | `:8000` | CPU |
Models download automatically on first Modal cold start into a persistent volume (`finsight-hf-cache`).
---
## Quick Start
### 1. Deploy Modal workers (one-time)
```bash
pip install modal
modal setup
modal deploy finsight_modal/app.py
```
Smoke test:
```bash
modal run finsight_modal/app.py
```
View deployment: [modal.com/apps](https://modal.com/apps) → **finsight-ai**
### 2. Run locally
```bash
cp .env.example .env
python -m venv .venv
.\.venv\Scripts\Activate.ps1 # Windows
# source .venv/bin/activate # macOS / Linux
pip install -r requirements.txt -r backend/requirements.txt
python app.py
```
Open **http://localhost:7860**
Optional REST API:
```bash
cd backend && uvicorn main:app --reload --port 8000
```
Docker:
```bash
docker compose up gradio -d
# optional API:
docker compose up backend -d
```
---
## Hugging Face Spaces
The Space entry point is `app.py` at the repo root (Gradio SDK).
Add these **Secrets** in Space settings:
| Secret | Description |
|--------|-------------|
| `MODAL_TOKEN_ID` | From `~/.modal.toml` after `modal setup` (starts with `ak-`) |
| `MODAL_TOKEN_SECRET` | Paired secret (starts with `as-`) |
| `MODAL_APP_NAME` | `finsight-ai` (must match deployed Modal app) |
Get tokens locally:
```powershell
# Windows
Get-Content $env:USERPROFILE\.modal.toml
```
Or create new tokens at [modal.com/settings](https://modal.com/settings).
> **Note:** FAISS indexes and uploaded documents persist under `./data/` locally. On HF Spaces, storage is ephemeral unless you attach persistent storage, so re-upload documents after a restart.
---
## Modal credentials (Docker / CI)
After `modal setup`, credentials live in `~/.modal.toml`:
```toml
[default]
token_id = "ak-..."
token_secret = "as-..."
```
Alternatively, set them as environment variables, which override the file:
```bash
export MODAL_TOKEN_ID="ak-..."
export MODAL_TOKEN_SECRET="as-..."
export MODAL_APP_NAME="finsight-ai"
```
See [Modal token docs](https://modal.com/docs/reference/modal.config) for CI and Docker setup.
---
## Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `MODAL_APP_NAME` | `finsight-ai` | Deployed Modal app name |
| `FAISS_DATA_DIR` | `./data/faiss` | FAISS index + chunk metadata |
| `CHAT_DB_PATH` | `./data/chat_sessions.db` | SQLite chat sessions |
| `TOP_K` | `6` | Retrieved chunks per query |
| `CHUNK_SIZE` | `512` | Ingestion chunk size (tokens) |
| `CHUNK_OVERLAP` | `64` | Chunk overlap |
| `HYBRID_ALPHA` | `0.6` | Semantic vs BM25 blend (0–1) |
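
As an illustration of how these variables might be consumed, the sketch below reads them with the defaults from the table and shows what `CHUNK_SIZE` and `CHUNK_OVERLAP` imply for a sliding-window split. `Settings`, `load_settings`, and `chunk_tokens` are hypothetical names, not FinSight's actual code, and the tokenizer is left out: the function takes an already tokenized sequence.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional, Sequence


@dataclass(frozen=True)
class Settings:
    top_k: int = 6
    chunk_size: int = 512
    chunk_overlap: int = 64
    hybrid_alpha: float = 0.6


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from the environment, falling back to the table defaults."""
    env = os.environ if env is None else env
    defaults = Settings()
    settings = Settings(
        top_k=int(env.get("TOP_K", defaults.top_k)),
        chunk_size=int(env.get("CHUNK_SIZE", defaults.chunk_size)),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", defaults.chunk_overlap)),
        hybrid_alpha=float(env.get("HYBRID_ALPHA", defaults.hybrid_alpha)),
    )
    if not 0 <= settings.chunk_overlap < settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be >= 0 and smaller than CHUNK_SIZE")
    if not 0.0 <= settings.hybrid_alpha <= 1.0:
        raise ValueError("HYBRID_ALPHA must be between 0 and 1")
    return settings


def chunk_tokens(tokens: Sequence[str], size: int, overlap: int) -> List[List[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens."""
    step = size - overlap  # each window starts this many tokens after the last
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


if __name__ == "__main__":
    cfg = load_settings()
    words = "net revenue rose while operating costs fell year over year".split()
    print(cfg)
    print(chunk_tokens(words, size=4, overlap=1))
```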
---
## Model Summary
| Model | Size | Quantization | Source |
|-------|------|--------------|--------|
| MiniCPM-Embedding | 0.4B | 4-bit NF4 (BnB) | [openbmb/MiniCPM-Embedding](https://huggingface.co/openbmb/MiniCPM-Embedding) |
| MiniCPM4.1-8B | 8B | Q4_K_M GGUF | [openbmb/MiniCPM4.1-8B](https://huggingface.co/openbmb/MiniCPM4.1-8B) |
| MiniCPM-V 4.6 | 1B | fp16 | [openbmb/MiniCPM-V-4.6](https://huggingface.co/openbmb/MiniCPM-V-4.6) |
All OpenBMB models: **Apache 2.0** · Hugging Face Hub
Combined, the three models total about 9.4B parameters, well below the **32B Build Small** parameter limit.
---
## REST API *(optional)*
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/chat` | POST | SSE streaming RAG chat |
| `/api/documents/upload` | POST | Upload PDF / image |
| `/api/documents/list` | GET | List ingested documents |
| `/api/summarize` | POST | Financial summary |
| `/api/ocr` | POST | OCR extraction |
| `/api/extract-entities` | POST | Entity extraction |
| `/api/sessions` | GET / POST | Chat session management |
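
Because `/api/chat` streams over SSE, a client has to reassemble events from `data:` lines. The sketch below is a generic SSE line parser, not code from this repository. `iter_sse_data` is a hypothetical helper, and the payload inside each event is treated as opaque text because the response schema is not documented here.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each Server-Sent Event.

    `lines` is any iterable of decoded text lines, for example from a
    streaming HTTP response. Multi-line data fields are joined with "\n".
    """
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # The SSE format strips a single leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event, id, retry) are ignored in this sketch.
    # An unterminated trailing event is dropped, as the SSE format specifies.


if __name__ == "__main__":
    sample = ["data: Revenue", "data: grew 12%", "", ": keep-alive", "data: done", ""]
    for payload in iter_sse_data(sample):
        print(repr(payload))
```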
---
## Repository Structure
```text
app.py                 # HF Space entry (Gradio)
backend/
  gradio_ui/           # Tabs, theme, custom CSS
  services/            # RAG, ingestion, summarizer
  models/              # Modal client wrappers
  db/                  # FAISS + SQLite
  routers/             # FastAPI routes
finsight_modal/
  app.py               # Modal GPU workers (deploy separately)
data/                  # FAISS index + uploads (gitignored)
requirements.txt
docker-compose.yml
```
---
## Hackathon Context
Built for the **Hugging Face Build Small Hackathon** and the **SLM Hackathon** track (Project 09, FinSight Statement Auditor lineage). It uses efficient OpenBMB models with Modal offload, so the UI runs on CPU while GPUs spin up only for inference.
| Badge | How FinSight qualifies |
|-------|------------------------|
| **Build Small** | All models combined total about 9.4B params, far below 32B |
| **Off the Grid** | Document index + FAISS stay on-device; only inference hits Modal |
| **Off-Brand** | Custom FinSight Gradio theme (gold accent, finance-first layout) |
---
## License
Apache-2.0 (application code and OpenBMB model weights)