metadata
title: Financial Intelligence RAG Pipeline
emoji: 📊
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.39.0
app_file: app.py
pinned: false

Financial Intelligence RAG Pipeline

A production-grade Retrieval-Augmented Generation (RAG) system over Apple SEC filings and Morningstar research reports.

What It Does

Ask natural-language questions about Apple's financials and get answers grounded in the source documents, with full citations.

Example questions:

  • What was Apple's total net sales in FY2024?
  • What are Apple's main risk factors from the 2024 10-K?
  • How did the Services segment perform compared to Products?
  • What is Apple's gross margin trend over the last 3 years?

Architecture

SEC EDGAR (10-K, 10-Q, 8-K) + Morningstar PDFs
          |
    Docling Processing (HTML/PDF parser)
          |
    HybridChunker (tokenizer-aware segmentation)
          |
    all-MiniLM-L6-v2 Embeddings (384-dim)
          |
    Qdrant Cloud (1,234 vectors)
          |
    Two-Stage Retrieval:
      Dense ANN (50 candidates) → ms-marco Cross-Encoder Reranking (top 8)
          |
    Google Gemini 1.5 Flash (streaming generation)
          |
    4-Layer Guardrails (input / retrieval / output / compliance)
          |
    Streamlit Chat UI
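
The two-stage retrieval step in the diagram can be sketched as below. This is a minimal illustration rather than the project's actual code: `two_stage_retrieve`, `dense_search`, and `score_pairs` are hypothetical names, and the search and scoring backends are passed in as plain callables so the sketch runs without Qdrant or a model download. Only the candidate count (50) and the final cut (8) come from the diagram.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interfaces for this sketch; the project's real ones may differ.
DenseSearch = Callable[[str, int], List[str]]  # (query, limit) -> candidate chunks
PairScorer = Callable[[Sequence[Tuple[str, str]]], Sequence[float]]  # pairs -> scores


def two_stage_retrieve(
    query: str,
    dense_search: DenseSearch,
    score_pairs: PairScorer,
    n_candidates: int = 50,  # stage 1: dense ANN candidates
    top_k: int = 8,          # stage 2: chunks kept after reranking
) -> List[str]:
    """Fetch a wide candidate set cheaply, then rerank it with a slower scorer."""
    candidates = dense_search(query, n_candidates)
    if not candidates:
        return []

    # The reranker scores each (query, chunk) pair jointly, which is more
    # accurate than comparing independently computed embeddings.
    scores = score_pairs([(query, chunk) for chunk in candidates])

    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]
```

In a real deployment, `dense_search` would presumably wrap a Qdrant query over the all-MiniLM-L6-v2 vectors, and `score_pairs` could be the `predict` method of a sentence-transformers `CrossEncoder` loaded with `cross-encoder/ms-marco-MiniLM-L-6-v2`. Keeping stage 1 wide and stage 2 narrow limits the number of expensive cross-encoder calls to a fixed 50 per query.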

Tech Stack

| Component | Technology |
|---|---|
| Document Processing | Docling |
| Embeddings | sentence-transformers/all-MiniLM-L6-v2 |
| Reranker | cross-encoder/ms-marco-MiniLM-L-6-v2 |
| NLI Verifier | cross-encoder/nli-deberta-v3-small |
| Vector Store | Qdrant Cloud |
| LLM | Google Gemini 2.5 Flash |
| RAG Chain | LangChain LCEL |
| UI | Streamlit |
| CI/CD | GitHub Actions → Hugging Face Spaces |
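
This README does not describe how the NLI verifier is wired in. One plausible use, sketched here purely as an assumption, is to check each sentence of a generated answer against the retrieved context and flag sentences the context does not entail. The names `split_sentences`, `unsupported_sentences`, and `entail_prob`, and the 0.5 threshold, are all hypothetical. The entailment scorer is injected as a callable so the sketch runs without loading a model.

```python
import re
from typing import Callable, List

# Hypothetical signature: (premise, hypothesis) -> entailment probability in [0, 1].
EntailmentFn = Callable[[str, str], float]


def split_sentences(text: str) -> List[str]:
    """Naive splitter on ., ! or ? followed by whitespace (breaks on abbreviations)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def unsupported_sentences(
    answer: str,
    context: str,
    entail_prob: EntailmentFn,
    threshold: float = 0.5,  # assumed cut-off, not taken from the project
) -> List[str]:
    """Return the answer sentences whose entailment score falls below the threshold."""
    return [
        sentence
        for sentence in split_sentences(answer)
        if entail_prob(context, sentence) < threshold
    ]
```

An output guardrail built this way could drop or flag the returned sentences before the answer reaches the user. With a real model, `entail_prob` would need to map the classifier's output to the probability of the entailment label.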

Evaluation Results

| Metric | Score |
|---|---|
| Retrieval Hit Rate | 100% |
| Context Recall | 93.3% |
| Answer Relevancy | 75.5% |
| Faithfulness | 52.7% |
| Aggregate | 55.7% |
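
The README does not say how these metrics are computed. As an illustration only, a common definition of retrieval hit rate is the fraction of evaluation questions for which at least one known-relevant chunk appears in the retrieved set. The function name and the input shapes below are assumptions made for this sketch.

```python
from typing import Iterable, List, Sequence


def retrieval_hit_rate(
    retrieved: Sequence[Iterable[str]],  # per question: IDs of retrieved chunks
    relevant: Sequence[Iterable[str]],   # per question: IDs of known-relevant chunks
) -> float:
    """Fraction of questions with at least one relevant chunk among the retrieved ones."""
    if len(retrieved) != len(relevant):
        raise ValueError("retrieved and relevant must have one entry per question")
    if not retrieved:
        return 0.0

    hits = sum(
        1
        for got, gold in zip(retrieved, relevant)
        if set(got) & set(gold)  # non-empty intersection counts as a hit
    )
    return hits / len(retrieved)


# Two questions: the first retrieves a relevant chunk, the second does not.
rate = retrieval_hit_rate([["a", "b"], ["c"]], [{"b"}, {"x"}])
print(f"{rate:.1%}")
```

Under this definition a 100% hit rate only says that a relevant chunk was retrieved for every question, not that the generated answer used it correctly, which is why it can coexist with a much lower faithfulness score.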

Local Setup

# 1. Clone and install
git clone <repo-url>
cd <repo-directory>
pip install -r requirements.txt

# 2. Configure environment
cp .env.example .env
# Fill in GOOGLE_API_KEY, QDRANT_URL, QDRANT_API_KEY

# 3. Run the app
streamlit run app.py

Environment Variables

| Variable | Required | Description |
|---|---|---|
| GOOGLE_API_KEY | Yes (for deployment) | Google AI Studio API key |
| QDRANT_URL | Yes (for deployment) | Qdrant Cloud cluster URL |
| QDRANT_API_KEY | Yes (for deployment) | Qdrant Cloud API key |

If these variables are not set, the app falls back to a local Ollama + ChromaDB setup for development.
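
The fallback can be pictured with a small sketch like the one below. It is not taken from `app.py`: the function name `select_backend` and the return values `"cloud"` and `"local"` are hypothetical, and the sketch assumes that a single missing or empty variable is enough to trigger the local path, which the README does not spell out.

```python
import os
from typing import Mapping, Optional

# Variable names come from the README's environment table.
REQUIRED_VARS = ("GOOGLE_API_KEY", "QDRANT_URL", "QDRANT_API_KEY")


def select_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "cloud" when every required variable is set, otherwise "local".

    Assumption: any missing or blank variable selects the local
    Ollama + ChromaDB path; the real app may decide differently.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    return "local" if missing else "cloud"


# Passing an explicit mapping makes the decision easy to check in isolation.
print(select_backend({"GOOGLE_API_KEY": "key", "QDRANT_URL": "https://example.invalid"}))
```

Taking the environment as a parameter, instead of reading `os.environ` directly throughout the app, keeps the decision in one place and makes it testable without touching the real environment.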

Disclaimer

This application is for informational and educational purposes only. It does not constitute investment advice.