title: Financial Intelligence RAG Pipeline
emoji: ๐
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.39.0
app_file: app.py
pinned: false
Financial Intelligence RAG Pipeline
A production-grade Retrieval-Augmented Generation (RAG) system over Apple SEC filings and Morningstar research reports.
What It Does
Ask natural-language questions about Apple's financials and get answers grounded in the source documents, with full citations.
Example questions:
- What was Apple's total net sales in FY2024?
- What are Apple's main risk factors from the 2024 10-K?
- How did the Services segment perform compared to Products?
- What is Apple's gross margin trend over the last 3 years?
Architecture
SEC EDGAR (10-K, 10-Q, 8-K) + Morningstar PDFs
|
Docling Processing (HTML/PDF parser)
|
HybridChunker (tokenizer-aware segmentation)
|
all-MiniLM-L6-v2 Embeddings (384-dim)
|
Qdrant Cloud (1,234 vectors)
|
Two-Stage Retrieval:
Dense ANN (50 candidates) → ms-marco Cross-Encoder Reranking (top 8)
|
Google Gemini 1.5 Flash (streaming generation)
|
4-Layer Guardrails (input / retrieval / output / compliance)
|
Streamlit Chat UI
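The two-stage retrieval step in the diagram can be illustrated with a small, dependency-free sketch. It is not the app's implementation: exact cosine search stands in for Qdrant's ANN search, a word-overlap score stands in for the ms-marco cross-encoder, and the names `two_stage_retrieve` and `word_overlap` are hypothetical. Only the shape of the flow (a wide candidate pool, then a narrower reranked list) follows the architecture above.

```python
import math
from typing import Callable, List, Sequence, Tuple

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def two_stage_retrieve(
    query_vec: Sequence[float],
    query_text: str,
    corpus: List[Tuple[str, Sequence[float]]],
    rerank_score: Callable[[str, str], float],
    n_candidates: int = 50,
    top_k: int = 8,
) -> List[str]:
    """Stage 1: keep the n_candidates nearest chunks by embedding similarity.
    Stage 2: rescore those candidates on (query, passage) text and keep top_k."""
    candidates = sorted(
        corpus, key=lambda item: cosine(query_vec, item[1]), reverse=True
    )[:n_candidates]
    reranked = sorted(
        candidates, key=lambda item: rerank_score(query_text, item[0]), reverse=True
    )
    return [text for text, _ in reranked[:top_k]]

def word_overlap(query: str, passage: str) -> float:
    """Toy stand-in for a cross-encoder: fraction of query words in the passage."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0

# Toy corpus with 2-dim vectors (the real pipeline uses 384-dim embeddings).
corpus = [
    ("Net sales rose in the Services segment", [1.0, 0.0]),
    ("Risk factors include supply chain disruption", [0.0, 1.0]),
    ("Services gross margin improved", [0.9, 0.1]),
]
result = two_stage_retrieve(
    [1.0, 0.0], "services gross margin", corpus, word_overlap,
    n_candidates=2, top_k=1,
)
print(result)
```

In this toy run, the first passage is the nearest neighbour by embedding, but the reranker promotes the third passage because it matches the query text more closely. That reordering is the reason for adding a second stage.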
Tech Stack
| Component | Technology |
|---|---|
| Document Processing | Docling |
| Embeddings | sentence-transformers/all-MiniLM-L6-v2 |
| Reranker | cross-encoder/ms-marco-MiniLM-L-6-v2 |
| NLI Verifier | cross-encoder/nli-deberta-v3-small |
| Vector Store | Qdrant Cloud |
| LLM | Google Gemini 2.5 Flash |
| RAG Chain | LangChain LCEL |
| UI | Streamlit |
| CI/CD | GitHub Actions → Hugging Face Spaces |
Evaluation Results
| Metric | Score |
|---|---|
| Retrieval Hit Rate | 100% |
| Context Recall | 93.3% |
| Answer Relevancy | 75.5% |
| Faithfulness | 52.7% |
| Aggregate | 55.7% |
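This README does not say how each metric is computed. As a rough illustration only, a retrieval hit rate is commonly defined as the share of evaluation questions for which at least one relevant chunk appears among the retrieved chunks. The sketch below assumes that definition; the function name `hit_rate` and the chunk IDs are hypothetical.

```python
from typing import List, Sequence, Set

def hit_rate(retrieved: List[Sequence[str]], relevant: List[Set[str]]) -> float:
    """Fraction of questions whose retrieved chunk IDs include any relevant ID.

    retrieved[i] holds the chunk IDs returned for question i;
    relevant[i] holds the chunk IDs labelled as relevant for question i.
    """
    if not retrieved:
        return 0.0
    hits = sum(1 for got, want in zip(retrieved, relevant) if want & set(got))
    return hits / len(retrieved)

# Two questions: the first retrieves a relevant chunk, the second does not.
score = hit_rate([["c1", "c2"], ["c3"]], [{"c2"}, {"c9"}])
print(f"{score:.1%}")
```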
Local Setup
# 1. Clone and install
git clone <repo-url>
cd <repo-directory>
pip install -r requirements.txt
# 2. Configure environment
cp .env.example .env
# Fill in GOOGLE_API_KEY, QDRANT_URL, QDRANT_API_KEY
# 3. Run the app
streamlit run app.py
Environment Variables
| Variable | Required | Description |
|---|---|---|
| GOOGLE_API_KEY | Yes (for deployment) | Google AI Studio API key |
| QDRANT_URL | Yes (for deployment) | Qdrant Cloud cluster URL |
| QDRANT_API_KEY | Yes (for deployment) | Qdrant Cloud API key |
Without these set, the app falls back to local Ollama + ChromaDB for development.
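The fallback described above could be decided with a check along these lines. This is a minimal sketch, not the code in `app.py`: the function name `select_backend` and the `"cloud"` / `"local"` labels are hypothetical, and treating a partially configured environment as "local" is an assumption.

```python
import os
from typing import Mapping, Optional

REQUIRED_VARS = ("GOOGLE_API_KEY", "QDRANT_URL", "QDRANT_API_KEY")

def select_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "cloud" when every required variable is set and non-empty,
    otherwise "local" (assumed here to mean Ollama + ChromaDB)."""
    if env is None:
        env = os.environ
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    return "local" if missing else "cloud"

# Passing a plain dict keeps the check easy to test without touching os.environ.
print(select_backend({}))
print(select_backend({name: "set" for name in REQUIRED_VARS}))
```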
Disclaimer
This application is for informational and educational purposes only. It does not constitute investment advice.