---
title: Financial Intelligence RAG Pipeline
emoji: 📊
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.39.0
app_file: app.py
pinned: false
---

# Financial Intelligence RAG Pipeline

A production-grade Retrieval-Augmented Generation (RAG) system over Apple SEC filings and Morningstar research reports.

## What It Does

Ask natural-language questions about Apple's financials and get answers grounded in source documents, with full citations.

**Example questions:**

- What was Apple's total net sales in FY2024?
- What are Apple's main risk factors from the 2024 10-K?
- How did the Services segment perform compared to Products?
- What is Apple's gross margin trend over the last 3 years?

## Architecture

```
SEC EDGAR (10-K, 10-Q, 8-K) + Morningstar PDFs
        |
Docling Processing (HTML/PDF parser)
        |
HybridChunker (tokenizer-aware segmentation)
        |
all-MiniLM-L6-v2 Embeddings (384-dim)
        |
Qdrant Cloud (1,234 vectors)
        |
Two-Stage Retrieval:
  Dense ANN (50 candidates) → ms-marco Cross-Encoder Reranking (top 8)
        |
Google Gemini 1.5 Flash (streaming generation)
        |
4-Layer Guardrails (input / retrieval / output / compliance)
        |
Streamlit Chat UI
```

## Tech Stack

| Component | Technology |
|-----------|------------|
| Document Processing | Docling |
| Embeddings | sentence-transformers/all-MiniLM-L6-v2 |
| Reranker | cross-encoder/ms-marco-MiniLM-L-6-v2 |
| NLI Verifier | cross-encoder/nli-deberta-v3-small |
| Vector Store | Qdrant Cloud |
| LLM | Google Gemini 2.5 Flash |
| RAG Chain | LangChain LCEL |
| UI | Streamlit |
| CI/CD | GitHub Actions → Hugging Face Spaces |

## Evaluation Results

| Metric | Score |
|--------|-------|
| Retrieval Hit Rate | 100% |
| Context Recall | 93.3% |
| Answer Relevancy | 75.5% |
| Faithfulness | 52.7% |
| Aggregate | 55.7% |

## Local Setup

```bash
# 1. Clone and install
git clone <repository-url>
pip install -r requirements.txt

# 2. Configure environment
cp .env.example .env
# Fill in GOOGLE_API_KEY, QDRANT_URL, QDRANT_API_KEY

# 3. Run the app
streamlit run app.py
```

## Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| `GOOGLE_API_KEY` | Yes (for deployment) | Google AI Studio API key |
| `QDRANT_URL` | Yes (for deployment) | Qdrant Cloud cluster URL |
| `QDRANT_API_KEY` | Yes (for deployment) | Qdrant Cloud API key |

If these variables are not set, the app falls back to local Ollama + ChromaDB for development.

## Disclaimer

This application is for informational and educational purposes only. It does not constitute investment advice.
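
The fallback described under Environment Variables (cloud services when the three variables are set, local Ollama + ChromaDB otherwise) could be decided with a small check like the one below. This is a minimal sketch, not the code in `app.py`: the helper name `select_backend`, the returned dictionary keys, and the rule that any single missing or empty variable triggers the local fallback are all assumptions made for illustration.

```python
import os

# The three variables listed in the Environment Variables table.
REQUIRED_VARS = ("GOOGLE_API_KEY", "QDRANT_URL", "QDRANT_API_KEY")


def select_backend(env=None):
    """Hypothetical helper: pick the cloud or local stack from the environment."""
    env = os.environ if env is None else env
    # Treat unset and empty values the same way.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        # Assumption: any missing variable means the local development stack.
        return {
            "mode": "local",
            "llm": "ollama",
            "vector_store": "chromadb",
            "missing": missing,
        }
    return {
        "mode": "cloud",
        "llm": "gemini",
        "vector_store": "qdrant",
        "missing": [],
    }


if __name__ == "__main__":
    # Report which stack the current shell environment would select.
    print(select_backend())
```

Returning the list of missing variables alongside the mode makes it easy to log why the app chose the local stack, which may help when a deployment silently starts without its secrets.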